
Keyword-based search for text, images, and PDFs in the NLP Lab 

Searching for specific information within long text documents, PDFs, or images lets users locate what they need quickly, without manually scrolling through the entire document. PDFs in particular often contain large amounts of information and can be difficult to navigate, so search functionality saves time. If the goal is to extract data from PDFs, the Visual NLP tool is suitable.

Task Search by Text, Label, and Choice

NLP Lab offers advanced search features that help users find the tasks they need, based either on the task text or on the annotations defined so far. The currently supported search queries are:

  • text: patient -> returns all tasks which contain the string “patient”;
  • label: ABC -> returns all tasks that have at least one completion containing a chunk with label ABC;
  • label: ABC=DEF -> returns all tasks that have at least one completion containing the text DEF labeled as ABC;
  • choice: Sport -> returns all tasks that have at least one completion which classified the task as Sport;
  • choice: Sport, Politics -> returns all tasks that have at least one completion containing multiple choices Sport and Politics.

Search is case-insensitive, so the queries label: ABC=DEF, label: Abc=Def, and label: abc=def are considered equivalent.
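
The query forms listed above can be illustrated with a small parser. This is only a sketch of how such strings could be normalized, not NLP Lab's actual implementation: the names SearchQuery and parse_query are hypothetical, and lowercasing every part is an assumption made to mirror the case-insensitive behavior described above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SearchQuery:
    """Hypothetical normalized form of a task search query."""
    kind: str                           # "text", "label", or "choice"
    values: Tuple[str, ...]             # search string, label name, or choice names
    labeled_text: Optional[str] = None  # only set for queries like "label: ABC=DEF"


def parse_query(raw: str) -> SearchQuery:
    """Parse a query such as 'label: ABC=DEF' into a case-folded SearchQuery."""
    kind, sep, rest = raw.partition(":")
    kind = kind.strip().casefold()
    rest = rest.strip()
    if not sep or kind not in {"text", "label", "choice"} or not rest:
        raise ValueError(f"unsupported query: {raw!r}")

    if kind == "label":
        # "ABC=DEF" means: text DEF annotated with label ABC.
        name, eq, text = rest.partition("=")
        labeled = text.strip().casefold() if eq else None
        return SearchQuery("label", (name.strip().casefold(),), labeled)

    if kind == "choice":
        # Several choices are separated by commas.
        choices = tuple(c.strip().casefold() for c in rest.split(",") if c.strip())
        return SearchQuery("choice", choices)

    return SearchQuery("text", (rest.casefold(),))
```

Because every part is case-folded, parse_query("label: ABC=DEF") and parse_query("label: abc=def") compare equal, which matches the equivalence described above.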

Keyword-based Search at Task Level

NLP Lab supports task-level keyword-based searches. The keyword-based search feature works for text and Visual NER projects alike.

  • The search will work on all paginated pages.
  • It is also possible to navigate between search results, even when a result is located on another page.

Important

In the NLP Annotation Lab, the search feature was implemented with the help of an HTML tag added to the Visual NER project configuration. Now that the NLP Lab implements search at task level, the previous search tag should be removed from existing Visual NER projects.

Config to be removed from all existing Visual NER projects:

<Search name="search" toName="image" placeholder="Search"/>
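
For anyone who keeps project configurations as XML strings outside the UI, the tag above could be stripped programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the function name remove_search_tags and the surrounding View/Image elements in the usage example are assumptions for illustration, not a documented NLP Lab API or a real project configuration.

```python
import xml.etree.ElementTree as ET


def remove_search_tags(config_xml: str) -> str:
    """Return the configuration with every nested <Search .../> element removed."""
    root = ET.fromstring(config_xml)
    # Materialize the element list first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "Search":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Hypothetical Visual NER configuration, used only to exercise the function.
example_config = (
    '<View>'
    '<Image name="image" value="$image"/>'
    '<Search name="search" toName="image" placeholder="Search"/>'
    '</View>'
)
cleaned_config = remove_search_tags(example_config)
```

After this call, cleaned_config still contains the Image element but no Search element. In practice the same edit can be made by hand in the project configuration editor.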

Keyword-based search in text tasks.

Keyword-based search in PDF/image tasks.

Chunk-based Search in Visual NER Tasks

In previous versions, users could only run token-based searches at page level. The search feature did not support searching a collection of tokens as a single chunk. With this release, users can find a chunk of tokens in the Visual NER task.
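
The difference between token-based and chunk-based search can be sketched as follows. This is an illustrative approximation, not the NLP Lab implementation: it assumes the OCR output of a page is available as a flat list of token strings, that a chunk is matched as a run of consecutive tokens, and that matching is case-insensitive. The name find_chunk is hypothetical.

```python
from typing import List, Tuple


def find_chunk(tokens: List[str], chunk: str) -> List[Tuple[int, int]]:
    """Return (start, end) token index ranges whose consecutive tokens spell out `chunk`."""
    words = [w.casefold() for w in chunk.split()]
    if not words:
        return []
    folded = [t.casefold() for t in tokens]
    n = len(words)
    # Slide a window of n tokens over the page and compare it with the chunk.
    return [
        (i, i + n)
        for i in range(len(folded) - n + 1)
        if folded[i:i + n] == words
    ]


page_tokens = ["Patient", "has", "type", "2", "diabetes", "and", "Type", "2", "diabetes"]
matches = find_chunk(page_tokens, "type 2 diabetes")
```

A token-based search corresponds to the special case where the chunk is a single word; the window comparison is what allows a multi-token phrase to be found as one unit.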

Getting Started is Easy

The NLP Lab is a free tool that can be deployed in a couple of clicks on the AWS and Azure Marketplaces, or installed on-premise with a one-line Kubernetes script. Get started here: https://nlp.johnsnowlabs.com/docs/en/alab/install
