Skip to main content
was successfully added to your cart.

Visual NLP

Understand Visual Documents with High-Accuracy OCR, Form Summarization, Table Extraction, PDF Parsing, and more

By all accounts, John Snow Labs has created the most accurate software in history to extract facts from unstructured text.

Healthcare Tech Outlook

Proven Customer Success

NLP for Finance – Automated Invoice Classification for Submission Compliance

Learn more

Interpreting millions of patient stories with deep learned OCR and NLP

Learn more

A unified CV, OCR, and NLP approach for scalable document understanding at DocuSign

Learn more

What’s in the box

Trainable & Tunable
Healthcare Data Market
Scalable to a Cluster
Healthcare Data Market
Fast Inference
Healthcare Data Market
Hardware Optimized
Healthcare Data Market
Healthcare Data Market
Community
Healthcare Data Market

Visual NLP in Action

De-identify Images, PDF, and DICOM files

Combine computer vision, OCR, and NLP models to classify documents, extract normalized entities and figures, find signatures on forms, extract data from tables, and de-identify images.

AI in Medical Field
Extract data from images & forms

Extract and normalize specific facts & figures from custom images and forms, by training your own models to learn where in the image, next to which words, and using what formatting the facts you’re interested in are.

AI in Medical Field
Extract whole tables

Find tables in images, visually identify rows and columns, and extract data from cells into data frames. Turn scans from financial disclosures, academic papers, lab results and more into usable data.

AI in Medical Field
Recognize entities in scanned PDFs

End-to-end example of regular NER pipeline: import scanned images from cloud storage, preprocess them for improving their quality, recognize text using Spark OCR, correct the spelling mistakes for improving OCR results and finally run NER for extracting entities.

AI in Medical Field
AI & NLP in Healthcare
AI & NLP in Healthcare
Correct skewness in scanned documents

Correct the skewness of your scanned documents will highly improve the results of the OCR. Spark OCR is the only library that allows you to finetune the image preprocessing for excellent OCR results.

AI in Medical Field
Recognize text in natural scenes

By using image segmentation and preprocessing techniques Spark OCR recognizes and extracts text from natural scenes.

AI in Medical Field
AI in Healthcare
Remove background noise from scanned documents

Removing the background noise in a scanned document will highly improve the results of the OCR. Spark OCR is the only library that allows you to finetune the image preprocessing for excellent OCR results.

AI in Medical Field
DICOM to Text

Recognize text from DICOM format documents. This feature explores both the text on the image and the text from the metadata file.

AI in Medical Field
Advanced Analytics And AI
Extract Signatures & Dates from Signed Forms

Detect signatures in image-based documents.

AI in Medical Field