Spark OCR Blog

State-of-the-Art Visual Document Understanding and Form Understanding.

Visual Document Understanding using John Snow Labs’ Spark OCR

Automatically extracting insights from documents is a challenging task, and John Snow Labs Spark OCR uses several NLP, computer vision, and Spark techniques to enhance the user experience,...

NLP for Finance – Automated Invoice Classification for Submission Compliance

Our Finance department receives thousands of invoices every month, which need to be categorized before submission to the bank. This is a manual process that is extremely labor-intensive...

Stepping Up Information Extraction Capabilities for Virginia Tech with Spark OCR

John Snow Labs is well known for helping healthcare and life science companies build, deploy, and operate AI products and services with its Spark NLP, one of the most widely...

Accurate Table Extraction from Documents & Images with Spark OCR

Extracting data formatted as a table (tabular data) is a common task — whether you’re analyzing financial statements, academic research papers, or clinical trial documentation. Table-based information varies heavily in...

New Spark OCR 3.12: Handwritten Text Recognition and Spark 3.2 support

This release comes with new models for Handwritten Text Recognition, Spark 3.2 support, bug fixes, and notebook examples. Added a new parameter 'mergeIntersects' to ImageTextDetectorV2: merge bounding boxes corresponding to...