
Next-Gen Table Extraction from Visual Documents: Leveraging Multimodal AI

Explore the latest advancements in multimodal AI for extracting tabular data from visual documents. This session will delve into novel methods implemented in John Snow Labs’ Visual NLP library, which have significantly improved the accuracy of information extraction and question answering from tables in PDFs and image files.

The webinar will cover a range of practical applications, demonstrating how this technology handles complex documents such as financial disclosures, clinical trial results, insurance rates, lab scores, and academic research. The focus will be on zero-shot models, where the AI model directly interprets and responds to queries from source images, eliminating the need for specialized training or tuning.
We’ll also cover Visual NLP capabilities that have been specifically designed to enhance table extraction quality, especially in challenging cases like multi-line cells or borderless tables. We’ll discuss the technical underpinnings of this feature, including the integration of computer vision and optical character recognition for detecting tables and the individual cells within them. We’ll also touch upon how this extends to tables with custom borders, dark & noisy backgrounds, uncommon table layouts, multilingual text, and international number & currency formats.

This webinar is ideal for professionals and researchers who face the challenge of converting complex visual data into actionable insights. Attendees will leave with a deeper understanding of how these cutting-edge AI models can be applied in various fields to improve data accessibility and analysis efficiency.

About the speaker

Alberto Andreotti
Senior Data Scientist at John Snow Labs

Alberto Andreotti is a data scientist at John Snow Labs, specializing in Machine Learning, Natural Language Processing, and Distributed Computing. With a background in Computer Engineering, he has expertise in developing software for both Embedded Systems and Distributed Applications. Alberto is skilled in Java and C++ programming, particularly for mobile platforms. His focus includes Machine Learning, High-Performance Computing (HPC), and Distributed Systems, making him a pivotal member of the John Snow Labs team.