
Docusign

“A Unified CV, OCR, and NLP approach for scalable document understanding”


INDUSTRY: Finance

Introduction: “DocuSign has been on a mission to accelerate business and simplify life for companies and people around the world. The company pioneered the development of e-signature technology, and today DocuSign helps organizations connect and automate how they prepare, sign, act on, and manage agreements.

The DocuSign team was looking to automate the extraction of structured information from document images.
Examples:

  • Contracts
  • Tax forms
  • Passport applications
  • Invoices
  • etc.”

Challenge: “The team faced 3 main challenges:

  • High and growing variation in layout
  • Unbounded field type complexity
  • Unstructured information

Computer vision (CV) poses its own unique challenges:

  • Size variation: objects may be very small, very large, or somewhere in between; they can be densely packed or relatively sparse; and they can have arbitrary aspect ratios
  • Context: Objects can exhibit both long and short contextual dependencies
  • Density”

Solution: “DocuSign partnered with John Snow Labs to leverage its award-winning Spark NLP & OCR.

Humans create documents in whatever format best suits their immediate needs. Therefore, rules-based engines (template-based, position-based) will not scale. The ideal solution is to learn high-level representations from data using AI. This is where Spark NLP & OCR steps in.”
