State of the Art Natural Language Processing

John Snow Labs’ Spark NLP is an open-source text processing library for Python, Java, and Scala.

It provides production-grade, scalable, and trainable versions of the latest research in
natural language processing.

Unmatched Speed & Scale

Training locally on 2.6 MB of data, Spark NLP was 80x faster than spaCy.

Scale to a Spark cluster with zero code changes.

State of the Art Accuracy

First production-grade versions of novel deep learning NLP research.

Start from pre-trained models and train them further to fit your data.

Most Widely Used in the Enterprise

Widely deployed production-grade codebase.

New releases every 2 weeks since 2017.

Growing community.

The most widely used NLP library in the Enterprise, by far

Why Spark NLP?

Accuracy

Spark NLP delivered the best performing results on academic peer-reviewed benchmarks.

Scalability

No code changes are needed to scale a pipeline to any Spark cluster.

Speed

Optimized builds for the latest Intel & Nvidia chips enable the fastest training & tuning of state-of-the-art models.

Out Of The Box Functionality

Algorithms
Split Text
  • Sentence Detector
  • Deep Sentence Detector
  • Tokenizer
  • nGram Generator
Understand Grammar
  • Stemmer
  • Lemmatizer
  • Part of Speech Tagger
  • Dependency Parser
Clean Text
  • Spell Checking
  • Spell Correction
  • Normalizer
  • Stopword Cleaner
Find in Text
  • Text Matcher
  • Regex Matcher
  • Date Matcher
  • Chunker
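
The annotators listed above are typically chained as stages of a Spark ML pipeline, with each stage reading a column that an earlier stage produced. The following is a minimal sketch of that wiring, assuming the `spark-nlp` and `pyspark` Python packages are installed. The column names and the helpers `STAGES`, `check_wiring`, and `build_pipeline` are illustrative choices for this example and are not part of Spark NLP.

```python
# Each entry: (annotator class name, input columns, output column).
# The class names are Spark NLP annotators; the column names are
# arbitrary choices made for this sketch.
STAGES = [
    ("DocumentAssembler", ["text"], "document"),
    ("SentenceDetector", ["document"], "sentence"),
    ("Tokenizer", ["sentence"], "token"),
    ("Normalizer", ["token"], "normalized"),
    ("Stemmer", ["normalized"], "stem"),
]


def check_wiring(stages, source_cols=("text",)):
    """Return True if every stage reads only columns that already exist."""
    available = set(source_cols)
    for _name, inputs, output in stages:
        if not set(inputs) <= available:
            return False
        available.add(output)
    return True


def build_pipeline(stages=STAGES):
    """Instantiate the stages as a pyspark.ml Pipeline.

    Imports are deferred so the stage list can be inspected without Spark.
    Call this only after a Spark session exists (for example one created
    with sparknlp.start()), because annotators are backed by JVM objects.
    """
    from pyspark.ml import Pipeline
    import sparknlp.annotator as annotator
    import sparknlp.base as base

    built = []
    for name, inputs, output in stages:
        if name == "DocumentAssembler":
            # DocumentAssembler reads a single raw-text column.
            stage = base.DocumentAssembler().setInputCol(inputs[0]).setOutputCol(output)
        else:
            stage = getattr(annotator, name)().setInputCols(inputs).setOutputCol(output)
        built.append(stage)
    return Pipeline(stages=built)
```

With a session from `sparknlp.start()` and a DataFrame `df` that has a `text` column, `build_pipeline().fit(df).transform(df)` would add one annotation column per stage. The stages chosen here are rule-based, so under that assumption no pre-trained model needs to be downloaded.
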
Content
Transformers
  • GloVe
  • ELMo
  • BERT
  • ALBERT
  • XLNet
  • USE
  • Small BERT
  • ELECTRA
  • BioBERT
  • LaBSE
Pre-trained Models: 250+
46 Languages
Pre-trained Pipelines: 90+
Functionality
  • Trainable & Tunable
  • Scalable to a Cluster
  • Fast Inference
  • Hardware Optimized
  • Community

Trainable to understand your language

Spark NLP is optimized for training domain-specific NLP models, so you can adapt it to learn the nuances of jargon and documents you must support.


Introducing Spark NLP at Top Level AI Conferences

Strata

Spark NLP: How Roche automates knowledge extraction from pathology and radiology reports

Strata

Spark NLP in action: Intelligent, high-accuracy fact extraction from long financial documents

Strata

Spark NLP in action: How SelectData uses AI to better understand home health patients