Spark NLP for Healthcare


John Snow Labs is emerging as the clear industry leader for state-of-the-art NLP in healthcare.

We cannot recommend a better way to apply the most current, accurate, and scalable technology to your natural language understanding challenges today.

Editor-in-Chief, The Technology Headlines, Sep 2019

The most widely used NLP library in Healthcare, by far


What’s in the Box

Clinical Entity Recognition
40 units [DOSAGE] of insulin glargine [DRUG] at night [FREQUENCY]
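
To make the annotated example above concrete, here is a minimal Python sketch that reproduces the same three labeled spans with hand-written regular expressions. It is a toy illustration only: the patterns and the `tag_entities` function are assumptions invented for this example, and it neither uses nor represents the Spark NLP for Healthcare API or its pretrained models.

```python
import re

# Hypothetical hand-written patterns, one per label (illustration only).
PATTERNS = [
    ("DOSAGE", re.compile(r"\b\d+(?:\.\d+)?\s+units?\b", re.IGNORECASE)),
    ("DRUG", re.compile(r"\binsulin glargine\b", re.IGNORECASE)),
    ("FREQUENCY", re.compile(r"\bat night\b", re.IGNORECASE)),
]


def tag_entities(text):
    """Return (start, end, label, matched_text) tuples sorted by position."""
    spans = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label, match.group()))
    return sorted(spans)


if __name__ == "__main__":
    for start, end, label, chunk in tag_entities("40 units of insulin glargine at night"):
        print(f"{chunk!r:20} {label:10} [{start}:{end}]")
```

Each result carries character offsets as well as the label, which is the usual shape of entity recognition output: downstream steps such as linking or assertion status operate on these spans rather than on the raw text.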
Algorithms
Extract Knowledge
  • Entity Linker
  • Entity Disambiguator
  • Document Classifier
  • Contextual Parser
Split Text
  • Sentence Detector
  • Deep Sentence Detector
  • Tokenizer
  • nGram Generator
Clinical Grammar
  • Stemmer
  • Lemmatizer
  • Part of Speech Tagger
  • Dependency Parser
Clinical Entity Linking
Suspect diabetes: SNOMED-CT 473127005
Lisinopril 10 MG: RxNorm 316151
Hyponatremia: ICD-10 E87.1
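
Entity linking maps a recognized text chunk to a code in a standard terminology. The sketch below shows the idea as a plain dictionary lookup over the three examples above. The `TERMINOLOGY` table and `link_entity` function are hypothetical names for this illustration; a real linker has to rank many candidate codes, which this exact-match toy does not attempt, and it is not the Spark NLP for Healthcare API.

```python
# Lookup table built only from the three examples shown above.
TERMINOLOGY = {
    "suspect diabetes": ("SNOMED-CT", "473127005"),
    "lisinopril 10 mg": ("RxNorm", "316151"),
    "hyponatremia": ("ICD-10", "E87.1"),
}


def link_entity(chunk):
    """Return (terminology, code) for a chunk, or None if it is unknown.

    Matching is exact after lowercasing and collapsing whitespace.
    """
    key = " ".join(chunk.lower().split())
    return TERMINOLOGY.get(key)


if __name__ == "__main__":
    for chunk in ["Suspect diabetes", "Lisinopril 10 MG", "Hyponatremia", "Asthma"]:
        print(chunk, "->", link_entity(chunk))
```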
Algorithms
De-Identify Text
  • Structured Data
  • Unstructured Text
  • Obfuscator
  • Generalizer
Clean Medical Text
  • Spell Checking
  • Spell Correction
  • Normalizer
  • Stopword Cleaner
Find in Text
  • Text Matcher
  • Regex Matcher
  • Date Matcher
  • Chunker
Assertion Status
Fever and sore throat: PRESENT
No stomach pain: ABSENT
Father with Alzheimer: FAMILY
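
Assertion status answers whether a mentioned condition is present, absent, or attributed to someone other than the patient. The following sketch mimics the three examples above with simple cue-word rules. The cue lists and the `assertion_status` function are assumptions made for this illustration; it is a rule-based stand-in, not the method or API used by Spark NLP for Healthcare.

```python
import re

# Hypothetical cue lists (assumptions for this sketch, far from exhaustive).
NEGATION_CUES = re.compile(r"\b(no|not|without|denies)\b", re.IGNORECASE)
FAMILY_CUES = re.compile(r"\b(father|mother|brother|sister|family)\b", re.IGNORECASE)


def assertion_status(sentence):
    """Classify a sentence as FAMILY, ABSENT, or PRESENT using cue words."""
    # Family cues are checked first so that a relative's condition is
    # never attributed to the patient.
    if FAMILY_CUES.search(sentence):
        return "FAMILY"
    if NEGATION_CUES.search(sentence):
        return "ABSENT"
    return "PRESENT"


if __name__ == "__main__":
    for text in ["Fever and sore throat", "No stomach pain", "Father with Alzheimer"]:
        print(f"{text}: {assertion_status(text)}")
```

Cue-word rules like these break down on longer sentences where the negation applies to only one of several findings, which is why the scope of a cue matters in practice.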
Content
Medical Transformers
  • JSL-BERT-Clinical
  • BioBERT
  • ClinicalBERT
  • GloVe-Med
  • GloVe-ICD-O
Relation Extraction
Ora [NAME] a 25 [AGE] yo cashier [PROFESSION] from Morocco [LOCATION]
Content
Linked Medical Terminologies
  • SNOMED-CT
  • CPT
  • ICD-10-CM
  • RxNorm
  • ICD-10-PCS
  • ICD-O
  • LOINC

50+ Pretrained Models

Clinical:

Signs, Symptoms, Treatments, Procedures, Tests, Labs, Sections

Anatomy:

Organ, Subdivision, Cell, Structure, Organism, Tissue, Gene, Chemical

Drugs:

Name, Dosage, Strength, Route, Duration, Frequency

Demographics:

Age, Gender, Height, Weight, Race, Ethnicity, Marital Status, Vital Signs

Risk Factors:

Smoking, Obesity, Diabetes, Hypertension, Substance Abuse

Sensitive Data:

Patient Name, Address, Phone, Email, Dates, Providers, Identifiers

Trainable & Tunable
Scalable to a Cluster
Fast Inference
Hardware Optimized
Community

Spark NLP for Healthcare in action

Clinical Entity Recognition
Clinical Entity Linking
Assertion Status
Relation Extraction
De-Identification

State Of The Art Accuracy

Production-Grade, Fast & Trainable Implementation of State-of-the-Art Biomedical NLP Research
“BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”, Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, CoRR, 2018.
“Entity Recognition from Clinical Texts via Recurrent Neural Network”, Liu et al., BMC Medical Informatics & Decision Making, July 2017.
“CNN-based ranking for biomedical entity normalization”, Li et al., BMC Bioinformatics, October 2017.
“Neural Networks For Negation Scope Detection”, Fancellu et al., Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016.
“How to Train Good Word Embeddings for Biomedical NLP”, Billy Chiu, Gamal Crichton, Anna Korhonen, Sampo Pyysalo, Proceedings of the 15th Workshop on Biomedical Natural Language Processing, 2016.

Proven success across healthcare