Spark NLP in Action

Open Source

Recognize entities in text
Recognize Persons, Locations, Organizations and Misc entities using out of the box pretrained Deep Learning models based on GloVe (glove_100d) and BERT (ner_dl_bert) word embeddings.
Recognize more entities in text
Recognize over 18 entities such as Countries, People, Organizations, Products, Events, etc. using an out of the box pretrained NerDLApproach trained on the OntoNotes corpus.
Classify documents
Classify open-domain, fact-based questions into one of the following broad semantic categories: Abbreviation, Description, Entities, Human Beings, Locations or Numeric Values.
Analyze sentiment in movie reviews and tweets
Detect the general sentiment expressed in a movie review or tweet by using our pretrained Spark NLP DL classifier.
Detect emotions in tweets
Automatically identify Joy, Surprise, Fear and Sadness in tweets using our pretrained Spark NLP DL classifier.
Detect cyberbullying in tweets
Classify tweets as Racism, Sexism or Neutral using our pretrained emotions detector.
Detect sarcastic tweets
Check out our pretrained Spark NLP sarcasm detection model, which tells normal content apart from sarcastic content.
Detect toxic comments
Classify comments and tweets as Toxic, Insults, Hate, Obscene or Threat.
Identify Fake news
Determine if news articles are Real or Fake.
Detect Spam messages
Automatically identify messages as being regular messages or Spam.
Find text in a document
Find text in a document either by keyword or by regular expression.
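The two lookup modes can be illustrated with a minimal sketch in plain Python. It uses only the standard `re` module and is not the Spark NLP API; the helper names `find_by_keyword` and `find_by_regex` and the sample text are hypothetical, chosen only for illustration.

```python
import re

def find_by_keyword(text, keyword):
    """Return (start, end) character offsets of each case-insensitive keyword hit."""
    # re.escape makes the keyword match literally, even if it contains regex metacharacters.
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [(m.start(), m.end()) for m in pattern.finditer(text)]

def find_by_regex(text, pattern):
    """Return (start, end, matched_text) for each match of a regular expression."""
    return [(m.start(), m.end(), m.group(0)) for m in re.finditer(pattern, text)]

doc = "Invoice 2041 was paid. Invoice 2042 is still open."
keyword_hits = find_by_keyword(doc, "invoice")   # literal keyword search
regex_hits = find_by_regex(doc, r"\b\d{4}\b")    # any standalone four-digit number

print(keyword_hits)
print(regex_hits)
```

Returning character offsets alongside the matched text keeps each hit traceable to its position in the original document, which is the usual shape of an annotation.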
Grammar analysis & Dependency Parsing
Visualize the syntactic structure of a sentence as a directed labeled graph where nodes are labeled with the part of speech tags and arrows contain the dependency tags.
Split and clean text
Spark NLP pretrained annotators make it easy to process any type of text document. This demo showcases our Sentence Detector, Tokenizer, Stemmer, Lemmatizer, Normalizer and Stop Words Removal.
Spell check your text documents
The Spark NLP contextual spellchecker quickly identifies typos and spelling issues in any text document.
Detect Key Phrases
Automatically detect key phrases in your text documents using out-of-the-box Spark NLP models.
Detect similar sentences
Automatically compute the similarity between two sentences using Spark NLP Universal Sentence Embeddings.
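As a rough illustration of how two sentence embeddings can be compared, the sketch below computes cosine similarity in plain Python. Cosine similarity is a common choice for embedding comparison, but it is an assumption here; the metric used by the demo is not stated. The three short vectors are made-up stand-ins for real sentence embeddings, which have hundreds of dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen for this sketch: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings, not output from any real model.
emb_a = [1.0, 2.0, 2.0]
emb_b = [2.0, 4.0, 4.0]   # same direction as emb_a
emb_c = [2.0, -1.0, 0.0]  # orthogonal to emb_a

print(cosine_similarity(emb_a, emb_b))
print(cosine_similarity(emb_a, emb_c))
```

Because the score depends only on direction, two embeddings that differ in magnitude but point the same way are treated as maximally similar.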
Detect toxic content in comments
Automatically detect identity hate, insult, obscene, severe toxic, threat or toxic content in social media comments using our out-of-the-box Spark NLP Multiclassifier DL.
Aspect based sentiment analysis for restaurants
Automatically detect positive, negative and neutral aspects about restaurants from the written feedback given by reviewers.
Detect sentences in text
Detect sentences from general purpose text documents using a deep learning model capable of understanding noisy sentence structures.
Detect and normalize dates
Automatically detect key phrases expressing dates and normalize them with respect to a reference date.
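Normalizing with respect to a reference date means resolving a relative expression into an absolute calendar date. The sketch below shows the idea in plain Python with the standard `datetime` module; it is not the Spark NLP implementation, and the function name `normalize_date`, the `RELATIVE_OFFSETS` table and the small set of supported phrases are assumptions made for illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical lookup of single-word relative expressions to day offsets.
RELATIVE_OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}

def normalize_date(phrase, reference):
    """Resolve a relative date phrase to an ISO 8601 string, or None if unsupported."""
    key = phrase.strip().lower()
    if key in RELATIVE_OFFSETS:
        return (reference + timedelta(days=RELATIVE_OFFSETS[key])).isoformat()
    # Handle "N days ago" / "1 day ago".
    match = re.fullmatch(r"(\d+) days? ago", key)
    if match:
        return (reference - timedelta(days=int(match.group(1)))).isoformat()
    return None  # phrase not covered by this sketch

reference_date = date(2021, 3, 1)
for phrase in ["yesterday", "3 days ago", "next week"]:
    print(phrase, "->", normalize_date(phrase, reference_date))
```

The same phrase resolves to different dates under different reference dates, which is why the reference has to be supplied explicitly rather than taken from the system clock.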

Languages

Detect language
Spark NLP Language Detector offers support for 20 different languages: Bulgarian, Czech, German, Greek, English, Spanish, Finnish, French, Croatian, Hungarian, Italian, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Swedish, Turkish, and Ukrainian.
Recognize entities in English text
Recognize Persons, Locations, Organizations and Misc entities using out of the box pretrained Deep Learning models based on GloVe (glove_100d) and BERT (ner_dl_bert) word embeddings.
Recognize entities in French text
Recognize Persons, Locations, Organizations and Misc entities using an out of the box pretrained Deep Learning model and GloVe word embeddings (glove_100d).
Recognize entities in German text
Recognize Persons, Locations, Organizations and Misc entities using an out of the box pretrained Deep Learning model and GloVe word embeddings (glove_300d).
Recognize entities in Italian text
Recognize Persons, Locations, Organizations and Misc entities using an out of the box pretrained Deep Learning model and GloVe word embeddings (glove_300d).
Recognize entities in Norwegian text
Recognize Persons, Locations, Organizations and Misc entities using 3 different out of the box pretrained Deep Learning models based on different GloVe word embeddings (glove_100d & glove_300d).
Recognize entities in Polish text
Recognize Persons, Locations, Organizations and Misc entities using 3 different out of the box pretrained Deep Learning models based on different GloVe word embeddings (glove_100d & glove_300d).
Recognize entities in Portuguese text
Recognize Persons, Locations, Organizations and Misc entities using 3 different out of the box pretrained Deep Learning models based on different GloVe word embeddings (glove_100d & glove_300d).
Recognize entities in Russian text
Recognize Persons, Locations, Organizations and Misc entities using 3 different out of the box pretrained Deep Learning models based on different GloVe word embeddings (glove_100d & glove_300d).
Recognize entities in Spanish text
Recognize Persons, Locations, Organizations and Misc entities using 3 different out of the box pretrained Deep Learning models based on different GloVe word embeddings (glove_100d & glove_300d).
Recognize entities in Danish text
Recognize Persons, Locations, Organizations and Misc entities using an out of the box pretrained Deep Learning model and GloVe word embeddings (glove_300d).
Recognize entities in Swedish text
Recognize Persons, Locations, Organizations and Misc entities using an out of the box pretrained Deep Learning model and GloVe word embeddings (glove_300d).
Recognize entities in Finnish text
Recognize Persons, Locations, Organizations and Misc entities using an out of the box pretrained Deep Learning model and GloVe word embeddings (glove_300d).
Prebuilt pipeline for entity recognition in Danish
This Spark NLP out-of-the-box pipeline returns tokens, lemmas, POS tags, embeddings and named entities in one line of code. It automatically recognizes Persons, Locations, Organizations and Misc entities in Danish text.
Prebuilt pipeline for entity recognition in Swedish
This Spark NLP out-of-the-box pipeline returns tokens, lemmas, POS tags, embeddings and named entities in one line of code. It automatically recognizes Persons, Locations, Organizations and Misc entities in Swedish text.
Prebuilt pipeline for entity recognition in Finnish
This Spark NLP out-of-the-box pipeline returns tokens, lemmas, POS tags, embeddings and named entities in one line of code. It automatically recognizes Persons, Locations, Organizations and Misc entities in Finnish text.
Recognize entities in Turkish text
Recognize Persons, Locations and Organizations entities using an out of the box pretrained Deep Learning model and multilingual BERT word embeddings.

Healthcare

Detect signs and symptoms
Automatically identify Signs and Symptoms in clinical documents using two of our pretrained Spark NLP clinical models.
Detect diagnosis and procedures
Automatically identify diagnoses and procedures in clinical documents using the pretrained Spark NLP clinical model ner_clinical.
Detect drugs and prescriptions
Automatically identify Drug, Dosage, Duration, Form, Frequency, Route, and Strength details in clinical documents using three of our pretrained Spark NLP clinical models.
Detect risk factors
Automatically identify risk factors such as Coronary artery disease, Diabetes, Family history, Hyperlipidemia, Hypertension, Medications, Obesity, PHI, Smoking habits in clinical documents using our pretrained Spark NLP model.
Detect anatomical references
Automatically identify Anatomical System, Cell, Cellular Component, Anatomical Structure, Immaterial Anatomical Entity, Multi-tissue Structure, Organ, Organism Subdivision, Organism Substance, Pathological Formation in clinical documents using our pretrained Spark NLP model.
Detect demographic information
Automatically identify demographic information such as Date, Doctor, Hospital, ID number, Medical record, Patient, Age, Profession, Organization, State, City, Country, Street, Username, Zip code, Phone number in clinical documents using three of our pretrained Spark NLP models.
Detect clinical events
Automatically identify a variety of clinical events such as Problems, Tests, Treatments, Admissions or Discharges, in clinical documents using two of our pretrained Spark NLP models.
Detect lab results
Automatically identify Lab test names and Lab results from clinical documents using our pretrained Spark NLP model.
Detect tumor characteristics
Automatically identify tumor characteristics such as Anatomical systems, Cancer, Cells, Cellular components, Genes and gene products, Multi-tissue structures, Organs, Organisms, Organism subdivisions, Simple chemicals, Tissues from clinical documents using our pretrained Spark NLP model.
Spell checking for clinical documents
Automatically identify from clinical documents using our pretrained Spark NLP model ner_bionlp.
Detect posology relations
Automatically identify relations between drugs, dosage, duration, frequency and strength using our pretrained clinical Relation Extraction (RE) model.
Detect causality between symptoms and treatment
Automatically identify relations between symptoms and treatment using our pretrained clinical Relation Extraction (RE) model.
Detect temporal relations for clinical events
Automatically identify three types of relations between clinical events: After, Before and Overlap using our pretrained clinical Relation Extraction (RE) model.
SNOMED coding
Automatically resolve the SNOMED code corresponding to the diseases and conditions mentioned in your health record using Spark NLP for Healthcare out of the box.
ICDO coding
Automatically detect the tumor in your healthcare records and link it to the corresponding ICDO code using Spark NLP for Healthcare out of the box.
ICD10-CM coding
Automatically detect pre- and post-operative diagnoses, signs and symptoms, or other findings in your healthcare records and link them to the corresponding ICD10-CM codes using Spark NLP for Healthcare out of the box.
RxNORM coding
Automatically detect the drug and treatment names mentioned in your prescriptions or healthcare records and link them to the corresponding RxNORM codes using Spark NLP for Healthcare out of the box.
Detect demographics and vital signs using rules
Automatically detect demographic information as well as vital signs using our out-of-the-box Spark NLP Contextual Rules. Custom rules are very easy to define and run on your own data.
Detect chemical compounds and genes
Automatically detect all chemical compounds and gene mentions using our pretrained chemprot model included in Spark NLP for Healthcare.
Detect genes and human phenotypes
Automatically detect mentions of genes and human phenotypes (hp) in medical text using Spark NLP for Healthcare pretrained models.
Detect normalized genes and human phenotypes
Automatically detect normalized mentions of genes (go) and human phenotypes (hp) in medical text using Spark NLP for Healthcare pretrained models.
ICD10 coding for German
Automatically detect pre- and post-operative diagnoses, signs and symptoms in your German healthcare records and link them to the corresponding ICD10-CM codes using Spark NLP for Healthcare out of the box.
Detect symptoms, treatments and other NERs in German
Automatically identify entities such as symptoms, diagnoses, procedures, body parts or medication in German clinical text using the pretrained Spark NLP clinical model ner_healthcare.
Detect legal entities in German
Automatically identify entities such as persons, judges, lawyers, countries, cities, landscapes, organizations, courts, trademark laws, contracts, etc. in German legal text using the pretrained Spark NLP model ner_legal.
Adverse drug events tagger
An automatic pipeline that tags documents according to whether they contain descriptions of adverse events, then identifies those events.
Identify diagnosis and symptoms assertion status
Automatically detect if a diagnosis or a symptom is present, absent, uncertain or associated with other persons (e.g. family members).
Detect cell structure, DNA, RNA and protein
Automatically detect cell type, cell line, DNA and RNA information using our pretrained Spark NLP for Healthcare model.
Link entities to Wikipedia pages
Automatically disambiguate people’s names based on their context and link them to corresponding Wikipedia pages using out of the box Spark NLP pretrained models.
Detect sentences in healthcare documents
Automatically detect sentences in noisy healthcare documents with our pretrained Sentence Splitter DL model.
Classify medical text according to PICO framework
Automatically classify medical text into PICO components: Participants/Problem, Intervention, Comparison, and Outcome.
Detect chemical compounds
Automatically detect all types of chemical compounds using our pretrained Spark NLP for Healthcare model.
Detect bacteria, plants, animals or general species
Automatically detect bacteria, plants, animals, and other species using our pretrained Spark NLP for Healthcare model.
Generate SQL queries from natural language
Automatically generate valid SQL queries from raw text using our unique DL generative model.
Detect traffic information in text
Automatically extract geographical location, postal codes, and traffic routes in German text using our pretrained Spark NLP model.

Spark OCR

PDF to Text
Extract text from generated/selectable PDF documents and keep the original structure of the document by using our out-of-the-box Spark OCR library.
DICOM to Text
Recognize text from DICOM format documents. This feature extracts both the text on the image and the text from the metadata file.
Image to Text
Recognize text in images and scanned PDF documents by using our out-of-the-box Spark OCR library.
Remove background noise from scanned documents
Removing the background noise from a scanned document greatly improves OCR results. Spark OCR is the only library that allows you to fine-tune the image preprocessing for excellent OCR results.
Correct skewness in scanned documents
Correcting the skewness of your scanned documents greatly improves OCR results. Spark OCR is the only library that allows you to fine-tune the image preprocessing for excellent OCR results.
Recognize text in natural scenes
By using image segmentation and preprocessing techniques Spark OCR recognizes and extracts text from natural scenes.
Recognize entities in scanned PDFs
An end-to-end example of a regular NER pipeline: import scanned images from cloud storage, preprocess them to improve their quality, recognize text using Spark OCR, correct spelling mistakes to improve the OCR results, and finally run NER to extract entities.
Extract tables from PDFs
Extract tables from selectable PDF documents with the new features offered by Spark OCR.

De-identification

Deidentify structured data
Deidentify PHI in structured datasets using out of the box Spark NLP functionality that enforces GDPR and HIPAA compliance, while maintaining linkage of clinical data across files.
Deidentify free text documents
Deidentify free text documents by either masking or obfuscating PHI using out of the box Spark NLP models that enforce GDPR and HIPAA compliance.
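The difference between masking and obfuscating can be shown with a small plain-Python sketch. It assumes the PHI spans have already been detected and are supplied as (start, end, label) tuples; detection itself is out of scope. The function names, the labels and the surrogate values are hypothetical and are not part of the Spark NLP API.

```python
def replace_spans(text, spans, replacement_for):
    """Rebuild text with every (start, end, label) span replaced.

    `replacement_for` maps a label to the string that should stand in for it.
    """
    parts = []
    cursor = 0
    for start, end, label in sorted(spans):
        parts.append(text[cursor:start])      # keep the untouched text before the span
        parts.append(replacement_for(label))  # substitute the PHI span
        cursor = end
    parts.append(text[cursor:])               # keep the tail after the last span
    return "".join(parts)

def mask(text, spans):
    """Masking: replace each PHI span with a placeholder naming its entity type."""
    return replace_spans(text, spans, lambda label: f"<{label}>")

def obfuscate(text, spans, surrogates):
    """Obfuscation: replace each PHI span with a realistic-looking fake value."""
    return replace_spans(text, spans, lambda label: surrogates[label])

note = "John Smith was admitted on 2020-01-05."
phi_spans = [(0, 10, "NAME"), (27, 37, "DATE")]         # assumed detector output
fake_values = {"NAME": "Jane Doe", "DATE": "2020-02-11"}  # made-up surrogates

print(mask(note, phi_spans))
print(obfuscate(note, phi_spans, fake_values))
```

Masking makes the redaction visible to the reader, while obfuscation keeps the text natural-looking, which can matter when the deidentified documents feed downstream models.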
Deidentify DICOM documents
Deidentify DICOM documents by masking PHI information on the image and by either masking or obfuscating PHI from the metadata.
De-identify PDF documents - HIPAA Compliance
De-identify PDF documents using HIPAA guidelines by masking PHI information using out of the box Spark NLP models.
De-identify PDF documents - GDPR Compliance
De-identify PDF documents using GDPR guidelines by anonymizing PHI information using out of the box Spark NLP models.