Get Started

NLP Libraries
State-of-the-art Natural Language Processing in Python, Java & Scala:
  • Spark NLP
  • Python’s NLU
  • 4,000+ Pretrained models
30-Day Trial
  • Healthcare NLP
  • Spark OCR
Annotation Lab
A highly efficient annotation tool for enterprise teams:
  • Unlimited projects, users, documents, models
  • Teams, Analytics, Workflows, Audit
  • No-Code Model Training & Active Learning
30-Day Trial
  • Train & Tune Healthcare Models
  • Annotate Text on Images, Forms, and PDF
NLP Server
No-Code user interface and API for running your NLP workloads:
  • Unlimited servers, documents, models
  • Run on your own air-gapped infrastructure
  • 4,000+ Pretrained models & pipelines
30-Day Trial (COMING SOON)
  • Healthcare NLP
  • Spark OCR

Frequently Asked Questions


Free, forever, unlimited, for personal and commercial use. Spark NLP is released under an Apache 2.0 open-source license – including the pre-trained models and documentation.

Each license includes the software libraries in all supported languages, the pre-trained models that are included with it, premium support, and all updates to the software & models that are released during the subscription period.

Spark NLP for Healthcare and Spark NLP & OCR are licensed as an annual subscription, payable once a year in full. There are two license types: Per Server, which allows use of the software on one machine; and Per Cluster, which allows use of the software on one Apache Spark cluster of unlimited size.

No. The only limitation is that each license allows using the software on one server or one cluster, based on the license type you choose.

The software will stop processing documents – for both training and inference. If you choose to buy a license, we will provide you with new credentials that will reactivate it. Otherwise, you must uninstall the software. In either case, data you have already processed is yours to keep.

Running the Software

Python, Java, and Scala.

Spark 2.3.x and 2.4.x.

We officially support AWS, Azure, Databricks, Cloudera, and GCP.

Yes. Spark NLP is used heavily in high-compliance industries like healthcare, life science, finance, and insurance where on-premise deployments are common. Most single-machine, Spark, Hadoop, and Kubernetes distributions are supported.

Yes. Make sure to allocate enough memory & compute power for your use case.


This depends heavily on your use case. For training custom models based on the BERT family of embeddings, at least 8 cores and 64GB of memory are recommended. For inference, as little as 1 core and 8GB of memory may be enough. Using GPUs will usually provide faster execution at a higher cost.
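As a rough illustration of the sizing guidance above, the following Python sketch checks whether a machine meets the quoted minimums. The names `Sizing`, `RECOMMENDED`, and `meets_recommendation` are hypothetical and are not part of Spark NLP; the numbers are simply the ones quoted in the answer and should be treated as starting points rather than guarantees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sizing:
    """Minimum resources for a workload (illustrative helper, not a Spark NLP type)."""
    cores: int
    memory_gb: int


# Figures taken from the answer above; real needs depend on the use case.
RECOMMENDED = {
    "bert_training": Sizing(cores=8, memory_gb=64),
    "inference": Sizing(cores=1, memory_gb=8),
}


def meets_recommendation(workload: str, cores: int, memory_gb: float) -> bool:
    """Return True if the machine has at least the quoted minimum for the workload."""
    rec = RECOMMENDED[workload]
    return cores >= rec.cores and memory_gb >= rec.memory_gb


if __name__ == "__main__":
    # A 4-core, 16GB machine: compare against both quoted minimums.
    print(meets_recommendation("inference", cores=4, memory_gb=16))
    print(meets_recommendation("bert_training", cores=4, memory_gb=16))
```

A check like this only compares against the quoted minimums; it says nothing about throughput, and GPU use is not modeled.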


The cost depends on which edition you need (Healthcare or OCR), the type of license (per server or per cluster), the level of support (8x5 or 24x7), and the number of licenses you need. Please email us with those details and we’ll reply with an exact quote.

Online bank transfers (ACH or wire), checks, and all major credit cards.

Yes! Please email us to describe your situation and needs.


No. You install and run the software on your infrastructure. The software does not “call home” and no data or results are sent to John Snow Labs.

You do. We will never even see them.

This is not a SaaS solution – instead, you run the software on your infrastructure. Nothing ever gets sent to John Snow Labs or another third party. Spark NLP is designed for high-compliance, locked-down environments.

No, not once the initial installation and download of pre-trained models are complete.



Yes. Spark NLP is designed to enable you to train & tune your own models for most tasks.

The full list is available here. Expect the list to keep growing over time.


Email, call us at +1-302-786-5227, or start a chat. Paying customers get a private Slack channel, so they can ask their questions privately.

Same business day 8x5 support is included with all subscriptions. We can also provide 24x7 support for production systems – please email us if you require it.

Yes. Spark NLP in Action includes links to runnable Google Colab notebooks in Python.