John Snow Labs NLP on Databricks

Start with a 30-day free trial with no limit on the amount of processed data. Provide a Databricks access token so that the John Snow Labs NLP libraries can be installed in your Databricks workspace, on a cluster of your choice.
Get full access to:
Spark NLP
State-of-the-art natural language processing for Python, Java, or Scala
Healthcare NLP
State-of-the-art clinical & biomedical natural language processing
Spark OCR
Scalable, private, and highly accurate OCR and de-identification library
20+ ready-to-use Jupyter notebooks
Then, Pay Only for What You Use
  • Switch to a pay-as-you-go subscription after the trial
  • Priced at $4.95/DBU
  • Billed once a month directly by John Snow Labs

Frequently Asked Questions

Getting Started

The John Snow Labs NLP package includes:

  • Spark NLP library
  • Spark NLP for Healthcare library
  • Spark OCR library
  • Access to all pre-trained models and pipelines published on the NLP Models Hub
  • Access to 20+ Jupyter notebooks for the most common NLP tasks
  • Priority support
  • All software and model updates released during the subscription period

Use the above form to start a 30-day free trial. Once you’ve submitted the form and validated your email, we will automatically install the software, license keys, and demo notebooks in your Databricks workspace. You’ll get an email when it’s ready and can then run the notebooks on Databricks.

When the free trial ends, the software will stop processing documents. If you choose to buy a license, we will provide you with new credentials that reactivate it. Otherwise, you must uninstall the software. In either case, data you have already processed is yours to keep.

During the 30-day free trial, we will email you a link to activate a paid license and provide a payment method. If you choose to do so, we will generate a commercial subscription and update your Databricks cluster with an activated commercial license key.

No. Once you have a valid subscription, you can use John Snow Labs NLP on any number of clusters, jobs, or documents, and share it with as many users as you want within your Databricks account.

The Spark NLP library is free, forever, unlimited, for personal and commercial use. Spark NLP is released under an Apache 2.0 open-source license – including the pre-trained models and documentation.

No. One 30-day free trial is allowed for each account, which can be shared between all its users.

Running John Snow Labs NLP on Databricks

Python and Scala are supported on Databricks.

We currently support AWS and Azure. Support for Databricks on GCP is coming soon.

The recommended starter configuration for AWS is r4.2xlarge, autoscaling between a minimum of 1 worker and a maximum of 4 workers.
The recommended starter configuration for Azure is Standard_DS13_v2, autoscaling between a minimum of 1 worker and a maximum of 4 workers.
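
As a rough illustration, the starter configurations above could be written as a cluster specification like the one below. This is a sketch only: the field names follow the Databricks Clusters API (`cluster_name`, `spark_version`, `node_type_id`, `autoscale`), while the helper `starter_cluster_spec`, the default cluster name, and the runtime version placeholder are assumptions made for this example and are not part of the John Snow Labs installer.

```python
import json

# Node types come from the recommendations above; everything else is illustrative.
RECOMMENDED_NODE_TYPES = {"aws": "r4.2xlarge", "azure": "Standard_DS13_v2"}


def starter_cluster_spec(cloud, spark_version, cluster_name="jsl-nlp-starter"):
    """Build a cluster spec dict for the recommended starter configuration.

    `cluster_name` is a hypothetical default; `spark_version` must be a
    runtime version string valid in your own workspace.
    """
    if cloud not in RECOMMENDED_NODE_TYPES:
        raise ValueError(f"unsupported cloud: {cloud!r} (expected 'aws' or 'azure')")
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        "node_type_id": RECOMMENDED_NODE_TYPES[cloud],
        # Autoscaling between 1 and 4 workers, as recommended above.
        "autoscale": {"min_workers": 1, "max_workers": 4},
    }


# The runtime version below is a placeholder, not a real version string.
print(json.dumps(starter_cluster_spec("azure", "<your-runtime-version>"), indent=2))
```

The resulting dictionary can be adapted to whichever tool you use to create clusters. Only the node type and the autoscaling bounds are taken from the recommendations here.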

Yes. An installation script for on-demand job clusters is provided.

Yes. If you install the software on a GPU runtime, the GPU-optimized build of John Snow Labs NLP is installed.


The John Snow Labs NLP package is priced at $4.95/DBU. You will be billed monthly based on the total number of DBUs consumed on all Databricks clusters on which John Snow Labs software was installed.
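
The arithmetic behind a monthly bill can be sketched as follows. The $4.95/DBU rate is the price stated above. The function name `estimate_monthly_charge` and the sample DBU figures are hypothetical, and the actual invoice is computed by John Snow Labs, not by this snippet.

```python
from decimal import Decimal

PRICE_PER_DBU = Decimal("4.95")  # USD per DBU, as stated above


def estimate_monthly_charge(dbus_per_cluster):
    """Estimate the monthly license charge in USD.

    `dbus_per_cluster` lists the DBUs consumed in the month by each cluster
    that has the John Snow Labs software installed.
    """
    total_dbus = sum((Decimal(str(d)) for d in dbus_per_cluster), Decimal("0"))
    return (total_dbus * PRICE_PER_DBU).quantize(Decimal("0.01"))


# Hypothetical month: two licensed clusters consuming 120 and 80 DBUs.
print(estimate_monthly_charge([120, 80]))  # 200 DBUs x $4.95 = 990.00
```

This covers the John Snow Labs license only. Databricks platform charges are billed separately.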

Yes. John Snow Labs will bill you directly for the licensed software. Databricks will bill you separately for the use of its platform. You may use different payment methods for the two payments.

No. The John Snow Labs subscription includes access to the NLP libraries only. Databricks subscription costs are invoiced and paid separately.

No. You only pay for what you use. Spin up your cluster, use it according to your needs and pause it when you are done. You will be charged according to the consumed DBUs.

Since we have no visibility into your Databricks account other than knowing in which clusters the John Snow Labs software is installed, you will be billed for every DBU on those clusters. Therefore, consider splitting your NLP-specific workloads into dedicated clusters, so that most compute work within those clusters takes advantage of the John Snow Labs NLP software. Also consider pausing unused clusters and using on-demand and job clusters when possible.
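
To see why dedicated clusters help, the sketch below compares one shared cluster against a split setup. Because every DBU on a cluster with the software installed is billed, only the licensed clusters count. The workload sizes and the helper `licensed_charge` are made up for illustration. The $4.95/DBU rate is the published price.

```python
from decimal import Decimal

PRICE_PER_DBU = Decimal("4.95")  # USD per DBU, the published price


def licensed_charge(clusters):
    """Sum the charge over clusters given as (dbus, has_jsl_installed) pairs.

    Every DBU on a cluster with the software installed is billable,
    whether or not the work actually used the NLP libraries.
    """
    billable = sum(
        (Decimal(str(dbus)) for dbus, installed in clusters if installed),
        Decimal("0"),
    )
    return billable * PRICE_PER_DBU


# Hypothetical month: 100 DBUs of NLP work and 300 DBUs of unrelated ETL.
shared = [(400, True)]               # everything on one licensed cluster
split = [(100, True), (300, False)]  # NLP work on its own dedicated cluster

print(licensed_charge(shared))  # 400 billable DBUs
print(licensed_charge(split))   # 100 billable DBUs
```

In this made-up scenario, moving the unrelated ETL work off the licensed cluster cuts the billable DBUs from 400 to 100.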

We will debit your payment method monthly, based on the total number of DBUs consumed during the previous month on clusters where the John Snow Labs software was installed.

Online payments via all major credit cards.

Payments are handled via Stripe, a PCI Service Provider Level 1, which is the highest grade of payment processing security. You can rest assured that your payment information is safe and secure.

Yes! Please email us to describe your situation and needs.


No. You install and run the software on your Databricks infrastructure. The software does not “call home” and no data or results are sent to John Snow Labs.

You do. We will never even see them.

This is not a SaaS solution – instead, you run the software on your infrastructure and are fully responsible for protecting it. The libraries do not send any data to John Snow Labs.



Yes. John Snow Labs NLP is designed to enable you to train & tune your own models for most tasks.

Yes. A custom deployment script is provided with all John Snow Labs NLP subscriptions (trials included). You can attach it to any new cluster to install and configure the software there without manual steps.

Yes. This Python notebook shows how it’s done in its “NERDL section”.

Yes. An example is presented in this blog post.

Absolutely! An example is provided in this Python notebook.

The full list is available here. Expect the list to keep growing over time.


Email, call us at +1-302-786-5227, or start a chat on

Same-business-day 9x5 support is included with all subscriptions. We can also provide 24x7 support for production systems – please email us if you require it.

Yes. When subscribing to John Snow Labs NLP on Databricks, you get 20+ ready-to-use Python notebooks that will help you speed up your project and solve the most common NLP tasks.