Spark NLP Training

Day 1: April 12, 2022 – 9:00 to 13:00 PST

Day 2: April 13, 2022 – 9:00 to 13:00 PST


Training Location: Online

Price: $495

Spark NLP for Data Scientists
Training & Certification

Natural Language Processing (NLP) is a key component in many data science systems that must understand or reason about text. Common use cases include knowledge extraction, question answering, entity recognition, spell correction, sentiment analysis, and document classification.

This two-day workshop will walk you through state-of-the-art natural language processing (NLP) using John Snow Labs’ open-source Spark NLP library. This is a hands-on workshop that will enable you to write, edit, and run live Python notebooks that cover the majority of the open-source library’s functionality.

The workshop is organized into two three-hour sessions, each followed by 30 minutes of self-guided coding on Python notebooks relevant to that session. This is a live online workshop, and the instructors will be available during the self-guided sessions to answer questions.

The instructor is a seasoned technology executive and NLP expert who oversaw the construction & operations of dozens of AI & NLP systems, and will be available for questions during and after the workshop.

Day 1: April 14, 2022 – 9:00 to 13:00 PST

Day 2: April 15, 2022 – 9:00 to 13:00 PST


Training Location: Online

Price: $495

Spark NLP for Healthcare Data Scientists
Training & Certification

Many critical facts required by healthcare AI applications – like patient risk prediction, cohort selection, automated clinical coding, and clinical decision support – are locked in unstructured free-text data. Recent advances in deep learning have raised the bar on achievable accuracy for tasks like named entity recognition, assertion status detection, entity resolution, deidentification, and others.

Spark NLP for Healthcare provides production-grade, scalable, and trainable implementations of novel healthcare-specific natural language processing (NLP) algorithms and models. The product is licensed by John Snow Labs, the creator of Spark NLP, and provides data scientists with a library and pre-trained models for the most common medical NLP tasks. This is a hands-on workshop that will enable you to write, edit, and run Python notebooks that use the product’s functionality.

Dates to be announced


Training Location: Online

Price: $495

Delivering Successful Data Annotation Projects
Training & Certification

Data Annotation is an important part of Natural Language Processing (NLP) projects. To train a successful NLP model, it is necessary to extract data in an accurate and consistent way, combining different features such as Named-Entity Recognition (NER), Assertion Status Detection, Relation Extraction, and Text Classification.

During this training, you will develop key skills to carry out a complete annotation project using John Snow Labs’ high-productivity annotation tool: The Annotation Lab. You will also learn and practice how to develop effective Annotation Guidelines, best practices for leading a team of annotators to ensure accurate results, and how to track your project’s progress and the quality of your annotations.

The instructors have led multiple large data annotation projects and will be available during the assignments to answer questions. The course ends with a certification exam – completing the hands-on exercises and passing the exam grants a certificate as a Certified Data Annotation Expert.

Dates to be announced


Training Location: Online

Price: $495

Natural Language Processing for Business Leaders

Like any other technology, Natural Language Processing (NLP) requires business, product, and design leaders to understand what it can and cannot do – so that they can identify the opportunities it holds for improving their customers’ experience and bottom line with a solid ROI.

This five-hour live online workshop presents key concepts in NLP such as named entity recognition, document classification, transformers, pipelines, training, and annotation – but through the lens of real-world projects that went from concept to production system. The content is organized as a series of case studies, each starting with a business use case, diving into the solution architecture, explaining how NLP played a part in the solution, and ending with best practices and lessons learned.

The instructor is a seasoned technology executive and NLP expert who oversaw the construction & operations of dozens of AI & NLP systems, and will be available for questions during and after the workshop.

Hear from students

Spark NLP

I really appreciate their team’s willingness to help and am amazed by what can be done using Spark NLP!

Angelina Leigh, Hitachi Data Systems
Spark NLP

John Snow Labs Spark NLP training was well organized and easy to follow, I enjoyed that the sessions were focused on coding but also included a more general overview of the process. In addition to the webinar training, many resources were made available to participants which was very helpful in taking the certification exam.

Sarah Gans, Whiteswan
Spark NLP

The Spark NLP training sessions covered an astonishing amount of material and have been a valuable resource for me to come back to whenever I am unsure of an implementation. John Snow Labs clearly put a great deal of effort into creating a broad ranging curriculum that builds upon previous lessons, while still emphasizing how simple it is to create a complex pipeline with Spark NLP.


The instructor was attentive to questions and feedback during and after the session, and the certification exam that followed challenged me to think and dig deeper than just the surface of the course material. I highly recommend this course to anyone who is new to Spark NLP or to those who want to bolster their confidence in working with it.

Ethan Hill, McKesson

Training & Certification FAQ


We offer two training courses. Both are designed to teach hands-on data scientists to use Spark NLP:

  • “Spark NLP for Data Scientists” focuses on the open-source library
  • “Spark NLP for Healthcare Data Scientists” focuses on Spark NLP for Healthcare

Training courses are done online, with a live instructor.

Each training course is two days long. Each day includes four hours of live lectures and code walkthroughs.

Training courses are taught by a senior data scientist who is an active committer to Spark NLP and applies it in real-world projects on a day-to-day basis.

Of course. Live Q&A is encouraged and the instructor is also available for questions afterward.

We assume that you are a Python developer, know how to use its data science libraries, and are familiar with the basics of machine learning. Experience with Apache Spark is helpful but not required.

Yes. We can arrange for courses to be held in person at your offices and customized to your specific use cases, programming language, or datasets.


We currently offer two certifications, each matching the training course of the same name:

  • “Certified Spark NLP Data Scientist” focuses on the open-source Spark NLP library
  • “Certified Spark NLP for Healthcare Data Scientists” focuses on the healthcare library

Register for the next certification exam – there's one every quarter – and pass it!

Each training course is designed to help you prepare for one certification. You can register for a training & certification combo, or choose to register for just the exam.

You have 9 days to start the exam from the day it opens (right at the end of a training course). Once you start the exam, you have 90 minutes to finish it.

An official digital certificate from John Snow Labs that includes your name, the type of certification, and the date on which you earned it.

Forever. Remember though that employers who see a 2-year-old certification will likely ask you how you’ve been keeping your skills up to date.

At this time, we only offer training & certification for hands-on data scientists.


20 multiple-choice questions about Spark NLP features, code, models, and best practices.

It’s an online exam.


From the moment you start the exam, you have 60 minutes to complete it.


It’s a multiple-choice exam so it’s graded automatically by counting the correct answers.

After you’ve completed the exam, expect to get an email from us within 1-2 business days with the result – and your certificate if you passed.

You’re welcome to try again, at least 3 months later.

A desktop or laptop computer with a good Internet connection and a modern browser.

Not currently.

Download and run Spark NLP, run the Python notebooks relevant to the certification you’re taking, and read the documentation. The training courses are intended to prepare data scientists for the certification exam as well.


Registration is done through Eventbrite which accepts PayPal and all major credit cards.

Of course. You can download a receipt at the end of the checkout process.

Yes, but your company will have to pay it using the online form before the training starts. We do not currently support alternative payment methods or terms.

Yes! Please email us to describe your situation and needs.