October 20 – October 21, 2025 – 9:00 to 01:00 PM PST
Applied Generative AI for Data Scientists

Day 1: October 20, 2025 – 9:00 AM to 01:00 PM PST
Day 2: October 21, 2025 – 9:00 AM to 01:00 PM PST
Training Location: Online
Price: $395
This two-day Applied Generative AI training focuses on deploying medical LLMs and processing clinical data for real-world healthcare applications. This hands-on workshop includes live Python notebooks, no-code tools for domain experts, and production-grade system considerations.
Day 1 covers medical LLM foundations—benchmarking and deployment across major cloud platforms (SageMaker, Azure, Databricks, Snowflake)—plus hands-on experience with Generative AI Lab for building HIPAA-compliant annotation projects with human-in-the-loop workflows.
Day 2 focuses on clinical data processing and patient journeys, including healthcare-specific language models for information extraction, automated harmonization of multimodal data into OMOP format, medical terminology standardization across 30+ vocabularies, and automated risk adjustment and HCC coding using Martlet.ai.
This live online workshop is led by current contributors to the John Snow Labs codebase, with certification provided to attendees who pass the final exam.
October 20 – October 21, 2026 – 9:00 to 01:00 PM PST
Medical Language Models for Data Scientists
Hands-on Generative AI for Healthcare

Day 1: October 20, 2026 – 9:00 AM to 01:00 PM PST
Day 2: October 21, 2026 – 9:00 AM to 01:00 PM PST
Training Location: Online
Price: $395
This two-day, hands-on workshop is for practicing data scientists who build on John Snow Labs’ medical language models. You’ll get temporary license keys and run live Python notebooks that put the models to work on real clinical tasks.
Day 1 covers the core pipeline: de-identification and PHI removal across text, FHIR, PDF, and DICOM; clinical information extraction, including named entities, assertion status, and relation extraction; entity resolution to SNOMED CT, RxNorm, LOINC, ICD-10, and other medical terminologies; and medical LLM use for clinical reasoning, summarization, and question answering. You’ll work through the published, peer-reviewed benchmarks behind these models, and learn how to choose between them and how to evaluate them on different tasks and against general-purpose frontier models.
Day 2 goes deep on visual language models: OCR and document understanding, DICOM and image de-identification, and multimodal extraction, with their benchmarks. It then introduces the Generative AI Lab, an annotation and model training tool that powers regulatory-grade human-in-the-loop workflows. Finally, we’ll cover key aspects of taking these solutions to production: deployment options across SageMaker, Azure, Databricks, and Snowflake, inference cost optimization, privacy and security, and scaling inside your own environment, where data never leaves your control.
The workshop is led by current contributors to the John Snow Labs codebase and ends with a certification exam.
October 20 – October 21, 2026 – 9:00 to 01:00 PM PST
Patient Journey Intelligence: Building OMOP-Based AI Agents on Multimodal Clinical Data

This one-day workshop shows architects, data engineers, and technical leaders how to turn messy, multimodal clinical data into a trusted, AI-ready foundation using the Patient Journey Intelligence Platform. The morning covers the platform architecture: how raw clinical notes, scanned PDFs, FHIR resources, DICOM images, lab feeds, and claims become a unified, enriched, and de-identified OMOP CDM dataset. We’ll then cover data governance capabilities including confidence scoring and full provenance on every extracted fact, versioning, audit trails, role-based access, and human-in-the-loop overrides.
The afternoon is hands-on. With access to a Patient Journey Intelligence environment, you’ll ingest synthetic multimodal data, build cohorts with natural-language queries, explore complete patient timelines, and query the OMOP data directly over SQL, REST, and MCP. You’ll also build your own clinical AI agents on the same governed foundation, inheriting compliance and governance by default. Led by John Snow Labs product and engineering leaders, the day ends with a certification exam.
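To give a feel for what querying OMOP data directly over SQL can look like, here is a minimal sketch that uses Python's built-in `sqlite3` module and a toy in-memory database. The table and column names (`person`, `condition_occurrence`, `person_id`, `condition_concept_id`, `condition_start_date`) follow the OMOP CDM. The rows, the concept ID `111`, and the helpers `make_toy_omop` and `build_cohort` are made up for illustration; this is not the Patient Journey Intelligence API.

```python
import sqlite3


def make_toy_omop():
    """Create a tiny in-memory database with two OMOP-style tables (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            year_of_birth INTEGER
        );
        CREATE TABLE condition_occurrence (
            condition_occurrence_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            condition_concept_id INTEGER,
            condition_start_date TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [(1, 1950), (2, 1985), (3, 1962)],
    )
    # Concept IDs 111 and 222 are placeholders, not real vocabulary entries.
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
        [
            (10, 1, 111, "2023-01-15"),
            (11, 2, 222, "2023-03-02"),
            (12, 3, 111, "2024-06-30"),
            (13, 1, 111, "2024-02-01"),
        ],
    )
    return conn


def build_cohort(conn, concept_id, born_before):
    """Return (person_id, first_diagnosis_date) for patients matching both criteria."""
    query = """
        SELECT p.person_id, MIN(c.condition_start_date) AS first_diagnosis
        FROM person AS p
        JOIN condition_occurrence AS c ON c.person_id = p.person_id
        WHERE c.condition_concept_id = ? AND p.year_of_birth < ?
        GROUP BY p.person_id
        ORDER BY p.person_id
    """
    return conn.execute(query, (concept_id, born_before)).fetchall()


conn = make_toy_omop()
cohort = build_cohort(conn, concept_id=111, born_before=1970)
print(cohort)  # [(1, '2023-01-15'), (3, '2024-06-30')]
```

Because the cohort is defined purely in terms of standard OMOP tables, the same query shape should carry over to any OMOP-compliant database, with only the connection and the concept IDs changing.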
Watch On Demand
Building Human-in-the-loop AI Workflows with the Generative AI Lab
Comprehensive hands-on training and certification for those mastering the art of text annotation, crucial for the success of any LLM
This is a comprehensive hands-on training designed for those keen on mastering the art of text annotation, crucial for the success of any Language Model training. Whether you are a beginner aiming to dip your toes or an expert looking to refine your skillset, this course offers insights and skills tailored for you.
- Preparing High-Quality Training Data for Language Models: Gain an in-depth understanding of how to prepare high-quality training data, the critical foundation for building robust Language Models.
- Mastering the Generative AI Lab: Obtain hands-on experience with John Snow Labs’ Generative AI Lab, a no-code high productivity tool designed for managing annotation and model training projects efficiently.
- Creating Effective Annotation Guidelines: Learn how to develop precise and comprehensive annotation guidelines. These guidelines are essential for maintaining consistency and accuracy in annotation tasks across teams.
- Data De-Identification: Explore techniques to safely de-identify sensitive data for use in annotation projects and for further analysis, ensuring data privacy and compliance.
- Tracking and Monitoring Progress: Discover best practices for tracking your annotation project’s progress and evaluating the quality of the annotations to ensure timely completion and high team productivity.
- Leveraging No-Code Pre-Annotations: Accelerate your project by utilizing John Snow Labs’ pre-trained models, rules, or Generative AI prompting to create pre-annotations. This not only speeds up the process but also enhances initial data quality.
- Seamless Custom Model Training: Learn how to train custom models with ease once you’ve gathered sufficient training data. Through an intuitive UI, models can be trained with just one click, yielding smaller, specialized models that are optimized for deployment on standard hardware.
- Model Testing and Evaluation: Test your custom models for accuracy, bias, fairness, performance, representation, and robustness using over 70 out-of-the-box tests. You’ll also learn how to automatically augment your data to boost your model’s overall quality.
Spark NLP for Data Scientists

Price: $19.99
This two-day workshop will walk you through building state-of-the-art Generative AI and Natural Language Processing (NLP) solutions using John Snow Labs’ open-source libraries. This is a hands-on workshop for data scientists that will enable you to write and run live Python notebooks that put the technology to work.
The first day covers the open-source Spark NLP library for information extraction at scale – including reusing, training, and combining AI models for tasks like named entity recognition, text classification, spelling & grammar correction, question answering, knowledge extraction, sentiment analysis and more. The second day focuses on libraries and integrations specifically for preparing data for RAG LLM solutions, including document splitting, cleaning, metadata enrichment, summarization, and embeddings calculation.
The workshop is organized in two four-hour-long sessions, each followed by self-guided coding on Python notebooks relevant to that section. This is a live online workshop whose instructors are current lead contributors to the John Snow Labs open-source codebase. A certification of expertise is provided to attendees who pass a final online exam.
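As a rough idea of what preparing data for a RAG solution involves, the sketch below cleans whitespace, splits a document into overlapping word-based chunks, and attaches simple metadata to each chunk. It is plain Python, not the Spark NLP API; the function name `split_into_chunks`, its parameters, and the metadata fields are assumptions chosen for illustration.

```python
def split_into_chunks(text, max_words=50, overlap=10):
    """Split text into overlapping word windows and attach basic metadata.

    Illustrative only: real pipelines usually split on sentence or section
    boundaries and count tokens rather than whitespace-separated words.
    """
    if max_words <= 0 or overlap < 0 or overlap >= max_words:
        raise ValueError("require max_words > 0 and 0 <= overlap < max_words")

    # str.split() with no arguments also collapses runs of whitespace,
    # which serves as a minimal cleaning step.
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        chunks.append(
            {
                "chunk_id": len(chunks),
                "start_word": start,
                "n_words": len(window),
                "text": " ".join(window),
            }
        )
        # Stop once a window has reached the end of the document.
        if start + max_words >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(12))
for chunk in split_into_chunks(sample, max_words=5, overlap=2):
    print(chunk["chunk_id"], chunk["start_word"], chunk["text"])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some words twice.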
Healthcare NLP for Data Scientists

Price: $19.99
This two-day workshop for practicing data scientists will enable you to use and extend John Snow Labs’ Healthcare NLP & LLM library. You will run live Python notebooks that walk through clinical entity recognition, entity resolution (mapping entities to medical codes), assertion status detection (e.g., negation detection), relation extraction, and de-identification. You will also learn how to train and tune your own models and how to choose which pre-trained models to start from.
The second day is focused on medical language models and will teach you how to make use of the medical text summarization, question answering, and text generation LLMs. It will also cover improving the accuracy of RAG LLM solutions that process medical documents by using healthcare-specific document splitting, enrichment, and embedding algorithms.
The workshop is organized in two four-hour-long sessions, each followed by self-guided coding. This is a live online workshop whose instructors are current lead contributors to the Healthcare NLP & LLM codebase.
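To make the idea of assertion status detection concrete, here is a deliberately simple, rule-based sketch in plain Python: it marks an entity as absent when a negation trigger appears within a few words before it. This is a toy heuristic in the spirit of NegEx-style rules, not how the Healthcare NLP library works (the workshop covers trained models); the trigger list, the `window` parameter, and the `assertion_status` function are assumptions made for illustration.

```python
import re

# Hypothetical, very small trigger list; real systems use far richer rules or trained models.
NEGATION_TRIGGERS = {"no", "not", "denies", "without"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def assertion_status(sentence, entity, window=5):
    """Return 'absent', 'present', or 'not_found' for the first mention of entity."""
    tokens = tokenize(sentence)
    entity_tokens = tokenize(entity)
    if not entity_tokens:
        return "not_found"

    n = len(entity_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == entity_tokens:
            # Look only at the few tokens immediately before the mention.
            preceding = tokens[max(0, i - window):i]
            if any(tok in NEGATION_TRIGGERS for tok in preceding):
                return "absent"
            return "present"
    return "not_found"


print(assertion_status("Patient denies chest pain.", "chest pain"))        # absent
print(assertion_status("Patient reports chest pain and fever.", "fever"))  # present
```

A fixed window like this misfires on sentences such as "No fever, but reports a persistent cough", where the negation should not extend to "cough", which is one reason model-based assertion detection is preferred in practice.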
Dates to be announced
Building Human-in-the-loop AI Workflows with the Generative AI Lab
Training & Certification

Dates to be announced
Training Location: Online
Price: $295
This is a comprehensive hands-on training designed for those keen on mastering the art of text annotation, crucial for the success of any NLP model training. Whether you are a beginner aiming to dip your toes or an expert looking to refine your skillset, this course offers insights and skills tailored for you.
What You Will Learn:
1. Prepare High-Quality Training Data for NLP Models: Understand the intricacies of preparing high-quality training data, the backbone for creating robust NLP models.
2. Mastering The NLP Lab: Acquire hands-on experience in executing a complete annotation project with John Snow Labs’ high-productivity annotation tool: the NLP Lab.
3. Creating Effective Annotation Guidelines: Dive deep into the development of precise annotation guidelines. These guidelines are pivotal in ensuring consistency and accuracy across data annotation tasks and team members.
4. Tracking and Monitoring: Equip yourself with strategies to monitor your project’s progress and gauge the quality of annotations. This ensures that the project remains on course and maintains the highest standard.
5. Leverage No-Code Pre-Annotations: Jumpstart your projects with pre-annotations from John Snow Labs’ pre-trained models, rules, prompts, or even ChatGPT prompting. This integration not only accelerates the process but also enhances the quality right from the outset.
6. Effortless Custom Model Training: When you accumulate sufficient annotated data, leverage the ability to train custom models effortlessly via an intuitive UI, with the click of a button.
Why Choose This Training?
– Experienced Instructors: Our trainers are not just academicians; they have spearheaded multiple large-scale data annotation projects. Their rich experience ensures a practical approach to teaching, grounded in real-world challenges and solutions.
– Interactive Learning: Unlike conventional lectures, our training is hands-on. Instructors will be available during assignments to clarify doubts, offering a continuous learning experience.
– Certification: The journey culminates in a certification exam. Those who engage in the hands-on exercises and clear the exam will be awarded a certificate, recognizing them as Certified Data Annotation Experts.
Step into the world of data annotation with confidence and competence. Enroll now and set yourself on the path to becoming an expert in delivering successful data annotation projects.
Dates to be announced
Applied Generative AI for Business Leaders
Training

Dates to be announced
Training Location: Online
Price: $295
Like any other technology, Natural Language Processing (NLP) requires business, product, and design leaders to understand what it can and cannot do – so that they can understand what opportunities it holds for improving their customers’ experience and bottom line with a solid ROI.
This five-hour live online workshop presents key concepts in NLP such as named entity recognition, document classification, transformers, pipelines, training, and annotation, but through the lens of real-world projects that went from concept to production system. The content is organized as a series of case studies, each starting with a business use case, diving into the solution architecture, explaining how NLP played a part in the solution, and ending with best practices and lessons learned.
The instructor is a seasoned technology executive and NLP expert who oversaw the construction & operations of dozens of AI & NLP systems, and will be available for questions during and after the workshop.
Hear from students

I really appreciate their team’s willingness to help and am amazed by what can be done using Spark NLP!

John Snow Labs Spark NLP training was well organized and easy to follow, I enjoyed that the sessions were focused on coding but also included a more general overview of the process. In addition to the webinar training, many resources were made available to participants which was very helpful in taking the certification exam.

The Spark NLP training sessions covered an astonishing amount of material and have been a valuable resource for me to come back to whenever I am unsure of an implementation. John Snow Labs clearly put a great deal of effort into creating a broad ranging curriculum that builds upon previous lessons, while still emphasizing how simple it is to create a complex pipeline with Spark NLP.
The instructor was attentive to questions and feedback during and after the session, and the certification exam that followed challenged me to think and dig deeper than just the surface of the course material. I highly recommend this course to anyone who is new to Spark NLP or to those who want to bolster their confidence in working with it.
Training & Certification FAQ
We offer seven training courses. All are hands-on workshops and the instructors are current lead contributors to John Snow Labs’ software libraries:
- “Certified Spark NLP Data Scientists” focuses on the open source Spark NLP library
- “Certified Healthcare NLP for Data Scientists” focuses on the Healthcare NLP library
- “Certified Visual NLP for Data Scientists” focuses on the Visual NLP library
- “Certified Finance NLP for Data Scientists” focuses on the Finance NLP library
- “Certified Legal NLP for Data Scientists” focuses on the Legal NLP library
- “Applied Generative AI for Data Scientists” focuses on Generative AI use cases, the Medical Chatbot, and the Generative AI Lab
- “Medical Language Models for Data Scientists” focuses on John Snow Labs’ Healthcare NLP & LLM software
Training courses are done online, with a live instructor.
The training courses are 1-2 days long. Each day includes four hours of live lectures and code walkthroughs.
Training courses are taught by current lead contributors to John Snow Labs’ software libraries who apply them in real-world projects on a day-to-day basis.
Of course. Live Q&A is encouraged and the instructor is also available for questions afterward.
We assume that you are a Python developer, know how to use its data science libraries, and are familiar with the basics of machine learning. Experience with Apache Spark is helpful but not required.
Yes. We can arrange courses to be done in person at your offices, and be customized to your specific use cases, programming language, or datasets.
We currently offer seven certifications, each matching the training course of the same name:
- “Certified Spark NLP Data Scientists” focuses on the open source Spark NLP library
- “Certified Healthcare NLP for Data Scientists” focuses on the Healthcare NLP library
- “Certified Visual NLP for Data Scientists” focuses on the Visual NLP library
- “Certified Finance NLP for Data Scientists” focuses on the Finance NLP library
- “Certified Legal NLP for Data Scientists” focuses on the Legal NLP library
- “Certified Applied Generative AI Data Scientist”
- “Certified Medical Language Models Data Scientist”
Register for the next training course + certification exam - there’s one every quarter - and pass it!
Each training course is designed to help you prepare for one certification. When you register for a training course you’ll automatically be given access to the exam after the training.
The exam will be accessible right after the training and you can complete it up to 14 days after the training.
An official digital certificate from John Snow Labs that includes your name, the type of certification, and the date on which you earned it.
Forever. Remember though that employers who see a 2-year-old certification will likely ask you how you’ve been keeping your skills up to date.
At this time, we only offer training & certification for hands-on data scientists.
20-30 multiple-choice questions about John Snow Labs libraries’ features, code, models, and best practices.
It’s an online exam.
70%.
From the moment you start the exam, you have 90 minutes to complete it.
English.
It’s a multiple-choice exam so it’s graded automatically by counting the correct answers.
You'll see your results right after completing the exam. If you pass it, we will send you the certificate via email.
You'll have two attempts in total to pass the exam.
A desktop or laptop computer with a good Internet connection and a modern browser.
Not currently.
Download and run the software, run the Python notebooks relevant to the certification you’re taking, and read the documentation. The training courses are intended to prepare data scientists for the certification exam as well.
Registration is done through Eventbrite, which accepts PayPal and all major credit cards.
Of course. You can download a receipt at the end of the checkout process.
Yes, but your company will have to pay it using the online form before the training starts. We do not currently support alternative payment methods or terms.
Yes! Please email us to describe your situation and needs.
Please email training@johnsnowlabs.com.
























