
In this webinar, we will delve into the intersection of data science innovation and regulatory compliance for de-identifying patient data across diverse sources and time points. This session focuses on three core capabilities of John Snow Labs’ software:

  • Consistent Obfuscation: Replace PHI fields—such as patient names, hospital names, and dates—with realistic yet fictitious counterparts in a gender- and context-aware manner. For instance, if “Jane Sunshine” is obfuscated to “Anne Boleyn,” any subsequent “Jane” will deterministically map to “Anne,” preserving referential consistency throughout the dataset.
  • Deterministic Tokenization: Transform patient identifiers (e.g., MRN or a composite of first name, last name, and birthdate) into cryptographic hashes. This ensures that subsequent records about the same individual—whether weeks or years later—are tokenized to the same value, enabling reliable linkage without exposing identifiable information.
  • Multimodal Linking: Seamlessly connect de-identified data spanning EHRs, claims, radiology reports (PDF), DICOM images, and free-text clinical notes. By applying consistent obfuscation and tokenization across formats, researchers can reconstruct longitudinal patient journeys while maintaining full compliance with privacy regulations.
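As a rough illustration of the first two capabilities, the sketch below shows one way deterministic tokenization and consistent name replacement could work. It is not John Snow Labs' implementation: the key, the surrogate name pools, and the function names are hypothetical, and a keyed hash (HMAC-SHA256) is assumed here in place of whatever hashing scheme the product actually uses.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical secret; in practice this would come from a managed key store.
SECRET_KEY = b"replace-with-a-managed-secret"

# Hypothetical surrogate pools, split by gender so replacements stay plausible.
FEMALE_SURROGATES = ["Anne", "Mary", "Helen", "Ruth"]
MALE_SURROGATES = ["John", "Peter", "Samuel", "Victor"]


def normalize(value: str) -> str:
    """Canonicalize a field so trivial formatting differences do not change the result."""
    return unicodedata.normalize("NFKC", value).strip().casefold()


def tokenize_patient(first_name: str, last_name: str, birthdate: str,
                     key: bytes = SECRET_KEY) -> str:
    """Map a composite identifier to a stable 64-character hex token."""
    composite = "|".join(normalize(p) for p in (first_name, last_name, birthdate))
    return hmac.new(key, composite.encode("utf-8"), hashlib.sha256).hexdigest()


def obfuscate_first_name(name: str, gender: str, key: bytes = SECRET_KEY) -> str:
    """Pick a surrogate first name; the same input always yields the same surrogate."""
    pool = FEMALE_SURROGATES if gender == "F" else MALE_SURROGATES
    digest = hmac.new(key, normalize(name).encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]
```

Because both functions depend only on the normalized input and the key, a record processed years later, or in a different document format, receives the same token and the same surrogate name, which is what makes linkage across sources possible. With pools this small, different real names can share a surrogate; a realistic pool would need to be much larger.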

John Snow Labs’ de-identification models have been evaluated in peer-reviewed benchmarks for PHI detection, where they surpassed Azure Health Data Services, AWS Comprehend Medical, OpenAI’s GPT-4o and GPT-4.5, and Claude Sonnet 3.7. In those benchmarks, the solution exceeds the threshold for regulatory-grade accuracy and outperforms both human experts and general-purpose LLMs, supporting compliance with HIPAA and GDPR as well as research validity. Join us to see how John Snow Labs delivers cost-effective, scalable de-identification so you can accelerate your data science initiatives under the strictest privacy frameworks.

Youssef Mellah
Senior Data Scientist & Machine Learning Engineer at John Snow Labs

Youssef Mellah, Ph.D., is a Senior Data Scientist and Machine Learning Engineer at John Snow Labs with more than 8 years of experience in artificial intelligence, natural language processing, and deep learning. He specializes in building, training, and deploying regulatory-grade ML/DL models and large language models (LLMs) for healthcare and life sciences, including the de-identification and tokenization of multimodal medical data. Youssef has a strong track record of designing scalable, privacy-preserving AI solutions that enable compliant research and analytics across structured and unstructured data. He is passionate about advancing NLP technology, leading multidisciplinary teams, and turning cutting-edge research into practical, real-world applications.


