
John Snow Labs’ Healthcare Data Library with 2,400+ Curated Datasets Is Generally Available on the Databricks Marketplace

John Snow Labs Debuts Comprehensive Healthcare Data Library on Databricks Marketplace: Over 2,400 Expertly Curated, Clean, and Enriched Datasets Now Accessible, Amplifying Data Science Capabilities in Healthcare and Life Sciences.

John Snow Labs, the Healthcare AI and NLP company and developer of the Spark NLP library, is pleased to announce the general availability of its comprehensive Healthcare Data Library on the Databricks Marketplace.

With 2,400+ expertly curated datasets that span healthcare, medical terminology, life sciences, and societal sectors, this massive data library promises to revolutionize the capabilities of data scientists across the industry.

Databricks users now have immediate access to more than 1,300 datasets in healthcare, 600 in medical terminology, 450 in life sciences, and 260 in societal data.

All datasets are carefully selected, cleaned, enriched, and unified into one coherent type system by John Snow Labs’ team of subject matter experts. Each dataset undergoes three rigorous quality review stages to ensure the highest data quality and consistency.

“Databricks is excited to provide such broad and high-quality medical datasets within the Databricks Marketplace,” says Mike Sanky, Global Industry Lead at Databricks. “Production-grade healthcare systems require clean, robust, and up-to-date data, and we’re happy to make John Snow Labs’ expert-curated datasets easily available right within the Databricks platform.”

John Snow Labs is also introducing 70+ Databricks starter notebooks to help teams integrate the Healthcare Data Library into analytics and AI projects. The data is regularly updated and is available in a variety of formats with enriched metadata. It is offered under a dual licensing model: freely available for research purposes, with a permissive commercial license also available.

David Talby, CTO at John Snow Labs, says, “Making John Snow Labs’ entire Healthcare Data Library available directly from the Databricks Marketplace makes it easier for data & AI teams to build end-to-end solutions that are reliable, scalable, and secure. We’re thrilled to see what customers will build with these newly integrated capabilities.”

With this integration, John Snow Labs and Databricks are giving data scientists direct access to a comprehensive, high-quality set of data resources that they can use to drive innovation in the healthcare and life sciences sectors.

For more information about the John Snow Labs Healthcare Data Library in the Databricks Marketplace, please visit
