De-Identification Blog

This talk examines the crucial need for de-identifying protected health information (PHI) in unstructured patient-level data to harness its potential while ensuring compliance with legal and privacy requirements. With an abundance of sensitive data managed by healthcare providers and industry stakeholders, de-identification facilitates the creation of innovative healthcare and pharma solutions, benefitting various parties involved. The presentation addresses the reasoning behind de-identification and the stringent regulations enforced by the HIPAA and GDPR frameworks, emphasizing the balance between privacy and data utility.

The talk also investigates the performance of manual and automated de-identification methods, assessing their accuracy and cost-effectiveness. While manual de-identification faces challenges in accuracy, consistency, and high costs, particularly for extensive datasets, automated de-identification supported by natural language processing (NLP) presents a practical alternative.

In this context, the presentation outlines the capabilities of the Healthcare NLP library by John Snow Labs, which has showcased cutting-edge performance on standardized benchmarks. Built upon the Spark big data framework, the library offers tailored de-identification solutions, capable of processing millions of records on large Spark or Databricks clusters. John Snow Labs has consistently improved its solution, attaining an F1 score of 98.2% on the English n2b2 standard de-identification benchmark in 2022 and analogous results in other European languages. The talk highlights the importance of this advancement, which signifies a 70% reduction in the error rate compared to human benchmarks.
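To put the last figure in perspective, here is a small arithmetic sketch. It assumes that "error rate" means 1 minus the F1 score, which the abstract does not state, and the function names are illustrative only. Under that assumption, an F1 of 98.2% is an error rate of 1.8%, and a 70% reduction would imply a human baseline error rate of roughly 6%.

```python
def error_rate(f1: float) -> float:
    """Assumption: treat the error rate as the complement of the F1 score."""
    return 1.0 - f1


def implied_baseline_error(new_error: float, relative_reduction: float) -> float:
    """Baseline error that, cut by `relative_reduction`, gives `new_error`."""
    if not 0.0 <= relative_reduction < 1.0:
        raise ValueError("relative_reduction must be in [0, 1)")
    return new_error / (1.0 - relative_reduction)


# Figures quoted in the abstract: F1 of 98.2% and a 70% error reduction.
model_error = error_rate(0.982)
baseline_error = implied_baseline_error(model_error, 0.70)

print(f"model error rate: {model_error:.1%}")
print(f"implied baseline error rate: {baseline_error:.1%}")
```

If the talk defines the error rate differently (for example, per-entity recall misses only), the implied baseline would change accordingly.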

Blog

Sentiment analysis, also known as opinion mining, is the process of computationally identifying and categorizing the subjective information contained in natural language text. Spark NLP’s deep learning models have achieved...

The impact of Natural Language Processing in everyday life is hard to ignore as it is the main driver of emerging technologies like Robotics, Big Data, Internet of Things, etc....

Welcome to Part II of the blog series on extracting entities from text reviews using NLP Lab. In the first part of the blog, we discussed the challenges faced by...

John Snow Labs Finance NLP 1.14 comes with a lot of new capabilities added to the 155+ models and...

Legal NLP 1.14 comes with a lot of new capabilities added to the 926+ models and 125+ Language Models already...