Dandelion Health is a provider of multimodal, longitudinal clinical data for healthcare innovators. This session shows how it built a de-identification process for free-text clinical notes, with John Snow Labs’ Healthcare NLP & LLM at its core. This process maintains patient privacy, minimizes risks for hospital systems, and preserves the bulk of free-text notes to provide researchers with high-fidelity clinical data.
Dandelion Health partners with hospital systems, de-identifies their clinical data inside the hospital's own environment, and then copies this data to the Dandelion data lake so that customers can perform research and validation within the secure Dandelion platform. To ensure HIPAA compliance, de-identification requires an expert determination to confirm that minimal protected health information (PHI) remains after the process.
Tabular data is straightforward to handle: data fields with PHI-related values, such as patient names, birth dates, addresses, or contact details, are removed or masked. Free-text patient notes are much more difficult to de-identify automatically, because PHI words and phrases must be redacted or masked, after which the whole of the patient note must be verified.
Key topics of the presentation include:
1. Breaking down different note types (e.g., radiology reports, pathology reports, echo narratives, progress notes) according to level of risk, and adapting the de-id process accordingly.
2. Assessing note subtypes (e.g., radiology reports for DEXA scans, or fetal radiology reports) in order to carve out exceptions to our standard process (e.g., unique note structure, or age formats such as “27w” that need to be redacted).
3. Determining the importance of recall, precision, and PHI frequency for quasi-identifiers.
4. Applying pre-processing or enhancements such as HIPS (hiding in plain sight) to reduce risk based on the recall, precision, and frequency of PHI in free-text notes.
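To make the hiding-in-plain-sight idea in topic 4 concrete, here is a minimal Python sketch. It is not Dandelion Health's pipeline and does not use the John Snow Labs API; the surrogate pools, the `span_of` and `hips_replace` helpers, and the example note are all hypothetical. It assumes some upstream detector has already produced PHI spans, and it shows the core idea: detected PHI is replaced with realistic surrogates, so any PHI the detector missed is harder to tell apart from the fakes.

```python
import random

# Hypothetical surrogate pools; a real system would need much larger, curated lists.
SURROGATES = {
    "NAME": ["Alex Morgan", "Jamie Lee", "Sam Rivera"],
    "DATE": ["03/14/2019", "11/02/2020", "07/23/2018"],
}


def span_of(text, phrase, label):
    """Helper for this example: build a (start, end, label) span for a phrase."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


def hips_replace(text, spans, rng):
    """Replace each detected PHI span with a random surrogate of the same type.

    spans: (start, end, label) tuples, assumed to be non-overlapping.
    """
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        pieces.append(text[cursor:start])  # keep the text before the span
        pool = SURROGATES.get(label)
        # Fall back to a mask token when no surrogate pool exists for the label.
        pieces.append(rng.choice(pool) if pool else f"<{label}>")
        cursor = end
    pieces.append(text[cursor:])  # keep the text after the last span
    return "".join(pieces)


note = "Patient John Smith seen on 01/02/2021 by Dr. Adams."
# Assume the detector found the patient name and the date but missed "Adams".
detected = [
    span_of(note, "John Smith", "NAME"),
    span_of(note, "01/02/2021", "DATE"),
]
result = hips_replace(note, detected, random.Random(0))
print(result)
```

In the output, the missed name "Adams" is still present, but it now sits next to surrogate names and dates, so a reader cannot easily tell which values are real. How much this lowers risk depends on the detector's recall and on how often each PHI type occurs, which is why the abstract ties HIPS to those statistics.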
This presentation features real-world case studies and examples, demonstrating the power of: validating clinician data-quality hypotheses with language models, using different NLP & LLM strategies for different datasets, and letting QA/QC statistics tell the story, so we know that we’re doing right by the patient.
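As an illustration of the kind of QA/QC statistics mentioned above, the following Python sketch computes precision and recall from manually annotated versus detected PHI spans, and then combines recall with PHI frequency. The function names, the span format, and the residual estimate (frequency times one minus recall) are assumptions made for this example, not the method presented in the session.

```python
def precision_recall(gold, predicted):
    """Compute exact-match precision and recall over PHI spans.

    gold, predicted: iterables of hashable spans, e.g. (start, end, label).
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    # Conventions for empty sets: nothing predicted or nothing to find counts as 1.0.
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    return precision, recall


def expected_residual_per_note(phi_per_note, recall):
    """Rough estimate of PHI instances left in a note after redaction.

    Assumes every instance is missed independently with probability (1 - recall).
    """
    return phi_per_note * (1.0 - recall)


# Hypothetical annotations for one note: (start, end, label).
gold_spans = {(0, 5, "NAME"), (10, 20, "DATE"), (30, 35, "NAME")}
detected_spans = {(0, 5, "NAME"), (10, 20, "DATE"), (40, 45, "AGE")}

precision, recall = precision_recall(gold_spans, detected_spans)
residual = expected_residual_per_note(phi_per_note=3.0, recall=recall)
print(f"precision={precision:.2f} recall={recall:.2f} residual={residual:.2f}")
```

Low precision costs data fidelity because non-PHI text is redacted, while low recall leaves PHI behind. A PHI type that is both frequent and poorly recalled contributes the most residual risk under this simple estimate.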