Clinical text captured in the course of medical care is a rich potential source of research data. To be usable in research analyses such as comparative effectiveness studies, clinical events and characteristics must first be extracted from that text in a structured form.
Extracting information from clinical text is challenging for NLP algorithms because the text is inherently longitudinal: it is spread over many notes written across a sequence of visits. Accurately extracting the date of an event (for example, a diagnosis, receipt of a drug, or a surgery) can be as important as extracting the event itself.
In this talk, I’ll present a deep learning architecture we’ve developed at Flatiron Health for extracting events and their dates from longitudinal clinical text. The architecture first encodes sentences potentially related to the event of interest from each note, then integrates across the patient chart using a novel time-aware aggregation layer. I’ll present results of using this architecture for extracting advanced diagnosis of non-small cell lung cancer, and discuss applications to other clinical events.
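To make the two-stage idea concrete (encode each note, then aggregate across the chart with awareness of time), here is a minimal toy sketch in plain Python. It is not the Flatiron Health architecture, which this abstract does not specify: the keyword-count `encode_note` stands in for a learned sentence encoder, and the cue terms, `weights`, `time_weight`, the bias of 1.0, and the softmax-attention pooling are all assumptions made purely for illustration.

```python
import math
from datetime import date

# Hypothetical cue terms; a real system would learn its features from data.
KEYWORDS = ("metastatic", "stage iv", "advanced")


def encode_note(text):
    """Toy stand-in for a learned sentence encoder: one count per cue term."""
    lowered = text.lower()
    return [float(lowered.count(k)) for k in KEYWORDS]


def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_chart(notes, weights, time_weight):
    """Pool a non-empty list of (note_date, text) pairs.

    Returns (event_probability, estimated_event_date).
    """
    notes = sorted(notes, key=lambda n: n[0])
    first_day = notes[0][0]
    vectors = [encode_note(text) for _, text in notes]
    relevance = [sum(w * x for w, x in zip(weights, v)) for v in vectors]
    # Assumed time-aware term: later notes are penalised so that, among
    # similarly relevant notes, the earliest mention drives the date.
    elapsed_years = [(d - first_day).days / 365.25 for d, _ in notes]
    scores = [r - time_weight * t for r, t in zip(relevance, elapsed_years)]
    attention = softmax(scores)
    pooled = sum(a * r for a, r in zip(attention, relevance))
    probability = 1.0 / (1.0 + math.exp(-(pooled - 1.0)))  # assumed bias of 1.0
    best = max(range(len(notes)), key=lambda i: attention[i])
    return probability, notes[best][0]


# Invented example chart; not real patient data.
chart = [
    (date(2019, 1, 10), "Patient presents with cough."),
    (date(2019, 3, 2), "Biopsy confirms metastatic disease, stage IV."),
    (date(2019, 6, 15), "Continues treatment for metastatic NSCLC."),
]
probability, event_date = aggregate_chart(chart, [1.0, 1.0, 1.0], time_weight=0.5)
print(probability, event_date)
```

In this sketch the attention weights do double duty: they pool note-level evidence into a single chart-level probability, and the most heavily weighted note supplies the date estimate. A learned model would replace the hand-set weights and the time penalty with trained parameters.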