The immense variety of terms, jargon, and acronyms used in medical documents means that named entity recognition of diseases, drugs, procedures, and other clinical entities isn’t enough for most real-world healthcare AI applications. For example, knowing that “renal insufficiency”, “decreased renal function”, and “renal failure” should be mapped to the same code, before using that code as a feature in a patient risk prediction or clinical guidelines recommendation model, is critical to that model’s accuracy. Without this mapping, the training algorithm will see these three terms as three separate features and will severely underestimate the relevance of this condition.
Entity resolution, also known as entity normalization, is therefore a key requirement for a healthcare NLP library. This webinar explains how Spark NLP for Healthcare addresses this need by providing trainable, deep-learning-based clinical entity resolution, as well as pre-trained models for the most commonly used medical terminologies: SNOMED-CT, RxNorm, ICD-10-CM, ICD-10-PCS, and CPT.
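
To make the renal example above concrete, here is a minimal Python sketch of how resolution changes the features a downstream model sees. It does not use Spark NLP for Healthcare or any of its APIs: the `TERM_TO_CODE` lookup table, the `resolve` helper, and the `RENAL_IMPAIRMENT` code are all hypothetical stand-ins for a trained resolver and a real terminology code.

```python
from collections import Counter

# Hypothetical lookup table standing in for a trained entity resolver.
# "RENAL_IMPAIRMENT" is a placeholder, not a real SNOMED-CT or ICD-10-CM code.
TERM_TO_CODE = {
    "renal insufficiency": "RENAL_IMPAIRMENT",
    "decreased renal function": "RENAL_IMPAIRMENT",
    "renal failure": "RENAL_IMPAIRMENT",
}


def resolve(term: str) -> str:
    """Map a mention to its code; fall back to the lowercased text if unknown."""
    key = term.strip().lower()
    return TERM_TO_CODE.get(key, key)


# Mentions as a named entity recognizer might extract them from clinical notes.
mentions = [
    "Renal insufficiency",
    "decreased renal function",
    "renal failure",
    "renal failure",
]

# Without resolution: each surface form becomes its own feature.
raw_features = Counter(m.strip().lower() for m in mentions)

# With resolution: all surface forms collapse into a single feature.
resolved_features = Counter(resolve(m) for m in mentions)

print(len(raw_features), "features without resolution")
print(len(resolved_features), "feature(s) with resolution")
```

In this toy data the evidence for the condition is split across three sparse features before resolution and concentrated in one feature afterwards, which is the effect the paragraph above describes.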