An easy-to-use, scalable NLP framework that can leverage Spark. Introduction of BERT-based Relation Extraction models. State-of-the-art performance on Named Entity Recognition and Relation Extraction. Reported SOTA performance on multiple public benchmark datasets. Application of these models to real-world use cases.
We present a text mining framework built on top of the Spark NLP library, comprising Named Entity Recognition (NER) and Relation Extraction (RE) models, which expands on previous work in three main ways. First, we release new RE model architectures that obtain state-of-the-art F1 scores on 5 out of 7 benchmark datasets. Second, we introduce a modular approach for training and stacking multiple models in a single NLP pipeline within a production-grade library, with minimal code. Third, we apply these models in practical applications including knowledge graph generation, prescription parsing, and robust ontology mapping.