With COVID-19 spreading rapidly across the globe, there is an urgent need for approaches that can break the chain of transmission, if not cure the disease.
There is already a significant body of research and literature on comparable situations during previous epidemics. It may not be specific to the current situation, but it is still very valuable: it could point us to the right approaches and to improved policy measures that help us fight this battle.
As Natural Language Processing (NLP) researchers, we hope to leverage this research, along with ideas, reports, and any other available data, to find accurate and quickly actionable insights for controlling the spread through medical or non-pharmaceutical interventions. To that end, we present an approach that helps community members find the right literature using NLP, deep learning, and search methods.
As part of this R&D activity, we participated in the renowned TREC (Text REtrieval Conference) organized by NIST (National Institute of Standards and Technology). The specific track is TREC-COVID, a challenge that aims to build a pandemic retrieval test collection, in which participants develop models for information retrieval over the CORD-19 dataset of literature articles.
For this challenge, we developed an NLP and deep learning-enabled engine that accepts dynamic natural language (free-text) queries and retrieves the top N articles from offline repositories of the PubMed Central, WHO, bioRxiv, and medRxiv corpora. The algorithm also highlights the specific sentences or sections where the answer to the input query can be found.
It also computes a confidence score for every hit, indicating how well that hit matches the input query. The beta version of the solution is available and can be accessed using the link provided below.
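To make the query flow concrete (free-text query in, top N articles out, each with a highlighted sentence and a confidence score), here is a minimal sketch. It is not our engine: it swaps the deep learning model for plain TF-IDF with cosine similarity so that it runs with the Python standard library alone. The function names, the `ARTICLES` dictionary, and its placeholder texts are all invented for illustration, and the cosine value merely stands in for a confidence score.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: placeholder texts, not taken from CORD-19.
ARTICLES = {
    "doc-a": "Face masks reduce droplet transmission in crowded indoor settings. "
             "Compliance varied across regions.",
    "doc-b": "School closures were studied during earlier influenza outbreaks. "
             "Their effect on transmission depended on timing.",
    "doc-c": "Protein folding simulations need large compute budgets. "
             "This article does not discuss epidemics.",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def build_idf(token_lists):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}

def vectorize(tokens, idf):
    """Sparse TF-IDF vector; terms unseen in the corpus are dropped."""
    return {t: freq * idf[t] for t, freq in Counter(tokens).items() if t in idf}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, articles, top_n=3):
    """Return up to top_n hits, each with an id, a score and a highlighted sentence."""
    doc_tokens = {doc_id: tokenize(text) for doc_id, text in articles.items()}
    idf = build_idf(list(doc_tokens.values()))
    query_vec = vectorize(tokenize(query), idf)
    hits = []
    for doc_id, text in articles.items():
        score = cosine(query_vec, vectorize(doc_tokens[doc_id], idf))
        if score <= 0.0:
            continue  # no term overlap with the query, so not a hit
        # Highlight the sentence that is most similar to the query.
        best = max(split_sentences(text),
                   key=lambda s: cosine(query_vec, vectorize(tokenize(s), idf)))
        hits.append({"id": doc_id, "score": round(score, 3), "highlight": best})
    hits.sort(key=lambda h: h["score"], reverse=True)
    return hits[:top_n]

if __name__ == "__main__":
    for hit in search("do masks reduce transmission", ARTICLES, top_n=2):
        print(hit["id"], hit["score"], "|", hit["highlight"])
```

A learned model would replace `vectorize` with dense sentence embeddings, but the surrounding steps (score every article, rank, keep the top N, pick the best-matching sentence) would stay the same.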
In further phases of this solution, we are working on adding a Question-Answering (QnA) system that fetches exact answers to questions, instead of longer sentences, paragraphs, or documents in which the user must find the answer. In addition, depending on feedback from business and R&D users, a later phase of the solution will incorporate concepts from reinforcement learning.
We hope to use reinforcement learning techniques to dynamically improve the engine's retrieval process and, with it, our QnA system. As a next step, we plan to integrate this engine with other COVID-19 applications developed at Merck.