Answering questions accurately from financial documents, which can run to a hundred pages or more, is a challenge even for human domain experts. Traditional rule-based or expression-matching techniques work for simple fields in templated documents, but it is harder to infer facts from implied statements, from the absence of certain statements, or from a combination of other facts.
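To make the contrast concrete, here is a minimal sketch of expression matching, assuming a hypothetical templated field of the form "Total Revenue: $...". The pattern, the function name `extract_revenue`, and both sample sentences are made up for illustration and do not come from the project itself.

```python
import re

# Hypothetical pattern for a templated field such as "Total Revenue: $1,234,567".
REVENUE_PATTERN = re.compile(r"Total Revenue:\s*\$([\d,]+)")


def extract_revenue(text):
    """Return the revenue as an int if the templated field is present, else None."""
    match = REVENUE_PATTERN.search(text)
    if match is None:
        return None
    # Strip thousands separators before converting to an integer.
    return int(match.group(1).replace(",", ""))


# Illustrative inputs (invented for this sketch).
templated = "Total Revenue: $1,234,567"
implied = "Sales grew by a third over last year's $900,000."

print(extract_revenue(templated))  # 1234567
print(extract_revenue(implied))    # None
```

The second sentence implies a revenue figure, but recovering it means combining two facts (the prior value and the growth rate), which a fixed pattern cannot do.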
Answering such questions with very high accuracy requires state-of-the-art deep learning techniques for NLP. Spark NLP was used to augment the UiPath smart data extraction platform so that it can automatically infer fuzzy, implied, and complex facts from long financial documents.
This case study covers the technical challenges, the architecture of the full solution, and lessons learned that you can directly apply to your next data extraction project.
UiPath is a global software company that develops a platform for robotic process automation. Following its acquisition of ProcessGold and StepShot in 2019, UiPath became the first vendor of scale to bring together process mining and robotic process automation.