Finance NLP Releases bug fixes for de-identification pipelines

The latest version of the library fixes several errors in the de-identification pipelines for financial documents. With these fixes, the library is fully compatible with newer versions of Spark.


De-identification pipelines remove private or personal information from financial documents. This Finance NLP pipeline can mask that information with entity labels or special characters, or obfuscate it (replace it with synthetic data). Use it with the PretrainedPipeline named finpipe_deid.
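To make the three masking strategies concrete, here is a minimal, library-free sketch in plain Python. It does not call Finance NLP or finpipe_deid. The regex patterns, the FAKE_VALUES table, and the deidentify function are hypothetical names made up for illustration, and the real pipeline relies on trained models rather than regular expressions.

```python
import re

# Hypothetical patterns for illustration only; the actual pipeline
# detects entities with trained NER models, not regular expressions.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

# Hypothetical synthetic values used by the "obfuscate" mode.
FAKE_VALUES = {
    "EMAIL": "jane.doe@example.com",
    "PHONE": "555-010-0000",
}


def deidentify(text, mode="label"):
    """Mask detected entities using one of three strategies.

    mode="label":     replace the entity with its label, e.g. <EMAIL>
    mode="chars":     replace every character with an asterisk
    mode="obfuscate": replace the entity with synthetic data
    """
    if mode not in ("label", "chars", "obfuscate"):
        raise ValueError(f"unknown mode: {mode}")

    def make_replacer(label):
        def replacer(match):
            if mode == "label":
                return f"<{label}>"
            if mode == "chars":
                return "*" * len(match.group(0))
            return FAKE_VALUES[label]
        return replacer

    # Apply each entity pattern in turn.
    for label, pattern in PATTERNS.items():
        text = pattern.sub(make_replacer(label), text)
    return text


sample = "Contact john.smith@acme.com or call 212-555-0199."
print(deidentify(sample, "label"))      # Contact <EMAIL> or call <PHONE>.
print(deidentify(sample, "obfuscate"))  # Contact jane.doe@example.com or call 555-010-0000.
```

Masking with labels keeps the entity type visible for downstream analysis, masking with characters hides both the type and the content, and obfuscation keeps the text readable while replacing the real values.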


How to run

Finance NLP is easy to run on both clusters and driver-only environments using the johnsnowlabs library:

!pip install johnsnowlabs
from johnsnowlabs import nlp


Then we can import the Finance NLP module and start working with Spark.

from johnsnowlabs import finance
# Start Spark Session
spark = nlp.start()

For alternative installation methods for specific environments, please check the docs.

Finance NLP Releases semantic search example notebook

The latest version of Finance NLP adds an example notebook showing how to perform semantic search on vector stores. Financial Semantic Search...