
Finance NLP Releases bug fixes on de-identification pipelines

The latest version of the library fixes several errors in the de-identification pipelines for financial documents. With these fixes, the library is fully compatible with newer versions of Spark.


De-identification pipelines remove private or personal information from financial documents. This Finance NLP pipeline can mask that information with entity labels or special characters, or obfuscate it (replace it with synthetic data). Use it through the PretrainedPipeline named finpipe_deid.


The pipeline output covers three masking styles: masking with entity labels, masking with special chars, and masking with fixed-length chars.
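To make the difference between the three masking styles concrete, here is a minimal plain-Python sketch. It does not call Finance NLP or reproduce the exact output format of finpipe_deid; the function name mask_spans, the mode names, and the (start, end, label) span format are all hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical input format: (start, end, label) character spans, end exclusive.
Span = Tuple[int, int, str]


def mask_spans(text: str, spans: List[Span], mode: str = "label",
               fixed_length: int = 4) -> str:
    """Illustrative only: replace each span in `text` according to `mode`.

    mode="label"        -> entity label, e.g. <PERSON>
    mode="same_length"  -> one '*' per original character
    mode="fixed_length" -> always `fixed_length` '*' characters
    """
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        pieces.append(text[cursor:start])  # keep the text before the entity
        if mode == "label":
            pieces.append(f"<{label}>")
        elif mode == "same_length":
            pieces.append("*" * (end - start))
        elif mode == "fixed_length":
            pieces.append("*" * fixed_length)
        else:
            raise ValueError(f"unknown mode: {mode}")
        cursor = end
    pieces.append(text[cursor:])  # keep the text after the last entity
    return "".join(pieces)


text = "John Smith joined Acme Corp in 2021."
spans = [(0, 10, "PERSON"), (18, 27, "ORG")]

for mode in ("label", "same_length", "fixed_length"):
    print(mode, "->", mask_spans(text, spans, mode))
```

Masking with same-length characters reveals how long each original value was, while fixed-length masking hides that as well; entity labels keep the text readable for downstream analysis.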


Fancy trying?

We offer 30-day free licenses with technical support from our financial team of technical experts and SMEs. The trial includes complete access to more than 150 models, covering Classification, NER, Relation Extraction, Similarity Search, Summarization, Sentiment Analysis, Question Answering, and more, as well as 50+ financial language models.

Just go to and follow the instructions!

Don’t forget to check our notebooks and demos.

How to run

Finance NLP is easy to run on both clusters and driver-only environments using the johnsnowlabs library:

!pip install johnsnowlabs
from johnsnowlabs import nlp


Then we can import the Finance NLP module and start working with Spark.

from johnsnowlabs import finance
# Start Spark Session
spark = nlp.start()

For alternative installation methods in specific environments, please check the docs.
