Distributed Topic Modelling using Spark NLP and Spark MLlib (LDA)

Topic modelling is one of the most common tasks in natural language processing. Extracting topic distributions from millions of documents can be useful in many ways, e.g. identifying the reasons for complaints about a particular product or across all products, or, in a more classic example, identifying topics in news articles. We won’t delve into what topic modelling is or how it works. There are many good articles about it on the internet, but I find this article from Analytics Vidhya comprehensive. So if you are not familiar with topic modelling or need to refresh your memory, go ahead and check it out.
