
Natural Language Processing with PySpark and Spark-NLP

Today we’re diving deeper into the US Consumer Financial Protection Bureau’s Financial Services Consumer Complaint database to look at the text of the complaints filed against companies. The question: which words in those complaints are distinctly Equifax-y? We’ll cover text cleaning, tokenization, and lemmatization with Spark-NLP, counting with PySpark, and tf-idf (term frequency-inverse document frequency) analysis.
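To make the tf-idf idea concrete, here is a minimal sketch in plain Python (not PySpark or Spark-NLP). It treats all of a company's complaints as one "document" and scores each word by how frequent it is for that company and how rare it is across companies. The company names other than Equifax, the toy tokens, and the `tf_idf` helper are all hypothetical, and the unsmoothed `log(N / df)` weighting is just one common variant, not necessarily the one used in the post.

```python
import math
from collections import Counter

# Hypothetical toy data: company -> already cleaned and tokenized complaint words.
complaints = {
    "Equifax": ["credit", "report", "breach", "credit", "freeze"],
    "BankA": ["loan", "payment", "credit", "fee"],
    "BankB": ["mortgage", "payment", "fee", "escrow"],
}

def tf_idf(docs):
    """Return {doc_name: {word: score}} using tf = count / doc length
    and idf = ln(number of docs / number of docs containing the word)."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each word appear?
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        }
    return scores

scores = tf_idf(complaints)
# Words ranked from most to least distinctive for Equifax in this toy data.
top_equifax = sorted(scores["Equifax"].items(), key=lambda kv: kv[1], reverse=True)
```

In this toy data "credit" is the most frequent Equifax word, but because it also appears for another company it ranks below words such as "breach" that appear only for Equifax. That is the sense in which tf-idf surfaces words that are distinctive rather than merely common.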

Spark in Docker in Kubernetes: A Practical Approach for Scalable NLP

Natural Language Processing using the Google Cloud Platform’s Kubernetes Engine