
1 Line of Code to Train A Multilingual Text Classifier for 100+ Languages with NLU 1.1.4

We are very excited to announce that NLU 1.1.4 has been released. It comes with a set of tutorials showcasing how you can train a multilingual text classifier on a single starting language; the resulting model can then correctly classify labels for text in over 100 languages.

This is possible by leveraging the Language-Agnostic BERT Sentence Embeddings (LaBSE). In addition, tutorials for pure English classifiers for stock market sentiment, sarcasm, and negation detection have been added.

You can train a classifier with the default USE (Universal Sentence Encoder) embeddings in just one line.
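A minimal sketch of that one-liner. The column names `text` and `y` follow NLU's training convention as I understand it (check the NLU docs for your version); the `nlu` calls themselves are commented out because they require a Spark session and model downloads.

```python
import pandas as pd

# Toy labeled dataset for a binary classifier.
# NLU's trainable pipelines expect the input text in a column named 'text'
# and the labels in a column named 'y' (assumed convention, see lead-in).
train_df = pd.DataFrame({
    "text": ["I love the stock market", "The market crashed hard today"],
    "y":    ["positive", "negative"],
})

# The actual one-liner: 'train.classifier' pulls in the default USE sentence
# embeddings plus a trainable classifier and fits both on the DataFrame.
# import nlu
# trained = nlu.load("train.classifier").fit(train_df)
# trained.predict("Stocks are soaring!")
```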



You can use any other sentence embedding by specifying its reference before the classifier reference.

If you train with LaBSE, your model will understand 100+ languages, even if you train on only one!
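A hedged sketch of swapping the embedding: prefixing the classifier with an embedding reference (here `embed_sentence.labse`) is the pattern the release describes. The training data is English only; the German prediction at the end illustrates the multilingual claim. The `nlu` calls are commented out since they need Spark and model downloads.

```python
import pandas as pd

# English-only training data; with LaBSE embeddings the fitted model should
# also classify the same labels for text in the other languages LaBSE covers.
train_df = pd.DataFrame({
    "text": ["This movie was wonderful", "This movie was terrible"],
    "y":    ["positive", "negative"],
})

# Put the embedding reference before the classifier reference to replace
# the default USE embeddings with LaBSE:
# import nlu
# trained = nlu.load("embed_sentence.labse train.classifier").fit(train_df)
# trained.predict("Der Film war wunderbar")  # German input, English-only training
```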



Finally, this release makes working in Spark environments easier by providing a way to get a Spark DataFrame, regardless of your input data format.
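As one generic illustration (not necessarily the exact API this release adds): NLU's `predict()` returns a pandas DataFrame by default, and a pandas frame can always be lifted back into Spark with `SparkSession.createDataFrame`. A stand-in pandas frame is used below so the conversion itself is visible; the `nlu` and Spark calls are commented out to avoid model downloads and a running cluster.

```python
import pandas as pd

# Stand-in for NLU's prediction output. The real call would be:
# import nlu
# pandas_preds = nlu.load("sentiment").predict(["NLU is simple to use"])
pandas_preds = pd.DataFrame({
    "document":  ["NLU is simple to use"],
    "sentiment": ["positive"],
})

# Lifting the pandas output into a Spark DataFrame (requires pyspark):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.master("local[1]").getOrCreate()
# spark_df = spark.createDataFrame(pandas_preds)
# spark_df.show()
```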



New NLU Multi-Lingual training tutorials

These notebooks showcase how to leverage the powerful Language-Agnostic BERT Sentence Embeddings (LaBSE) to train a language-agnostic classifier.
You can train on a single starting language (e.g. an English dataset), and your model will be able to correctly predict the labels in every one of the 100+ languages covered by the LaBSE embeddings.




New NLU training tutorials (English)

These are simple training notebooks for binary classification in English.
