Major developments in natural language processing (NLP) in the past few years have revolutionized the way we do everything from translation to fraud detection.
But, as with all machine learning technologies, these techniques are subject to the limitations and biases in the data from which they learn.
From racist language infecting Microsoft's Tay chatbot, to Amazon's resume-ranking system being abandoned over gender bias, to Facebook's algorithms failing to detect dangerous posts, there are many examples where these limitations have had serious consequences for companies and the general public.
In this talk, I will provide an overview of issues to consider when training and using word embeddings and language models, along with methods and tools that can help address them.
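One well-known family of mitigation methods works by projecting out a learned "bias direction" from word vectors. As a rough illustration (not the specific techniques covered in the talk), the sketch below shows the neutralize step of hard debiasing in the style of Bolukbasi et al. (2016), using made-up toy vectors in place of real embeddings:

```python
# Toy sketch of the "neutralize" step of hard debiasing:
# remove a word vector's component along a bias direction.
# All vectors here are invented for illustration only.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def scale(u, s):
    return [a * s for a in u]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def neutralize(word_vec, bias_dir):
    """Project out the component of word_vec along bias_dir."""
    coeff = dot(word_vec, bias_dir) / dot(bias_dir, bias_dir)
    return sub(word_vec, scale(bias_dir, coeff))

# A toy "he" minus "she" difference as the bias direction.
he, she = [1.0, 0.2, 0.0], [-1.0, 0.2, 0.0]
bias_dir = sub(he, she)            # [2.0, 0.0, 0.0]

# A profession word that leans toward "he" in this toy space.
doctor = [0.6, 0.5, 0.1]
debiased = neutralize(doctor, bias_dir)
print(debiased)                    # first component becomes 0.0
```

With real embeddings the bias direction is usually estimated from several definitional word pairs, and later work has shown such projections reduce measured bias without fully removing it.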
No approach is perfect, and without inclusive development processes and careful design and review, these methods can do more harm than good.