In this talk, we will discuss a system-centric approach to NLP. There has been huge progress in NLP modeling based on transformer-based neural architectures.
We will discuss how to build a large-scale NLP system that supports high loads, measured in queries per second or in the volume of documents to be processed, as well as a large number of models and a large number of scientists working in parallel.
We will cover several approaches to building production-scale NLP systems and discuss the tradeoffs involved in building such systems.