New advances in Natural Language Processing

Language is at the heart of human intelligence, so it is a core element of efforts to build artificial intelligence.

The field of natural language processing (NLP) has made enormous advances over the past few years, driven by two main technological breakthroughs.

Self-supervised learning

Today’s machine learning success is mostly based on labeled data, manually annotated by humans. The problem is that labeled data is very costly and time-consuming to obtain. Self-supervised learning instead allows training on unlabeled data, because the labels are generated automatically. A good example is a language model trained to predict the next word in a sentence. Text is widely available, and we can automatically split a sentence into its beginning and the next word to predict. Creating a labeled example therefore costs nothing, and the only limit is the amount of textual data available on the Internet.
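To make the idea concrete, here is a minimal sketch (the function name `make_examples` is our own) of how raw text yields labeled training pairs with no human annotation:

```python
def make_examples(sentence):
    """Turn one raw sentence into (context, next-word) training pairs.

    The labels come from the text itself -- the core idea of
    self-supervised language modeling.
    """
    words = sentence.split()
    examples = []
    for i in range(1, len(words)):
        context = words[:i]   # the beginning of the sentence
        target = words[i]     # the word the model must learn to predict
        examples.append((context, target))
    return examples

pairs = make_examples("the cat sat on the mat")
# e.g. (["the", "cat", "sat"], "on") is one automatically labeled example
```

A single sentence of n words yields n−1 examples this way, which is why the supply of training data scales with the amount of text available rather than with annotation budgets.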

A new deep learning architecture known as the transformer

A transformer is a deep learning model that adopts the mechanism of self-attention, differentially weighting the significance of each part of the input data.

Transformers are designed to handle sequential input data, such as natural language, for tasks such as translation and text summarization. However, transformers do not necessarily process the data in order. Rather, the attention mechanism provides context for any position in the input sequence. For example, if the input data is a natural language sentence, the transformer does not need to process the beginning of the sentence before the end; instead, it identifies the context that confers meaning on each word in the sentence. Because attention over all positions can be computed in parallel, this design also reduces training times.
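The weighting described above can be sketched in a few lines of NumPy. This is a deliberately simplified illustration (single head, no learned query/key/value projections), not a full transformer layer:

```python
import numpy as np

def self_attention(X):
    """Minimal self-attention sketch.

    Every position attends to every other position at once, so context
    is available for any word regardless of its place in the sequence.
    X: array of shape (seq_len, d_model), one vector per token.
    """
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)  # pairwise similarity between tokens
    # Softmax over each row turns scores into attention weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ X             # each output is a context-weighted mix

X = np.random.randn(4, 8)  # 4 tokens, 8-dimensional embeddings
out = self_attention(X)    # same shape; each row mixes all token vectors
```

Note that nothing in the computation depends on processing tokens left to right: all pairwise scores are computed in one matrix product, which is what makes the architecture easy to parallelize.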

Main applications of NLP

Next-generation language AI is making the leap from academic research to widespread real-world adoption, and it will transform entire industries in the years ahead.

Developers have begun to apply cutting-edge NLP across sectors with a wide range of different product visions and business models. Given language’s importance throughout society and the economy, NLP will be one of the areas of technology with the most impact in the years ahead.

Building a powerful NLP model today is incredibly resource-intensive and technically challenging. Therefore, very few companies or researchers build their own NLP models from scratch. Instead, most NLP in use today is based on one of a small handful of massive language models from well-known companies and universities. GPT-3, from OpenAI, is perhaps the best-known and most widely used foundation model today. GPT-3 is a generative model: it generates original text in response to prompts from human users.
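A generative model extends a prompt by repeatedly predicting the next word. The following toy sketch illustrates only that generation loop; the hand-made bigram table stands in for the probabilities a real foundation model like GPT-3 learns from enormous text corpora:

```python
import random

# Hypothetical toy "model": which words may follow which.
# A real language model learns such probabilities from vast amounts of text.
BIGRAMS = {
    "the": ["cat", "dog"],
    "cat": ["sat"],
    "dog": ["ran"],
    "sat": ["down"],
    "ran": ["away"],
}

def generate(prompt, steps=3, seed=0):
    """Extend a prompt by repeatedly sampling a plausible next word."""
    random.seed(seed)
    words = prompt.split()
    for _ in range(steps):
        options = BIGRAMS.get(words[-1])
        if not options:          # no known continuation: stop generating
            break
        words.append(random.choice(options))
    return " ".join(words)

print(generate("the"))  # e.g. "the cat sat down" or "the dog ran away"
```

The same prompt-in, text-out interface is what products built on large foundation models expose, just backed by a learned model instead of a lookup table.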

However, there is also tremendous opportunity in this category for younger startups like Label to innovate.

NLP has proven to be extremely useful for the following applications:

  • Search
  • Writing assistants
  • Translation
  • Sales intelligence
  • Chatbots
  • Conversational voice assistants
  • Contact centers
  • Content moderation
