Get to Know Natural Language Processing
Learning Objectives
After completing this unit, you’ll be able to:
- Describe natural language processing.
- Discuss everyday uses of natural language processing.
- Explain how it has evolved since the 1950s.
- Differentiate between natural language processing, natural language understanding, and natural language generation.
Trailcast
If you'd like to listen to an audio recording of this module, please use the player below. When you’re finished listening to this recording, remember to come back to each unit, check out the resources, and complete the associated assessments.
Before You Start
This badge contains terms like neural networks and deep learning that are described in detail in the Artificial Intelligence Fundamentals and Generative AI Basics badges. We recommend that you earn those badges first.
What Is Natural Language Processing?
Natural language processing (NLP) is a field of artificial intelligence (AI) that combines computer science and linguistics to give apps and AI assistants the ability to understand, interpret, and generate human language in a way that’s meaningful and useful to humans. NLP helps apps, AI assistants, and autonomous agents perform tasks like understanding the meaning of sentences, recognizing important details in text, translating languages, answering questions, summarizing text, and generating responses that resemble human responses.
NLP is already so commonplace in our everyday lives that we usually don’t even think about it when we interact with it or when it does something for us. For example, many people use ChatGPT to generate or summarize text or answer questions. Email or document creation apps automatically suggest words or phrases we could use next. You may ask a virtual assistant, like Siri, to do something for you, like remind you to water your plants on Tuesdays. Or, you might use autonomous agents to book a vacation, including transportation and tours around your destination.
The agents you engage with when you contact a company’s customer service use NLP, and so does the translation app you use to help you order a meal in a different country. Spam detection, your online news preferences, and so much more rely on NLP.
A Very Brief History of NLP
It’s worth mentioning that NLP is not new. In fact, its roots wind back to the 1950s, when researchers began using computers to understand and generate human language. One of the first notable contributions to NLP was the Turing Test. Proposed by Alan Turing in 1950, this test measures a machine’s ability to carry on a conversation in a way that’s indistinguishable from a human. Shortly after that, the first machine translation systems were developed. These were sentence- and phrase-based language translation experiments that didn’t progress very far because they relied on very specific patterns of language, like predefined phrases or sentences.

By the 1960s, researchers were experimenting with rule-based systems that allowed users to ask the computer to complete tasks or have conversations.
The 1970s and 80s saw more sophisticated knowledge-based approaches using linguistic rules, rule-based reasoning, and domain knowledge for tasks like executing commands and diagnosing medical conditions.
Statistical approaches (that is, learning from data) to NLP were popular in the 1990s and early 2000s, leading to advances in speech recognition, machine translation, and machine learning algorithms. During this period, the growth of the World Wide Web in the early 1990s made vast amounts of text-based data readily available for NLP research.

Since about 2009, neural networks and deep learning have dominated NLP research and development. NLP areas of translation and natural language generation, including ChatGPT, have vastly improved and continue to evolve rapidly.
Human Language Is “Natural” Language
What is natural language anyway? Natural language refers to the way humans communicate with each other using words and sentences. It’s the language we use in conversations, and when we read, write, or listen. Natural language is the way we convey information, express ideas, ask questions, tell stories, and engage with each other on social media. But how does AI interpret natural language? To answer that, we need to look at how information and data are structured.
Note: While NLP models have been developed for many different human languages, this module focuses on NLP in the English language.
Structured and Unstructured Data
In the past, for a computer to understand what we mean, information needed to be well-defined and organized, similar to what you might find in a spreadsheet or a database. This is called structured data. The information included in structured data and how the data is formatted is ultimately determined by algorithms used by the end application, and usually requires additional data entry or data parsing.
Here’s how the data about an adoptable shelter dog might look as structured data in a database that helps match pets with potential adopters. Think about how output from this type of data, like search results for a particular type of pet or a description for a website, would be formulaic and limited to specific uses.
- Name: Tala
- Age: 5
- Spayed or Neutered: Spayed
- Sex: Female
- Breed: Husky
- Weight: 65 lbs.
- Color: Gray and white
- Eye color: Blue
- Good with children: Yes
- Good with cats: Yes
- Favorite activities: Parks, hikes, being brushed
- Location: Troutdale
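As a rough sketch, here’s how Tala’s record might look in code, along with the kind of formulaic search that structured data supports. The field names and the second dog are hypothetical, purely for illustration:

```python
# Illustrative sketch: adoptable dogs as structured records.
# Field names (and the dog "Rex") are hypothetical, not from a real shelter database.
dogs = [
    {"name": "Tala", "age": 5, "breed": "Husky", "weight_lbs": 65,
     "good_with_children": True, "good_with_cats": True,
     "location": "Troutdale"},
    {"name": "Rex", "age": 3, "breed": "Beagle", "weight_lbs": 28,
     "good_with_children": True, "good_with_cats": False,
     "location": "Troutdale"},
]

# Structured queries can only match exact, predefined fields,
# so the results are formulaic and limited to those fields.
matches = [d["name"] for d in dogs if d["good_with_cats"] and d["age"] >= 4]
print(matches)  # ['Tala']
```

Notice that a query like “a calm dog for a family with cats” has no place to go here: every question must be expressed in terms of the predefined fields.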
However, natural language, the way we actually speak, is unstructured. While humans can usually derive meaning from it, AI needs tools like retrieval-augmented generation (RAG), which connects a business’s data or knowledge base to large language models (LLMs), to make sense of text and speech and to improve the context and accuracy of generated output.
The following paragraph is an example of how the same information about a shelter dog, presented as unstructured data, can be used by AI to provide much more contextually and conversationally rich output across many use cases.
Tala is a 5-year-old spayed, 65-pound female husky who loves to play in the park and take long hikes. She is very gentle with young children and is great with cats. This blue-eyed sweetheart has a long gray and white coat that will need regular brushing. You can schedule a time to meet Tala by calling the Troutdale shelter.
Natural Language Understanding and Natural Language Generation
Today’s NLP matured along with its two subfields: natural language understanding (NLU) and natural language generation (NLG). The process of turning unstructured data into structured data is called natural language understanding. NLU uses many techniques to interpret written or spoken language and understand the meaning and context behind it. You learn about these techniques in the next unit.
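To make the unstructured-to-structured direction concrete, here’s a minimal rule-based sketch that pulls a few fields out of Tala’s description. Real NLU systems use statistical or neural models, not hand-written patterns; the regular expressions below are hypothetical and fit only this one example:

```python
import re

# A minimal rule-based sketch of NLU: extracting structured fields
# from unstructured text. The patterns are illustrative only.
text = ("Tala is a 5-year-old spayed, 65-pound female husky who loves "
        "to play in the park and take long hikes.")

record = {}
m = re.search(r"^(\w+) is a (\d+)-year-old", text)
if m:
    record["name"] = m.group(1)
    record["age"] = int(m.group(2))
m = re.search(r"(\d+)-pound", text)
if m:
    record["weight_lbs"] = int(m.group(1))
m = re.search(r"\b(male|female)\b", text)
if m:
    record["sex"] = m.group(1).capitalize()

print(record)  # {'name': 'Tala', 'age': 5, 'weight_lbs': 65, 'sex': 'Female'}
```

Brittle patterns like these are exactly why the field moved from rules to statistical and then neural approaches: a small change in wording breaks the extraction.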
Processing data the reverse way, from structured to unstructured, is called natural language generation (NLG). NLG is what enables AI assistants to generate human-like language. It involves developing algorithms and models that convert structured data or information into meaningful, contextually appropriate, natural-sounding text or speech. NLG also includes generating code in a programming language, such as a Python function for sorting strings.
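For the structured-to-unstructured direction, here’s a minimal template-based NLG sketch that turns a record back into prose. The record fields are illustrative, and modern NLG systems use neural language models rather than fixed templates like this:

```python
# A minimal template-based NLG sketch: converting a structured record
# into natural-sounding text. Field names are hypothetical.
dog = {
    "name": "Tala",
    "age": 5,
    "breed": "husky",
    "good_with_cats": True,
    "location": "Troutdale",
}

def generate_blurb(d):
    # Build one sentence per fact, varying output based on the data.
    sentences = [f"{d['name']} is a {d['age']}-year-old {d['breed']}."]
    if d["good_with_cats"]:
        sentences.append(f"{d['name']} gets along great with cats.")
    sentences.append(f"You can meet {d['name']} at the {d['location']} shelter.")
    return " ".join(sentences)

print(generate_blurb(dog))
```

Template output like this is grammatical but repetitive across records, which is one reason neural generation, with its more varied and contextual phrasing, has largely replaced templates.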
In the past, NLU and NLG tasks made use of explicit linguistic structured representations like parse trees. While NLU and NLG are still critical to NLP today, most of the apps, tools, and virtual assistants we communicate with have evolved to use deep learning or neural networks to perform tasks end to end. For instance, a neural machine translation system may translate a sentence from, say, Chinese directly into English without explicitly creating any kind of intermediate structure. Neural networks recognize patterns, words, and phrases to make language processing exponentially faster and more contextually accurate.
In the next unit, you learn more about the natural language methods and techniques that enable AI assistants to make sense of what we say and respond accordingly.
