Discover the Science Behind Voice

Learning Objectives

After completing this unit, you'll be able to:

  • Articulate the basics of how voice computing works.
  • Describe automatic speech recognition and natural language understanding.

So, How Does Einstein Voice Actually Work?

On the way back to his office after a meeting, Hans begins to dictate his meeting notes into his phone.

“Hey Einstein, take a note. It looks like the opportunity with Relativity Stores is bigger than I thought. It's probably worth about $500,000. I need to follow up with the VP, Anna, next week.”

Voice technology uses automatic speech recognition (ASR) to detect his voice command. Once it recognizes that Hans has spoken, it returns a transcription of what he said to his device. It then classifies the intent of his comment, determining whether it was a question or a command. Next, it extracts and classifies any named entities (for example, people, places, and things) and uses search to generate candidate matches for each entity. Finally, through natural language understanding (NLU), it recommends the candidates that best fit his intent.
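To make the "uses search to generate candidates" step a little more concrete, here's a minimal sketch. It assumes a small, made-up list of account names and uses Python's standard `difflib` fuzzy matching as a stand-in for search. The account list and the `candidate_accounts` helper are hypothetical and aren't part of Einstein Voice.

```python
from difflib import get_close_matches

# Hypothetical account names; a real org would search its own records.
ACCOUNTS = ["Relativity Stores", "Relative Motion Inc", "Gravity Outfitters"]


def candidate_accounts(mention, names=ACCOUNTS, limit=3):
    """Return up to `limit` account names that resemble the spoken mention."""
    # Compare in lowercase so capitalization in the transcript doesn't matter.
    by_lowercase = {name.lower(): name for name in names}
    hits = get_close_matches(
        mention.lower(), list(by_lowercase), n=limit, cutoff=0.5
    )
    # get_close_matches returns the best match first.
    return [by_lowercase[hit] for hit in hits]


# A slightly imperfect transcription of the account name.
candidates = candidate_accounts("relativity store")
print(candidates[0])
```

The closest match comes first in the list, so the top recommendation for this mention is Relativity Stores.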

Let’s take a closer look at this process step by step:

  • When Hans begins speaking into his device, the computer identifies and processes the human voice. It takes frames of audio from his voice and passes them through an automatic speech recognition model. ASR is a deep-learning sequence model that takes the frames of audio and converts them into sequences of text. This process repeats until Hans finishes recording his note.
  • Next, with natural language understanding, a subset of natural language processing, the model takes these unstructured data inputs and converts them into a structured form that a machine can then understand and act upon. So, for Hans’s note above, NLU goes through the following steps:
    • Classifies Hans’s intent from the note
      • “Hey Einstein, take a note (create a memo). It looks like the opportunity (update an opportunity) with Relativity Stores is bigger than I thought. It's probably worth about $500,000. I need to follow up (set a reminder) with the VP, Anna, next week.”
    • Recognizes the different entities from the note
      • “Hey Einstein, take a note. It looks like the opportunity with Relativity Stores (account) is bigger than I thought. It's probably worth about $500,000 (currency). I need to follow up with the VP, Anna (contact), next week (date).”
    • Maps the intents and entities to actions and field updates in Salesforce
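The first two NLU steps above can be sketched with simple keyword rules and regular expressions. This is only a toy illustration: the phrase list, the patterns, and the intent labels are assumptions made up for this example, and a production NLU system would rely on trained models rather than hand-written rules.

```python
import re

NOTE = (
    "Hey Einstein, take a note. It looks like the opportunity with "
    "Relativity Stores is bigger than I thought. It's probably worth "
    "about $500,000. I need to follow up with the VP, Anna, next week."
)

# Hypothetical trigger phrases mapped to hypothetical intent labels.
INTENT_RULES = {
    "take a note": "create_memo",
    "the opportunity": "update_opportunity",
    "follow up": "set_reminder",
    "follow-up": "set_reminder",
}

# Hypothetical patterns; the account and contact names are hard-coded
# here only because this sketch has no records to search against.
ENTITY_PATTERNS = {
    "account": r"\bRelativity Stores\b",
    "currency": r"\$\d[\d,]*",
    "contact": r"\bAnna\b",
    "date": r"\b(?:next|this) (?:week|month)\b",
}


def classify_intents(text):
    """Return each intent whose trigger phrase appears in the text."""
    lowered = text.lower()
    found = []
    for phrase, intent in INTENT_RULES.items():
        if phrase in lowered and intent not in found:
            found.append(intent)
    return found


def extract_entities(text):
    """Return the first match for each entity pattern, keyed by label."""
    entities = {}
    for label, pattern in ENTITY_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            entities[label] = match.group(0)
    return entities


intents = classify_intents(NOTE)
entities = extract_entities(NOTE)
print(intents)
print(entities)
```

For Hans's note, the rules find the same three intents and four entities that are annotated in the list above.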

Image of meeting debrief note and corresponding Salesforce updates based on Einstein’s Analysis.

So what just happened? Einstein automatically transformed the unstructured data of Hans’s note into structured data in Salesforce. From just a short dictation, Hans was able to log the note, create a task, and update a field on the related opportunity. And this isn't limited to the opportunity record: Salesforce admins can deploy this and help map unstructured data to any object in Salesforce.
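Here's a rough sketch of that last mapping step, turning intents and entities into a list of planned record changes. Everything in it is an assumption for illustration: the `RecordUpdate` class, the intent labels, and the field names are hypothetical, and nothing here calls a Salesforce API.

```python
from dataclasses import dataclass, field


@dataclass
class RecordUpdate:
    target: str  # for example "Note", "Task", or "Opportunity"
    action: str  # "create" or "update"
    fields: dict = field(default_factory=dict)


def plan_updates(note_text, intents, entities):
    """Translate intents and entities into a list of planned record changes."""
    updates = []
    if "create_memo" in intents:
        updates.append(RecordUpdate("Note", "create", {"body": note_text}))
    if "set_reminder" in intents:
        contact = entities.get("contact", "contact")
        updates.append(RecordUpdate(
            "Task", "create",
            {"subject": f"Follow up with {contact}", "due": entities.get("date")},
        ))
    if "update_opportunity" in intents and "currency" in entities:
        # "$500,000" -> 500000
        amount = int(entities["currency"].replace("$", "").replace(",", ""))
        updates.append(RecordUpdate(
            "Opportunity", "update",
            {"account": entities.get("account"), "amount": amount},
        ))
    return updates


# Hypothetical NLU output for Hans's note.
sample_intents = ["create_memo", "update_opportunity", "set_reminder"]
sample_entities = {
    "account": "Relativity Stores",
    "currency": "$500,000",
    "contact": "Anna",
    "date": "next week",
}

planned = plan_updates("Meeting debrief", sample_intents, sample_entities)
for update in planned:
    print(update.action, update.target, update.fields)
```

The sketch produces one planned change for each outcome in Hans's example: a logged note, a new task, and an updated opportunity field.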

While you don't need to understand the specifics of voice technology to get started with Einstein Voice, it’s important to understand the basics of how voice computing works. As you can see, a lot of technical components go into it, but what’s most important to remember is that voice technology takes what you say and interprets it so that it’s understandable and actionable for a machine. It does this using two main types of technology: automatic speech recognition and natural language understanding.
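To picture the first of those two technologies, here's a small sketch of the frame-by-frame loop described earlier: audio is cut into frames, and each frame is handed to a model until the recording ends. The `ToyRecognizer` class is a hypothetical stand-in that only counts frames; it is not a real speech model. The frame size of 400 samples is an assumption (25 ms of audio at a 16 kHz sample rate).

```python
def split_into_frames(samples, frame_size):
    """Yield consecutive chunks of audio samples; the last may be shorter."""
    for start in range(0, len(samples), frame_size):
        yield samples[start:start + frame_size]


class ToyRecognizer:
    """Hypothetical stand-in for a sequence model. It does no recognition."""

    def __init__(self):
        self.frames_seen = 0

    def accept(self, frame):
        # A real model would emit text here; this placeholder just counts.
        self.frames_seen += 1
        return f"<frame {self.frames_seen}: {len(frame)} samples>"


def transcribe(samples, frame_size=400):
    """Feed frames to the recognizer one by one until the audio runs out."""
    recognizer = ToyRecognizer()
    return [recognizer.accept(frame)
            for frame in split_into_frames(samples, frame_size)]


# 1,000 silent samples stand in for a short recording.
outputs = transcribe([0] * 1000)
for line in outputs:
    print(line)
```

The loop structure is the point here: the model sees the audio a frame at a time and keeps producing output until the speaker is done.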

Next Steps

Become an AI Trailblazer with our Salesforce Einstein Trailmix and start thinking about how voice technology can impact you and your business. You can learn how to start with some basic Salesforce voice integrations now with the Innovate with Alexa and Amazon Web Services trail.