Upgrade AI with Real-World Data

Learning Objectives

After completing this unit, you’ll be able to:

  • Describe the benefits of using Agentforce Data Library.
  • Define four key concepts involved in AI data transformation and organization.
  • Explain how the Agentforce Data Library setup and runtime processes work.

Before You Start

Before you start this module, consider completing this recommended content.

Why Ground AI with Data?

Your data plays a crucial role in ensuring that AI systems operate accurately and effectively. Give a customer the wrong answer, and it could discourage them from future purchases. Provide service reps with incorrect information, and they might frustrate customers rather than support them. Deliver outdated recommendations to your sales reps, and they could miss their earnings targets and lose valuable business opportunities.

While data is the backbone of any successful AI system, AI models are born generalists: They train on massive datasets that give them a broad base of knowledge. That general training, however, doesn’t include the specialized information needed to perform specific tasks or answer technical questions for your unique use cases.

Real-world data grounding takes AI models beyond their static training sets. When you ground your AI model in verified sources of information like your Salesforce knowledge base, your uploaded files, or websites, the LLM can more accurately return responses to customer inquiries, suggest better replies to agents, provide sophisticated search summaries, and more.

A diagram with symbols for knowledge, files and web search, pointing to another bubble titled AI grounding, that leads to a chat window with an Agentforce symbol.

The Challenge of Enterprise Data

Most companies store their knowledge bases in unstructured formats such as videos, images, documents, emails, sensor data, social posts, and audio files, which don’t fit neatly into spreadsheets or databases. Accounting for nearly 90% of enterprise data, this data is harder to search, but it’s packed with valuable insights such as customer feedback, perceptions, opinions, tone, and sentiment. So how can you unlock this data’s potential?

Enter Agentforce Data Library, a powerful tool that can help you ground AI in your real-world data. With the Agentforce Data Library, you can easily connect your knowledge base to Salesforce AI features, ensuring you get up-to-date, AI-generated content that’s tailored to your organization and use cases. When you set up an Agentforce Data Library, you get the tools you need to transform large sets of unstructured or semi-structured data into more useful, searchable content. Let’s see how.

Transform Data for Efficient Use with Large Language Models

Agentforce Data Libraries make it easy to link agents and large language models (LLMs) to your unstructured data by automating several configuration steps across Data Cloud and Prompt Builder. This includes pushing data streams to Data Cloud, mapping data objects, and creating a search index and retriever. The end result is that your AI tools are always working with the most up-to-date and relevant information.

Before you learn the simple steps to set up a data library, let’s review a few key concepts: grounding, chunking, indexing, and retrievers.

Grounding

Grounding is when you add domain-specific knowledge or customer information to a prompt, giving the LLM the context it needs to respond more accurately to a question or task. As we’ve mentioned, your grounding sources can include knowledge articles, uploaded files, websites, conversation transcripts, and more. However, lengthy and complex documents can be time-consuming and resource-intensive to search through, and LLMs have a maximum token or word limit for the amount of text they can process at one time.
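To make the idea concrete, here’s a minimal sketch of grounding: domain-specific context is merged into a prompt template before the prompt is sent to an LLM. The template text and function names are hypothetical, not a Salesforce API; they simply illustrate the concept.

```python
# Illustrative grounding sketch: retrieved knowledge snippets are inserted
# into a prompt template so the LLM answers from your data, not just its
# general training. All names here are made up for illustration.

PROMPT_TEMPLATE = """You are a support agent for Acme Co.
Use ONLY the context below to answer the question.

Context:
{context}

Question: {question}
"""

def ground_prompt(question: str, context_snippets: list[str]) -> str:
    """Builds an enriched (grounded) prompt from knowledge snippets."""
    context = "\n---\n".join(context_snippets)
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = ground_prompt(
    "How do I reset my router?",
    ["Article 42: Hold the reset button for 10 seconds to factory-reset."],
)
print(prompt)
```

Note that every snippet added to the context consumes part of the LLM’s token limit, which is exactly why long documents are chunked rather than passed in whole.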

Chunking and Indexing

To address this, data sources are broken down into smaller parts called chunks. The system then searches through the chunks and returns only the most relevant pieces of information for the LLM to consider.

A diagram of the chunking and indexing process

Once the data is chunked, it’s organized and categorized into a search index. Storing information in an organized search index makes it easier and faster to retrieve specific data when needed: searches become more efficient, results more relevant, and very large datasets manageable.

Think of a large online store with millions of products. A well-organized store catalog or website taxonomy enables customers to quickly find the products they’re looking for across categories like name, type, brand, or even specific features. Breaking up data into smaller pieces and organizing them in a search index is like creating a catalog for your content. LLMs can then use this catalog or index to find the right information to answer users’ queries.

Retrievers

Retrievers act as pointers between data and features. They are designed to automatically extract and provide relevant data from different databases, systems, or platforms. When a user asks a question, the retriever assigned to each data library determines which datasets in Data Cloud the Salesforce AI tools can access. This makes retrievers particularly important in applications like search engines, question-answering systems, and recommendation systems.
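A retriever’s job can be sketched as "score the indexed chunks against the query and hand back the best matches." The version below uses simple keyword overlap for scoring; Data Cloud’s retrievers use semantic search, so treat this purely as an illustration.

```python
# Hedged retriever sketch: rank stored chunks by how many terms they share
# with the user's query, and return the top match. Real retrievers score by
# semantic similarity; the chunks and names here are illustrative only.

import re

CHUNKS = [
    "To reset your password, open Settings and choose Security.",
    "Refunds are issued within 5 business days of approval.",
    "Routers can be restarted from the admin console.",
]

def _terms(text: str) -> set[str]:
    """Lowercases and tokenizes text, dropping punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Returns the top_k chunks sharing the most terms with the query."""
    query_terms = _terms(query)
    ranked = sorted(
        chunks,
        key=lambda c: len(query_terms & _terms(c)),
        reverse=True,
    )
    return ranked[:top_k]

best = retrieve("how do I reset my password", CHUNKS)
```

For the query above, the password-reset chunk wins because it shares the terms "reset" and "password" with the query, while the other chunks share none.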

You just learned some basics about data organization in AI. Next, let’s see how these processes play out during data library setup and runtime.

What Happens at Setup?

When you create a data library, the processes that connect your data with your AI agents and features begin immediately. First, a data stream is created, followed by the data lake and data model objects. These objects are then mapped together, and data chunking begins. The time required for chunking varies based on factors such as the number, size, and complexity of knowledge articles or uploaded files, and the number of knowledge fields selected for chunking. After chunking is completed and the search index is ready, a retriever is created. Each Agentforce Data Library has its own unique retriever, which can point to the same search index but operates independently.
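The last point, that each library gets its own retriever even when libraries share a search index, can be pictured with a tiny object sketch. The class and library names below are hypothetical stand-ins, not Salesforce objects.

```python
# Illustrative sketch only: two data libraries, each with an independent
# retriever, pointing at the same underlying search index.

class SearchIndex:
    def __init__(self, name: str):
        self.name = name

class Retriever:
    """One retriever per data library; may share an index with others."""
    def __init__(self, library: str, index: SearchIndex):
        self.library = library
        self.index = index

shared_index = SearchIndex("knowledge_index")
faq_retriever = Retriever("FAQ Library", shared_index)        # hypothetical
policy_retriever = Retriever("Policy Library", shared_index)  # hypothetical

# Distinct retriever objects, same underlying index.
same_index = faq_retriever.index is policy_retriever.index
independent = faq_retriever is not policy_retriever
```

Because the retrievers are independent objects, each library can control its own access to the shared index without affecting the other.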

A diagram of the Agentforce Data Library process

What Happens at Runtime?

Once the retriever is set up and the search index is fully prepared, the system is ready to handle user queries at runtime.

At runtime, the user’s query is added to the prompt template, which references the retriever connecting to relevant data. The system then searches through the search index to find the most pertinent information and incorporates it back into the prompt. The LLM receives this enriched prompt, which includes the user’s query, the added information, and the prompt instructions, and then generates a response. The Service Planner reviews this response to ensure it aligns with the prompt instructions. Finally, the end user receives a response that accurately answers the query and is contextualized with relevant, domain-specific information tailored to the specific task.
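The runtime steps above can be strung together in one small sketch: take the query, retrieve the best-matching chunk, enrich the prompt with it, and pass the result to the LLM. The `fake_llm` stand-in is hypothetical and just echoes the grounded context, so you can see what actually reaches the model.

```python
# End-to-end runtime sketch: query -> retrieve -> enrich prompt -> LLM.
# The knowledge chunks, prompt wording, and fake_llm stub are illustrative;
# in Salesforce this flow is handled by the prompt template and retriever.

import re

KNOWLEDGE = [
    "Orders ship within 2 business days.",
    "Returns are accepted for 30 days after delivery.",
]

def retrieve(query: str, chunks: list[str]) -> str:
    """Returns the chunk with the most keyword overlap with the query."""
    terms = set(re.findall(r"\w+", query.lower()))
    return max(chunks, key=lambda c: len(terms & set(re.findall(r"\w+", c.lower()))))

def build_prompt(query: str, context: str) -> str:
    """Enriches the prompt with retrieved context plus the user's query."""
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

def fake_llm(prompt: str) -> str:
    """Stub LLM: echoes the grounded context line from the prompt."""
    return "Based on the provided context: " + prompt.splitlines()[1]

query = "How long do returns stay open?"
response = fake_llm(build_prompt(query, retrieve(query, KNOWLEDGE)))
```

Swap the keyword retriever for semantic search and the stub for a real model call, and this is the same enrich-then-generate shape the data library automates for you.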

A flowchart showing the runtime process

Let’s Recap

Great job! In this unit, you learned why it’s important to ground AI with your data, and you explored some specialized terminology and technical processes. Now it’s time to get to the setup, where you can see just how simple it can be!

Resources
