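This badge requires a new custom Data Cloud Playground

You have a limited amount of time to complete this badge and any other badges that require a Data Cloud Playground. When the allotted time ends, you can no longer access this Playground and might have to start over from the beginning.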

Create a Search Index Configuration

Learning Objectives

After this unit, you’ll be able to:

Describe how search index configurations and grounding work in Data Cloud.
Create a vector search index configuration.

Ground Search on Unstructured Data with Search Index Configurations

Grounding search on unstructured and structured data enhances your use of generative AI, analytics, and automation tools across the Salesforce platform. Grounded search brings customer-specific data into applications like Agentforce, Tableau, and Flow Builder, ensuring that outputs are finely tuned to your users’ intents and contexts. This alignment results in more accurate and relevant AI-generated content, deeper insights from analytics, and more efficient automation workflows for your teams and customers.

To ground search, you must break your unstructured data into semantically appropriate chunks and then create vector embeddings (numerical representations of the chunked data) from those chunks. The chunked content is stored in the Data Cloud search index, where it is searchable from and usable in Einstein generative AI applications (Prompt Builder and Agentforce), automation applications (Flow Builder), and analytics applications (Tableau).

Chunk Unstructured Data

In the previous unit, we covered how Data Cloud references unstructured data through unstructured data model objects (UDMOs). You can also chunk UDMOs or any DMOs with text fields, such as Salesforce Knowledge articles. This is what you’ll do in this unit.

When you chunk UDMOs or DMOs, you break them down into manageable, semantically meaningful chunks. These units of text are stored in Data Cloud in chunk data model objects (CDMOs), which are created from data model objects or unstructured data model objects.

Understand How Chunking Works

Data Cloud supports several chunking strategies.

Semantic-based passage extraction uses the semantic meaning inherent in HTML tags to chunk a document into passages. HTML elements such as headings (<h1>, <h2>), lists (<ul>, <ol>), or bold text (<strong>) acting as a subheading are considered logical boundaries for passages.

Window-based passage extraction uses block-level elements such as <div> and <p> tags, or raw text separated by line breaks, to chunk documents into passages. If a paragraph doesn’t contain any HTML, the extraction is done at the sentence level.
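To make the idea of HTML tags acting as passage boundaries concrete, here is a minimal Python sketch. It is not Data Cloud's implementation: the BOUNDARY_TAGS set, the PassageSplitter class, and the split_into_passages function are hypothetical names chosen for illustration, and the real strategy considers more signals (such as bold text used as a subheading).

```python
from html.parser import HTMLParser

# Tags treated as passage boundaries in this sketch (an assumption for illustration).
BOUNDARY_TAGS = {"h1", "h2", "h3", "ul", "ol"}


class PassageSplitter(HTMLParser):
    """Collects text into passages, starting a new passage at each boundary tag."""

    def __init__(self):
        super().__init__()
        self.passages = []
        self._current = []

    def _flush(self):
        # Join the collected text fragments and normalize whitespace.
        text = " ".join(" ".join(self._current).split())
        if text:
            self.passages.append(text)
        self._current = []

    def handle_starttag(self, tag, attrs):
        # A heading or list opens a new passage, so close the previous one.
        if tag in BOUNDARY_TAGS:
            self._flush()

    def handle_data(self, data):
        if data.strip():
            self._current.append(data.strip())

    def close(self):
        # Let the parser emit any buffered text, then store the last passage.
        super().close()
        self._flush()


def split_into_passages(html):
    """Return the list of text passages found in an HTML string."""
    splitter = PassageSplitter()
    splitter.feed(html)
    splitter.close()
    return splitter.passages
```

In this sketch, each heading starts a new passage and the paragraphs that follow it stay attached to that heading, which is the intuition behind using document structure rather than a fixed character count.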

Learn more about chunking strategies in Salesforce Help.

For now, let’s see what happens after your data is chunked.

Create Vector Embeddings from Chunked Content

After Data Cloud chunks your content, it creates a vector embedding—a numerical representation of the chunked content that can be retrieved or used in your Salesforce generative AI, automation, or analytics applications.

Vector embeddings are numerical representations of text that store relationships between words or phrases. The embedding captures the semantic meaning of the content, so chunks of content that are semantically similar have similar vector embeddings. These representations help machines to process and comprehend language effectively.
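The claim that semantically similar chunks have similar embeddings can be illustrated with cosine similarity, a common way to compare vectors. The sketch below uses made-up three-dimensional vectors; real embeddings come from an embedding model and have many more dimensions, and the source does not state which similarity measure Data Cloud uses.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as having no similarity to anything.
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for illustration only.
reset_password = [0.9, 0.1, 0.0]
forgot_login = [0.8, 0.2, 0.1]
shipping_rates = [0.0, 0.2, 0.9]

# The two account-related chunks score closer to each other
# than either does to the shipping chunk.
related = cosine_similarity(reset_password, forgot_login)
unrelated = cosine_similarity(reset_password, shipping_rates)
```

A score near 1 means the vectors point in nearly the same direction, and a score near 0 means they have little in common.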

In Data Cloud, vector embeddings are referenced by index data model objects (IDMOs), which we take a closer look at later in this unit.

Read more about vector embeddings and chunked content in Salesforce Help.

Create Vector Search Index Configurations

To get your unstructured data ready for search, you need to chunk and vectorize it. To do so, you create a search index configuration. Create a search index configuration for any data object with text fields that contain informational concepts, narratives, or detailed descriptions that your users search to find relevant results. Examples of such data are Salesforce Knowledge articles and other text documents (like chat transcripts) stored in an external blob store like Amazon S3.

Create a Vector Search Index Configuration from Knowledge Articles

In the previous unit, you created a data stream and data lake object from the Knowledge bundle in the Salesforce CRM connector, which provides a handful of sample Knowledge articles.

The Knowledge Article Version object is useful to index, as you can use this object to query, retrieve, or search across all types of articles depending on their version. The Knowledge Article Version object includes these fields that should be indexed for search.

Name: The name or title of the Knowledge article
Description: The description or summary of the Knowledge article, mapped from Summary
Custom text fields: Any rich text fields (131K limit) that hold unstructured data

Create a Vector Search Index Configuration for the Knowledge Article Version DMO

You'll complete these steps in your Data Cloud org in order to pass the challenge at the end of this unit.

Advanced Setup gives you more control over chunking and vectorization choices, but for this challenge, you’ll mostly use the defaults.

If you haven’t already, launch your Data Cloud playground.
From App Launcher, select Data Cloud.
Click Search Index | New.
If you don’t see Search Index in the Data Cloud navigation, click the More drop-down menu, and then select Search Index.
Click Advanced Setup | Next.
From the Select Source Object page, select Vector Search, the Knowledge Article Version DMO, and click Next.
On the Select Fields to Chunk page, click Manage Fields.
Click Select All Fields, and click Save.
Leave the default chunking strategy, and click Next.
On the Select a Vectorization Strategy page, leave the default vectorization strategy, and click Next.
On the Select Related Fields for Search Filtering page, do not add any fields, and click Next.
On the Search Index Configuration Details page, replace the auto-generated Search Index Configuration Name with My_kav. (The Search Index Configuration API Name is automatically populated.)
Click Save.

That’s it! Your new search index configuration, My_kav, is listed on the Search Index tab.

View the Knowledge Article Version CDMO and IDMOs

After you create a search index configuration, its status is Submitted and then changes to In Progress as it processes data from the source DMO or UDMO. If there are no failures, the status then changes to Ready. You won’t see any records in Data Explorer until the search index status is Ready.

It can take several minutes for Data Cloud to process the data in the search index, and the time can vary, so go grab a beverage or stretch your legs. When you come back, click Refresh and check whether the search index status is Ready.
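The refresh-and-check routine described above follows a simple polling pattern, sketched here in Python. This is an illustration only: wait_until_ready and the get_status callable are hypothetical, no Data Cloud API is called, and treating any status other than Submitted, In Progress, or Ready as an error is an assumption, because the source doesn't name the failure status.

```python
import time

# Statuses that mean "still working", taken from the status flow described above.
PENDING_STATUSES = {"Submitted", "In Progress"}


def wait_until_ready(get_status, poll_seconds=30.0, max_polls=20, sleep=time.sleep):
    """Poll get_status() until it returns "Ready".

    get_status is a hypothetical callable supplied by the caller; how the
    status is actually obtained is outside the scope of this sketch.
    Returns True when the status is "Ready", or False if max_polls checks
    pass without reaching it.
    """
    for attempt in range(max_polls):
        status = get_status()
        if status == "Ready":
            return True
        if status not in PENDING_STATUSES:
            # Assumption: anything else indicates a failure or unknown state.
            raise RuntimeError(f"Unexpected search index status: {status}")
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    return False
```

Passing the sleep function as a parameter keeps the sketch easy to test without real waiting.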

The most useful content in a Knowledge article is found in the Description field. Usually, the sample articles are small enough that there is just one chunk. This means that for each record in the Knowledge Article Version CDMO and IDMO, there's one chunk and one vector respectively, but lengthier content could have more records in each DMO.
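The relationship between content length and chunk count can be sketched with a simple sentence-level chunker. This is a simplified stand-in, not Data Cloud's chunking logic: the chunk_by_sentences function and its max_chars budget are assumptions made for illustration.

```python
import re


def chunk_by_sentences(text, max_chars=200):
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept intact as its own chunk.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget, so start a new chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

A short description fits in one chunk and therefore yields one chunk record and one vector record, while a longer field produces several of each.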

Let’s take a quick look at the CDMO and IDMO we created for the Knowledge Article Version DMO.

Confirm that the search index status is Ready.
From Data Cloud, click Data Explorer.
From the Object drop-down menu, select Data Model Object.
From the Select an Object field, select My_kav chunk.
Now you should be able to view a list of all the chunks Data Cloud created from the sample Knowledge articles.
From the Select an Object field, select My_kav index.
Now you should be able to view a list of all the vector records Data Cloud created from the sample Knowledge articles.

You can use the CDMO and IDMOs contained in the search index throughout Salesforce in applications like Flow Builder, Agentforce, Prompt Builder, and even Tableau. Or check out the vector search docs to learn more about running vector search queries.
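Conceptually, a vector search compares a query embedding against the vectors in the index and returns the chunks behind the closest matches. The Python sketch below shows that idea with in-memory records. It does not use the Data Cloud query API: the record shapes, the chunk_id link between index and chunk records, and the top_k_chunks function are all hypothetical, and the vectors are invented.

```python
import math


def _cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def top_k_chunks(query_vector, index_records, chunk_records, k=2):
    """Return the k (score, chunk text) pairs most similar to the query."""
    # Look up chunk text by the id that links each vector to its chunk.
    text_by_id = {c["chunk_id"]: c["text"] for c in chunk_records}
    scored = [
        (_cosine(query_vector, r["vector"]), text_by_id[r["chunk_id"]])
        for r in index_records
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical stand-ins for chunk records and vector records.
chunk_records = [
    {"chunk_id": "c1", "text": "How to reset your password"},
    {"chunk_id": "c2", "text": "Shipping rates by region"},
    {"chunk_id": "c3", "text": "Recover a locked account"},
]
index_records = [
    {"chunk_id": "c1", "vector": [0.9, 0.1, 0.0]},
    {"chunk_id": "c2", "vector": [0.0, 0.2, 0.9]},
    {"chunk_id": "c3", "vector": [0.7, 0.3, 0.1]},
]

# A made-up query embedding that points toward the account-related chunks.
results = top_k_chunks([1.0, 0.0, 0.0], index_records, chunk_records, k=2)
```

This brute-force comparison is fine for a handful of records; production vector indexes typically use approximate nearest-neighbor techniques to stay fast at scale.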

Connecting unstructured data to Data Cloud allows you to ground search results on a wealth of data for a variety of customer-focused use cases. By chunking and vectorizing that data, you can use vector search in Einstein generative AI applications, Flow Builder, and even Tableau to enhance your AI, analytics, and automation capabilities.
