Process Content for Use with AI

Learning Objectives

After completing this unit, you’ll be able to:

  • Identify the role of Document AI in extracting key details from unstructured data.
  • Describe how unstructured data enhances context for Agentforce and other AI applications.
  • Describe how organizations can use Document AI to act on unstructured knowledge.

Before You Start

This badge is part of the Data Cloud: Explore Setup to Activation trail. The trail is designed to give you hands-on experience with the core functionalities of Data Cloud.

In this badge, you learn how Data Cloud processes content, particularly unstructured content. Structured data, such as account details or transaction history, has long been the backbone of CRM systems. However, unstructured data, such as articles, chat transcripts, case notes, and email, represents a vast knowledge base now available to organizations for use with AI applications to deepen context and improve customer engagement.

Agentforce and Data Cloud

Imagine a support agent asking Agentforce, “What troubleshooting steps should I suggest for this customer’s error code?” Instead of scanning dozens of knowledge articles or case notes, Agentforce retrieves the right content from Data Cloud, grounds the AI response with that verified knowledge, and instantly generates a clear, contextual answer. The result is faster case resolution, less manual searching, and more accurate customer support.
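
The grounding step in this scenario can be pictured as assembling a prompt from the retrieved passages before the model answers. The following is a minimal sketch, not Agentforce's actual prompt format: the function name `build_grounded_prompt`, the instruction wording, and the sample passages are all hypothetical and used only for illustration.

```python
def build_grounded_prompt(question, passages):
    """Combine retrieved passages and a user question into one prompt.

    Hypothetical helper for illustration only; it does not call any
    Salesforce API and does not reflect a real prompt template.
    """
    if not passages:
        raise ValueError("at least one retrieved passage is required for grounding")
    # Number each passage so an answer can point back to its source.
    sources = "\n".join(
        f"[{i}] {passage.strip()}" for i, passage in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question.strip()}\n"
    )


if __name__ == "__main__":
    # Sample passages standing in for content retrieved from a search index.
    retrieved = [
        "Error E42 indicates outdated router firmware.",
        "To update firmware, open the admin console and select Update.",
    ]
    print(build_grounded_prompt("How do I resolve error E42?", retrieved))
```

Refusing to build a prompt without passages is a deliberate choice in this sketch: it keeps an ungrounded question from reaching the model by accident.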

Unstructured Data in Data Cloud: Lay the Foundation for Agentforce

Unstructured data refers to information that doesn’t follow a consistent format and so can’t be easily placed into relational database tables, fields, or spreadsheets. This type of data comes in many forms: text, images, audio, video, email, chat transcripts, PDFs, knowledge articles, case notes, legal documents, and more. While it’s harder to organize, it’s rich with context, such as customer feedback, sentiment, and tone, that structured data simply cannot capture.

Unstructured data requires a different approach than structured data. With Data Cloud, you can ingest content through prebuilt connectors and create search index configurations that convert content like text, images, or tables into chunks and numerical vectors. Data Cloud indexes these chunks and vectors for rapid search and retrieval by AI applications, analytics dashboards, and workflow automations.

Document AI is one tool for processing unstructured (or semi-structured) data. Using Document AI, you can extract key details from PDFs, invoices, contracts, and reports into data lake objects (DLOs), which represent schemas of the extracted data.

Essentially, Document AI identifies fields such as customer name, dates, addresses, and dollar amounts and converts these into structured, searchable data in Data Cloud. This makes the content ready for use in AI prompts, analytics, and automation. For example, a Service Cloud admin for a hotel chain could use Document AI to extract guest feedback from survey forms, and then assign an AI agent to create customer records and follow-ups for future opportunities based on that extracted data.
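
As a rough picture of what "converting fields into structured data" means, the sketch below pulls three fields out of a plain-text invoice and returns them as a typed record. It is only an illustration under stated assumptions: it uses regular expressions rather than an AI model, it does not use any Document AI API, and the `InvoiceRecord` type, its field names, and the `Customer:` / `Date:` / `Total:` labels are hypothetical.

```python
import re
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class InvoiceRecord:
    """Hypothetical target schema for the extracted fields."""
    customer_name: Optional[str]
    invoice_date: Optional[str]
    total_amount: Optional[float]


def _first_match(pattern, text):
    """Return the first captured group for pattern, or None if absent."""
    match = re.search(pattern, text, flags=re.IGNORECASE | re.MULTILINE)
    return match.group(1).strip() if match else None


def extract_invoice(text):
    """Extract labeled fields from invoice text into an InvoiceRecord."""
    name = _first_match(r"^Customer:[ \t]*(.+)$", text)
    date = _first_match(r"^Date:[ \t]*(\d{4}-\d{2}-\d{2})[ \t]*$", text)
    amount = _first_match(r"^Total:[ \t]*\$?([\d,]+\.\d{2})[ \t]*$", text)
    # Strip thousands separators before converting the amount to a number.
    total = float(amount.replace(",", "")) if amount else None
    return InvoiceRecord(customer_name=name, invoice_date=date, total_amount=total)


if __name__ == "__main__":
    sample = "Invoice\nCustomer: Ada Lovelace\nDate: 2024-03-15\nTotal: $1,250.00\n"
    print(asdict(extract_invoice(sample)))
```

Missing fields come back as `None` instead of raising an error, so a downstream step can decide whether an incomplete record is still worth storing.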

Use Case: Bring It Together

Get Cloudy Consulting is a global consulting firm that relies on thousands of proposals, research reports, and client presentations stored across regions and practices to do business. A consultant preparing for a client meeting needs quick insights into past recommendations for digital transformation in the retail sector.

Since the consultant’s team previously used Document AI to extract frameworks, case studies, and benchmarks from unstructured decks and reports, the consultant can simply query Agentforce and retrieve the most relevant recommendations and supporting data points for the client.

Instead of combing through folders or relying on memory, the consultant gets accurate, context-rich insights instantly. Now they’re ready to deliver a personalized, high-impact client presentation.

By transforming unstructured content into structured knowledge, Get Cloudy Consulting empowers its teams to deliver faster, more personalized, and more trustworthy experiences.

In this unit, you learned how Document AI makes unstructured data usable in Data Cloud. By extracting and structuring insights from diverse formats, Data Cloud equips AI applications like Agentforce with trustworthy information that supports faster, more accurate responses. Next, explore how this foundation powers retrieval-augmented generation (RAG) to ground AI outputs in verifiable data.
