Explore Agentforce Guardrails and Trust Patterns
Learning Objectives
After completing this unit, you’ll be able to:
- Describe platform guardrails.
- Describe Agentforce guardrails.
- Explain how to customize Agentforce guardrails.
- Explain the trust patterns used to build agents at Salesforce.
Ensure Trust with Guardrails
AI is moving fast. And with such rapid change, it’s natural to feel some anxiety. At Salesforce, our product team and Office of Ethical and Humane Use (OEHU) recognize that maintaining trust in our products is imperative, and they’re tackling agentic AI risks and concerns by:
- Identifying the necessary controls to build a trusted agent
- Building a testing strategy
- Adding in-product ethical guardrails
- Providing better ethical guidance to our customers
Building these guardrails into our products and providing clear ethical guidance help companies handle AI technology responsibly, keeping it safe and reliable for everyone.
Let’s get into some details, starting with the platform guardrails, which provide global controls across our products.
Platform Guardrails
Salesforce maintains a comprehensive set of policies, guidelines, and protocols designed to ensure the safe, ethical, and compliant operation of the platform. These guardrails include:
- Acceptable Use Policy (AUP): General rules for customer use of Salesforce services, prohibiting activities that could harm the platform or its users.
- AI Acceptable Use Policy (AI AUP): Specific rules for customer use of Salesforce AI technologies, ensuring our products are used in a responsible way.
- Model Containment Policies: Clear rules for how AI models are used, keeping models within defined limits to prevent misuse or unintended effects.
These controls create a framework that maintains the platform's integrity, security, and ethical standards. For example, the AUP states that you can't use the platform to spam or phish. The AI AUP states that AI can’t make legal or important decisions without a human making the final decision. Model containment policies can limit the types of data an AI model can access to help prevent data leakage or misuse.
Agentforce Guardrails
Agentforce guardrails are rules, guidelines, and best practices tailored to a specific Salesforce cloud or product and business use case, and designed to keep agents compliant with local laws and standards. Agentforce includes ethical guardrails to minimize AI hallucinations and security guardrails to prevent threats and malicious attacks, such as prompt injection.
Agent Type
Salesforce provides out-of-the-box agents for specific clouds and common use cases. Different agent types have their own settings and guardrails that define agent behavior. For example, the Agentforce Service Agent (ASA) type uses topic instructions to determine when to escalate a conversation from the AI agent to a human representative. The Sales Development Rep (SDR) agent type has admin-defined engagement rules that determine when the agent can start working on a lead and how and when agent emails can be sent.
Topic, Topic Instructions, and Actions
Each agent includes a set of prebuilt topics and actions.
A topic is a category of actions related to a particular job to be done by an agent. Topics contain actions, which are the tools available for the job, and instructions, which tell the agent how to make decisions. Together, topics define the range of capabilities your agent can handle. Salesforce provides a library of standard topics for common use cases.
Topic instructions set guidelines for agent behavior, providing the context the agent needs to perform its job effectively. Instructions help the agent decide how to use the actions in a topic for different use cases. They’re typically phrased as “Always…”, “Never…”, “If x, then y…”, or “As a first step,…” to ensure clear and consistent behavior.
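For example, a hypothetical order-support topic might include instructions like these (the wording is illustrative; your actual instructions depend on your business rules):
- “Always verify the customer’s identity before sharing order details.”
- “Never provide legal or medical advice.”
- “If a refund request exceeds $500, escalate the conversation to a human representative.”
- “As a first step, summarize the customer’s request and confirm it before taking action.”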
Actions are how agents get things done. Agents include a library of actions, which is a set of jobs an agent can do. For example, if a user asks an agent for help with writing an email, the agent launches an action that drafts and revises the email and grounds it in relevant Salesforce data. Salesforce provides some actions out of the box, and these actions are called standard actions. The benefit of including standard topics and actions by default is that your agent is ready to help users with many common tasks right away.
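To make actions concrete, here’s a minimal sketch of a custom action implemented as an Apex invocable method, one way to give an agent a tool beyond the standard actions. The class name, labels, and drafting logic are hypothetical:

```apex
// Hypothetical custom agent action that drafts a follow-up email for a contact.
// @InvocableMethod exposes this Apex logic so it can be registered as a custom
// agent action; the names and drafting logic are illustrative only.
public with sharing class DraftFollowUpEmailAction {

    public class Request {
        @InvocableVariable(required=true label='Contact Id')
        public Id contactId;
    }

    public class Response {
        @InvocableVariable(label='Draft Email Body')
        public String draftBody;
    }

    @InvocableMethod(label='Draft Follow-Up Email'
        description='Drafts a follow-up email grounded in the contact record.')
    public static List<Response> draft(List<Request> requests) {
        List<Response> responses = new List<Response>();
        for (Request req : requests) {
            // Ground the draft in trusted CRM data rather than model memory.
            Id contactId = req.contactId;
            Contact c = [SELECT Name FROM Contact WHERE Id = :contactId];
            Response res = new Response();
            res.draftBody = 'Hi ' + c.Name
                + ', thanks for your time today. Following up on our conversation...';
            responses.add(res);
        }
        return responses;
    }
}
```

Because an agent can invoke only the actions its topics expose, keeping each action narrowly scoped like this is itself a guardrail: the agent gets exactly the tool it needs and nothing more.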
Customize Guardrails
For more granular control, use agent topic instructions to create boundaries, set context, and define agent behavior. You can modify the instructions for a standard agent topic, or you can create a custom topic from scratch.
These guardrails are controlled by your admin and typically signed off by your internal leadership or key decision-makers. This ensures that the guidelines are authoritative and reflect your organization's values and compliance requirements.
Einstein Trust Layer
AI agents are integrated with the Einstein Trust Layer, which is a secure AI architecture natively built into Salesforce.
Designed for enterprise security standards, the Trust Layer lets you benefit from generative AI without compromising your customer data. It also lets you use trusted data to improve generative AI responses.
- Data grounding: The Trust Layer ensures that generative prompts are grounded in and enriched with trusted company data.
- Zero-data retention: Your data is never retained by a third-party LLM provider.
- Toxicity detection: Potentially harmful LLM responses are detected and flagged.
- AI monitoring: AI interactions are captured in event logs, giving you visibility into the results of each user interaction.
Trust Patterns of Agents
Across our products, we implement several key trust patterns: standard product designs that improve safety. Here are a few examples.
| Trust Pattern | Example |
| --- | --- |
| Reduce hallucinations. | We use topic classification to map user inputs to specific topics, which reduces the risk of an agent generating incorrect or irrelevant information. |
| Limit the frequency of agent-generated emails. | We limit how often agents send emails to prevent overwhelming users and to make sure that communications are meaningful. |
| Respect user privacy. | We include an opt-out feature in the CRM software, allowing users to control how often they receive communications from AI agents. |
| Create transparency by design. | We make sure that AI-generated content is directly and transparently disclosed. |
| Facilitate smooth AI-human handoffs. | We facilitate smooth transitions from agents to humans, such as copying a sales manager on AI-generated emails or providing a dashboard for human oversight. |
Implementation Best Practices
When implementing Agentforce guardrails in your organization, follow these best practices.
| Best Practice | Example |
| --- | --- |
| Understand the policies. | Create a list of the policies applicable to your industry, geography, and use case. Use these to set boundaries for what the agent can and can’t do, and to help determine which topics can be assigned to your agent. |
| Implement robust security measures. | Limit agents’ access to what they need to complete their assigned tasks. Make sure the agents comply with data protection and regulatory requirements. Use topic instructions to set the rules that the agent must follow. |
| Facilitate human oversight. | Set clear guidelines for how and when to hand off to a human representative. Use topic instructions to state these guidelines. |
| Monitor and audit. | In addition to initial testing, continuous monitoring helps make sure agents perform as designed. Use the Audit Trail feature in the Einstein Trust Layer to gain detailed insights into AI actions and outcomes. |
| Respect user privacy. | Use the opt-out feature to allow users to control communication frequency and protect their privacy. |
| Conduct regular assessments. | Regularly conduct bias, explainability, and robustness assessments to monitor ongoing safety and reliability. |