
Define the Agent Guardrails

Learning Objectives

After completing this unit, you’ll be able to:

  • Describe the guardrails that help ensure the trustworthiness of Agentforce.
  • Identify potential risks associated with your Agentforce project.
  • Define risk mitigation strategies for the project.


The Risks of Autonomous AI

Autonomous AI agents are incredibly powerful tools that can provide value to your organization and enhance the customer experience. But they also come with risks. These risks include security threats, data breaches, reputational harm, financial loss, bias, hallucinations, and issues with transparency and accountability.

Despite the risks, it’s possible to safely deploy autonomous AI in your organization. With proper planning and the help of the Salesforce Platform, you can build and implement an entire suite of trustworthy AI agents.

AI Agents You Can Trust

In Trusted Agentic AI, you learn that one of the standout features of Agentforce is its emphasis on guardrails. These guardrails define the operational boundaries for each agent, outlining what it can and can’t do. Using natural language, you can specify guidelines for the agent’s behavior and how it works.
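For example, a plain-language guideline for a hypothetical customer service agent might read: “Only answer questions related to the customer’s reservations. If a request falls outside that scope, politely explain what you can help with and offer to connect the customer with a human representative.”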

In addition to the AI agent’s guardrails, the Salesforce Platform’s built-in Einstein Trust Layer ensures that the agent’s actions align with your company’s security and compliance standards. It includes mechanisms for harm and toxicity detection, preventing the agents from engaging in inappropriate or harmful activities.

In this unit, follow along with Nora as she works with Coral Cloud’s AI council to identify the risks associated with their autonomous AI use case and develop a plan for addressing those risks.

Don’t Forget About Governance

The Salesforce guardrails for AI agents are powerful and robust, but Nora is aware that not all of Coral Cloud’s guardrails are located in the technology itself. What happens outside of the technology is just as important.

When Coral Cloud developed its AI strategy, the team established a practice of AI governance, which helps them plan a comprehensive risk mitigation strategy for their AI agent.

Here’s an example of AI governance: At Coral Cloud, the AI council requires that all new AI projects undergo a safety review. It’s a business process (not a technology feature) that helps protect the organization from AI risks. Nora schedules the safety review so the company can start thinking more deeply about the project’s guardrails and governance.

Overcome the Objections

In some organizations, it can be tricky to approach the subject of risk because there’s a perception that risk mitigation activities slow down the development process. However, it's critical to address risk upfront or else the AI project can get shut down before it ever makes it into production.

By integrating risk management into the design and prototyping of AI, you can accelerate your projects and ensure they meet the necessary ethical, legal, regulatory, and security requirements. Once you learn how to manage risk for one use case, you can quickly apply those lessons to the next use case, and the next, and the next.

Explaining in less technical terms what an agent is and what it can do for your business can help you build a solid foundation. This foundation can be applied to future projects, ensuring smoother and more successful AI implementations down the road.

How to Frame Conversations About Risk

So how do you approach conversations about risk? We recommend using the People, Business, Technology, and Data framework, which is likely familiar to many organizations. These categories and considerations can help you come up with possible risks and concerns related to your Agentforce project.

People

  • Empowerment: Roles and responsibilities, hiring, training, and upskilling
  • Culture and practice: Human-centered AI design, change management, adoption

Business

  • Value: Benefits, objectives, KPIs, and metrics
  • Operations: Org structure, capability management, processes and workflows, AI governance, DevOps strategy

Technology

  • AI tooling: AI infrastructure, applications, APIs, prompts, security safeguards
  • AI models: Model selection, training considerations, management, cost

Data

  • Quality: Fit for use, accuracy, completeness, accessibility, recency, and more
  • Strategy: Data management, infrastructure, governance, analytics

An infographic showing the four quadrants of the People, Business, Technology, and Data framework.

Identify Risks and Concerns

Nora uses this framework to discuss risks and concerns related to Coral Cloud’s reservation management use case. Stakeholders from the Coral Cloud AI council identify risks and concerns for each category. Note that this list isn’t exhaustive, and every use case involves its own unique risks and concerns.

People

  • Rejection: Customers don’t want to talk to the agent because they don’t trust it or because they’re unsure if they’re allowed to use AI.
  • Abuse: Customers are hostile to the agent or try to manipulate it.
  • Culture: Fears about the potential impact of AI on service jobs affect employee morale.

Business

  • Fit: The agent’s scope doesn’t fit properly into the business organization or team processes.
  • Reporting: Current team KPIs are invalidated by the introduction of an AI agent to do some of the work.
  • Incentives: Compensation and reward structures are impacted when some work is redirected to the agent.
  • Operations: The process for escalation is unclear, inefficient, or frustrating.
  • Agent performance: Applicable company policies don’t correctly influence the AI agent’s responses.

Technology

  • Accuracy: Hallucinations degrade the quality of responses, or the underlying knowledge is incomplete.
  • Reliability: Variability in the agent’s generated responses is too broad.
  • Audit: Technology operations can’t track the accuracy of agent responses.
  • Latency: The agent can’t respond in a timely manner.

Data

  • Access: Data permissions aren’t understood or enforced; data might be exposed to customers.
  • Privacy: The required data can’t be used according to the privacy policy.
  • Compliance: It’s unclear whether any customer contractual constraints apply to the data; for example, data might not be allowed to leave the customer’s business country.
  • Fit for purpose: Data isn’t aligned with the agent’s objective, or data rights aren’t aligned to the use case.
  • Ethics: Bias in model data could generate inappropriate responses.

In Nora’s case, you can see how limitations in her company’s knowledge articles could shape her adoption strategy. But Coral Cloud could also experience many of the risks outlined. After all, if an agent isn’t following the resort’s policies in its responses or human employees have no way to track how helpful agents are, it won’t be easy to deliver a five-star experience.

Define Risk Mitigation Strategies

Now that the Coral Cloud AI council has cataloged the risks and concerns, Nora and her team can brainstorm mitigation strategies for each risk. As they come up with potential guardrails, they categorize each guardrail to designate whether it’s related to people, business, technology, or data.

Here are examples of potential guardrails for two of the risks that Coral Cloud has identified.

Risk category: People

Risk: Customer rejection. Users don’t want to talk to the agent because they don’t trust it.

Potential guardrails:

  • People guardrail: Create a communication strategy and conduct education briefings for customers.
  • Technology guardrail: Design the agent to be transparent about the fact that it’s AI.
  • Technology guardrail: Configure a welcome message for the agent that sets the right expectations about its capabilities and how it can assist.

Risk category: Business

Risk: Escalation issues. Handoffs from the agent to service reps are inconsistent, inefficient, or frustrating to customers.

Potential guardrails:

  • Business guardrail: Define the criteria and context for escalation from AI to service reps.
  • Technology guardrail: Configure Agentforce so that a summary of the agent’s prior interaction is handed off to the service rep.
  • Technology guardrail: In the agent’s instructions, clearly describe any keywords, language, or requests that should trigger escalation (see the sample instruction after this list).
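For example, an escalation instruction for the agent (hypothetical wording, not Coral Cloud’s actual configuration) might read: “If the customer asks to speak with a person, mentions a billing dispute, or expresses frustration with your answers, escalate the conversation to a service rep and include a summary of the interaction so far.”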

Nora’s already got a plan for narrowing the scope of her reservation management implementation. But she can also take steps to set the right expectations for customers who ask an agent about the business. In this case, that might mean adding a disclaimer to the agent’s welcome message stating that it’s designed to answer reservation-related questions, along with a pointer to the best place to find information about other services.
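As an illustration, a welcome message along these lines (sample wording only) sets those expectations: “Hi, I’m Coral Cloud’s AI-powered virtual agent. I can help you book, change, or cancel a reservation. For other resort services, just ask and I’ll point you to the right team.”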

Document and Experiment

When Nora and the AI council are done with the risk mitigation exercise, they document the risks and guardrails for their use case. Capturing Coral Cloud’s risk mitigation activities is important for regulatory compliance and useful for internal audits.

Keep in mind that iteration is just as important as documentation. To make sure your technology guardrails are effective, dive into your sandbox environment and try configuring the safeguards in Agentforce. Get hands-on and test how the guardrails perform in different scenarios. That approach helps you identify any gaps or issues early on and make the necessary adjustments. By combining documentation with practical experimentation, you can develop risk mitigation strategies for your AI agent.

With a preliminary governance plan in place, Nora is ready to move on to another essential component of the project: describing the work that Coral Cloud’s agent will do.

