Skip to main content

Promote Responsible and Ethical Agents

Learning Objectives

After completing this unit, you’ll be able to:

  • Implement ethical red-teaming and testing strategies.
  • Develop guiding principles and standards for your organization.

Guiding Principles for Responsible Agents

Many organizations adopting AI find it helpful to establish responsible AI principles before developing their AI strategy. With a set of AI principles, businesses can clarify their position on AI and consider the technology’s impact on employees, customers, and society. You can use our guidelines as inspiration for developing your own based on business needs and use cases. Think about what safety means for your use case. Do you have specific laws, rules, and regulations for your industry that can require specific safety requirements?

As a refresher, and to get you started establishing your own guiding principles, here’s the Salesforce guiding principle for developing trusted AI.

  • Accuracy
  • Safety
  • Honesty
  • Empowerment
  • Sustainability

Red-Teaming and Testing Strategies

A red team is a group of experts, usually security and AI ethics experts, who try to get into an organization's systems to find and fix security and other problems that involve undesirable outputs or outcomes.

Red-teaming can be defined as “a structured process for probing AI systems and products for the identification of harmful capabilities, outputs, or infrastructural threats.”

Three hands holding magnifying glasses focus on a warning sign with an exclamation mark inside a triangle.

Testing is a key aspect of ensuring safety and preventing unintended consequences. ‌Here are some key things to think about.

  • Understand what harms you want to test for. Set the goals and objectives for testing and align them with your business goals and use case.
  • Build the team to perform these tests. You can use both internal and external experts who are well-versed and have experience in adversarial thinking and creating attack strategies to test.
  • Test periodically to make sure you're keeping up with the evolving technology and adversarial thinking around AI and agents.

Here’s how we approach red-teaming at Salesforce. Salesforce uses both manual and automated red-teaming methods to make our AI products safer. We test for bad use, intentional integrity attacks like quick injections, or accidental misuse. We conduct ‌AI red-teaming for toxicity, bias, and security to make sure that if any malicious use or benign misuse occurs, our systems are safe.

Type of Testing

Description

Manual

Manual testing uses the creativity, experience, and specialized knowledge of human testers to craft complex attack strategies that automated systems can miss. Human testers can also adapt their approach to the specific environment, target, and goals, making their attacks more realistic and tailored.

Automated

Automated testing is used as an enhancement, not a replacement for human-driven testing and evaluation. This type of testing uses scripts, algorithms, and software tools to simulate many attacks or threats in a short time. It also explores the risk surface of the system by looking at the amount of risk.

We engage with external and internal experts to perform penetration tests and address the unique risks and use cases of agents.

To get a more comprehensive overview, check out our responsible red-teaming blog.

Model Benchmarking

By comparing our AI models against industry standards, we can make sure that they perform at the highest level. We made this even better by publishing the first LLM Benchmarks for CRM. These benchmarks share important measures that help us understand how well an AI system works and also inform our customers.

The Future of Ethical Testing

The testing, evaluation, and assessment team at Salesforce is dedicated to ensuring the trust and safety of our AI products. Through rigorous testing processes, proactive red-teaming, and comprehensive benchmarking, we strive to maintain the highest standards of AI integrity. By fostering a culture of continuous improvement and innovation, we're committed to delivering AI solutions that our customers can trust.

AI Acceptable Use Policy

Salesforce has published an AI Acceptable Use Policy (AI AUP) to align with industry standards and our partners, and to protect our customers. You can learn more by reviewing our AI Acceptable Use Policy.

The Salesforce AI AUP is central to our business strategy, which is why we took the time to consult with our Ethical Use Advisory Council subcommittee, partners, industry leaders, and developers before its release. In doing so, we aim to entrust responsible innovation and protect the people who trust our products as they are developed. The Salesforce AI AUP is just a starting point, focusing on use of AI with Salesforce products. Think about making your own AI rules or principles to make sure your company uses AI in a way that respects your company’s ethical values.

Agent Security Standards

Consider these security measures to develop security standards for access control, data protection, and responsible use of agents in your organization.

Category

Type

Recommendation

Access Control

Strict Access Controls

Implement appropriate access controls to ensure that only individuals with a need to know and business requirements are authorized to interact with generative AI models and services.

When designing agents, comprehensively identify the agent’s entire scope and potential actions to determine appropriate execution contexts. For critical actions, consider running agents within individual service user contexts to implement granular access controls and minimize potential security risks.

Monitoring and Auditing

Create alerts and regularly monitor and audit access to generative AI models and services to detect and prevent unauthorized use.

Data Protection

Integrity Controls

Add integrity controls for both internal and customer data. Follow the right rules for application security, backup and restore, and basic configurations.

Responsible Use

Customer Data Handling

Take steps to handle Customer Personal Data correctly. Make sure it is only collected and used for legitimate reasons and that data subjects are given the right notice and consent.

Customer Transparency

Ensure services don’t perform inferences invisible to your customer.

Content Moderation

Provide a content moderation filter over generative AI services, and enable it by default where available.

Ethical Use

Establish guidelines for the ethical use of generative AI to ensure it’s used in a manner that respects privacy and security.

From Theory to Practice

We covered a lot of ground about trusted agentic AI in this module and how Salesforce develops trusted agentic AI. Now you understand the key risks associated with agentic AI, such as unexpected behavior, bias, and data breaches. You also learned about the specific guardrails and trust patterns that make sure AI agents operate within safe and ethical parameters. You understand the importance of fostering responsible AI practices in your own organization with ethical red-teaming, testing, and the establishment of an AI Acceptable Use Policy.

With this knowledge, you’re well on your way to creating AI agents that aren't only effective but also trustworthy and responsible!

Resources

Share your Trailhead feedback over on Salesforce Help.

We'd love to hear about your experience with Trailhead - you can now access the new feedback form anytime from the Salesforce Help site.

Learn More Continue to Share Feedback