Prepare Your Data for Classification Apps

Learning Objectives

After completing this unit, you’ll be able to:

  • Prepare your data for Einstein classification apps.
  • Identify considerations for the classification apps.

Gather and Review Closed Case Data

For each classification app, Einstein builds a predictive model based on your closed-case data. Einstein Case Classification learns from field data, and Einstein Case Wrap-Up learns from chat transcripts.


Because Einstein Case Wrap-Up depends on Chat, Maria turns on Chat for her org. Because Ursa Major’s initial closed cases lack chat conversations, the accuracy of Einstein’s recommendations will improve as Ursa Major service agents close cases that include chat conversations.


For Einstein to accurately predict new case field values, someone at Ursa Major Solar needs to verify that field values on closed cases are accurate. Reviewing closed cases for accuracy takes time—a lot of time. But the more time spent auditing field values on closed cases, the more accurate field predictions will be. 


As an admin—not a support agent or manager—Maria has to ask for help. She needs a service expert to verify and confirm that Ursa Major Solar’s closed cases contain clean data. She turns to Ryan De Lyon, a friendly customer service manager located in the Phoenix office.    

[Image: Maria and Ryan De Lyon chatting in the office.]

Maria fills Ryan in on her data needs and he’s happy to help. Auditing existing closed-case data is an investment in his service team’s productivity—the more effort he puts in, the better the predictions. With Maria’s guidance, he lays out the audit process. 

  1. Identify the most useful case fields to predict. Exclude fields that change over the life of a case, such as Case Status. For now, Ryan chooses Priority and Case Reason, because predicting their values saves agents time and doesn’t require their attention.
  2. Exclude from the audit any closed cases that don’t have Priority and Case Reason values.
  3. Export 1,000 closed cases from Salesforce to a spreadsheet or CSV file for a faster review of data.
  4. Review the following fields on the exported cases: Subject, Description, Priority, Case Reason, and any field that will be used in a filter to narrow the scope of the predictive model—for example, Type.
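
Steps 2 through 4 of the audit can be sketched in a few lines of Python run against the exported file. This is only an illustration under stated assumptions: the column names, the helper `select_cases_for_audit`, and the inline sample data are hypothetical, and a real export may label its columns differently.

```python
import csv
import io

# Hypothetical column names; a real export may label them differently.
REQUIRED_FIELDS = ("Priority", "Case Reason")
REVIEW_FIELDS = ("Subject", "Description", "Priority", "Case Reason", "Type")


def select_cases_for_audit(rows, limit=1000):
    """Keep closed cases that have every required field, up to `limit` rows."""
    selected = []
    for row in rows:
        # Step 2: skip cases that lack a Priority or Case Reason value.
        if all((row.get(field) or "").strip() for field in REQUIRED_FIELDS):
            # Step 4: keep only the columns that will be reviewed.
            selected.append({field: row.get(field) or "" for field in REVIEW_FIELDS})
        # Step 3: stop once the review sample is large enough.
        if len(selected) == limit:
            break
    return selected


# Tiny inline sample standing in for an exported CSV file.
sample_export = io.StringIO(
    "Subject,Description,Priority,Case Reason,Type\n"
    "Panel cracked,Arrived damaged,High,Return,Hardware\n"
    "Login help,Cannot sign in,,Account,Software\n"
    "Late delivery,Order not received,Medium,,Logistics\n"
)
audit_rows = select_cases_for_audit(csv.DictReader(sample_export))
print(len(audit_rows), "case(s) ready for review")
```

In this sample, only the first case has both a Priority and a Case Reason, so it is the only one passed on for review.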

As Ryan audits closed cases, he also looks for data design issues to fix.

Verify That You Have the Right Data

Because the accuracy of closed-case data is crucial to building an effective predictive model, it’s important to make sure that the fields you want to predict are populated and correct. Even when they’re based on a large volume of data, predictions might be inaccurate if the data contains incomplete or incorrect values.


Because Ursa Major Solar is training a predictive model from the text in cases, that text must contain the right information—words or phrases—to classify cases correctly. While auditing closed cases, Ryan keeps these things in mind.

  • Adjust past data so that the model is built with the best, most accurate data.
  • Make sure that both customers and support agents include distinct information to classify cases. If humans can’t classify cases, neither can Einstein.
  • If you have fields with similar names or field values, consider unifying them into one field or value for clarity. For example, if values for Case Reason include Return, Return Issue, Return Slip, and Return Tracking, combine them. If support agents have trouble determining the correct field or value, the predictive model will have trouble determining the best recommendation.
  • If you have overloaded field values, consider separating them into distinct values. Change a catch-all value into a more specific set of values. For example, change a value like Returns to Defect Returns, Gift Returns, and Refund Returns.
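
As a rough sketch of the unifying step, the near-duplicate Case Reason values from the example above can be mapped onto one value in an exported spreadsheet before the corrections are applied to the cases. The mapping `REASON_ALIASES` and the helper `unify_case_reason` are hypothetical; the right grouping depends on your own data and on what your support agents find unambiguous.

```python
from collections import Counter

# Hypothetical mapping for illustration; choose groupings that fit your own data.
REASON_ALIASES = {
    "Return Issue": "Return",
    "Return Slip": "Return",
    "Return Tracking": "Return",
}


def unify_case_reason(value):
    """Trim stray whitespace and map near-duplicate values to one value."""
    cleaned = value.strip()
    return REASON_ALIASES.get(cleaned, cleaned)


reasons = ["Return", "Return Issue", " Return Slip", "Return Tracking", "Installation"]
unified = [unify_case_reason(reason) for reason in reasons]

# Counting values before and after shows how the categories consolidate.
counts = Counter(unified)
print(counts)
```

Counting the values after the mapping is a quick way to confirm that the similar values really did collapse into one.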

Get the Best Results

Beyond auditing closed-case data and verifying that you have the right data, keep a few more things in mind to get the best results from your predictive model.

Case Volume

To accurately predict values on case fields, Einstein needs lots of closed cases to learn from. “Closed cases” means all closed cases that were created in the past 6 months. Encrypted fields can’t be used to build your predictive model, so the case title and description must not be encrypted.

  • To build a predictive model, Einstein needs at least 400 closed cases, but 1,000 or more is ideal. If you add filter criteria to limit which cases Einstein learns from, Einstein counts only the number of closed cases that match your criteria.
  • To predict a field’s value, Einstein needs at least 400 closed cases with a value in that field.
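
Before building a model, the thresholds above can be checked against an export. The following is a minimal sketch, not how Einstein itself counts cases: the function `check_case_volume`, the `created` key, and the 183-day approximation of 6 months are all assumptions, and every record in the sample is assumed to be a closed case that already matches any filter criteria.

```python
from datetime import date, timedelta

MIN_CLOSED_CASES = 400     # minimum stated in this unit
IDEAL_CLOSED_CASES = 1000  # ideal volume stated in this unit
LOOKBACK_DAYS = 183        # assumption: rough stand-in for "the past 6 months"


def check_case_volume(cases, field, today):
    """Count recent closed cases and those with a value in `field`."""
    cutoff = today - timedelta(days=LOOKBACK_DAYS)
    recent = [case for case in cases if case["created"] >= cutoff]
    with_value = [case for case in recent if (case.get(field) or "").strip()]
    return {
        "closed_cases": len(recent),
        "with_field_value": len(with_value),
        "can_build_model": len(recent) >= MIN_CLOSED_CASES,
        "can_predict_field": len(with_value) >= MIN_CLOSED_CASES,
        "ideal_volume": len(recent) >= IDEAL_CLOSED_CASES,
    }


# Made-up sample: 350 recent cases with a Priority, 100 without, 200 too old.
today = date(2024, 6, 30)
cases = (
    [{"created": date(2024, 5, 1), "Priority": "High"} for _ in range(350)]
    + [{"created": date(2024, 5, 1), "Priority": ""} for _ in range(100)]
    + [{"created": date(2023, 1, 1), "Priority": "Low"} for _ in range(200)]
)
report = check_case_volume(cases, "Priority", today)
print(report)
```

In this sample, 450 recent closed cases are enough to build a model, but only 350 of them have a Priority value, which falls short of the 400 needed to predict that field.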

Ethics and Artificial Intelligence

Maria also wants to consider the ethical implications of using artificial intelligence with Ursa Major’s existing datasets. Before she enables Einstein Case Classification, she asks herself these questions. 

  • Are you aware of any dataset biases?
    Measurement bias occurs when data is oversimplified or incorrectly labeled or categorized. It can be introduced when a person makes a mistake labeling data, or through machine error. A characteristic, factor, or group can also be over- or underrepresented in your dataset.
  • Are you including diverse participants in designing the dataset?
    Before someone starts building a system, they often make assumptions about what they should build, who they should build for, and how it should work, including what kind of data to collect from whom. This doesn’t mean that the creators of a system have bad intentions, but as humans, we can’t always understand everyone else’s experiences or predict how a system will impact others.

As Ryan begins reviewing and adjusting closed-case data to help build the best predictive model possible, Maria gets ready to implement Einstein classification apps.
