Use Batch Apex
Learning Objectives
After completing this unit, you’ll know:
- Where to use batch Apex.
- The higher Apex limits when using batch Apex.
- Batch Apex syntax.
- Batch Apex best practices.
Follow Along with Trail Together
Want to follow along with an instructor as you work through this step? Take a look at this video, which is part of the Trail Together series on Trailhead Live.
(This clip starts at the 31:26 minute mark, in case you want to rewind and watch the beginning of the step again.)
Batch Apex
Batch Apex is used to run large jobs (think thousands or millions of records!) that would exceed normal processing limits. With batch Apex, you can process records asynchronously in batches (hence the name, “batch Apex”) to stay within platform limits. If you have a lot of records to process, such as for data cleansing or archiving, batch Apex is probably your best solution.
Here’s how batch Apex works under the hood. Let’s say you want to process 1 million records using batch Apex. The execution logic of the batch class is called one time for each batch of records that you process. Each time that you invoke a batch class, the job is placed on the Apex job queue and is executed as a discrete transaction. This functionality has two awesome advantages:
- Every transaction starts with a new set of governor limits, making it easier to ensure that your code stays within the governor execution limits.
- If one batch fails to process successfully, all other successful batch transactions aren’t rolled back.
Batch Apex Syntax
To write a batch Apex class, your class must implement the Database.Batchable interface and include the following three methods:
start()
This method is used to collect the records or objects to be passed to the execute() method for processing. The start() method is called once at the beginning of a batch Apex job and returns either a Database.QueryLocator object or an Iterable that contains the records or objects passed to the job.
Most of the time a QueryLocator does the trick with a simple SOQL query to generate the scope of objects in the batch job. But if you need to do something crazy like loop through the results of an API call or pre-process records before being passed to the execute method, see Custom Iterators.
With the QueryLocator object, the governor limit for the total number of records retrieved by SOQL queries is bypassed and you can query up to 50 million records. However, with an Iterable, the governor limit for the total number of records retrieved by SOQL queries is still enforced.
execute()
This method performs the actual processing for each chunk or “batch” of data passed to the method. The default batch size is 200 records. Batches of records are not guaranteed to execute in the order they are received from the start() method.
The execute() method takes the following parameters:
- A reference to the
Database.BatchableContextobject. - A list of sObjects, such as
List<sObject>, or a list of parameterized types. If you are using aDatabase.QueryLocator, use the returned list.
finish()
This method is used to execute post-processing operations (for example, sending an email) and is called once after all batches are processed.
Here’s what the skeleton of a batch Apex class looks like:
public class MyBatchClass implements Database.Batchable<sObject> {
public (Database.QueryLocator | Iterable<sObject>) start(Database.BatchableContext bc) {
// collect the batches of records or objects to be passed to execute
}
public void execute(Database.BatchableContext bc, List<P> records){
// process each batch of records
}
public void finish(Database.BatchableContext bc){
// execute any post-processing operations
}
}Invoking a Batch Class
To invoke a batch class, simply instantiate it and then call Database.executeBatch with the instance:
MyBatchClass myBatchObject = new MyBatchClass(); Id batchId = Database.executeBatch(myBatchObject);
You can also optionally pass a second scope parameter to specify the number of records that should be passed into the execute method for each batch. Pro tip: Limit this batch size if you are running into governor limits.
Id batchId = Database.executeBatch(myBatchObject, 100);
Each batch Apex invocation creates an AsyncApexJob record so that you can track the job’s progress. You can view the progress via SOQL or manage your job in the Apex job queue. We’ll talk about the job queue soon.
AsyncApexJob job = [SELECT Id, Status, JobItemsProcessed, TotalJobItems, NumberOfErrors FROM AsyncApexJob WHERE ID = :batchId ];
Using State in Batch Apex
Batch Apex is typically stateless. Each execution of a batch Apex job is considered a discrete transaction. For example, a batch Apex job that contains 1,000 records and uses the default batch size is considered five transactions of 200 records each.
If you specify Database.Stateful in the class definition, you can maintain state across all transactions. When using Database.Stateful, only instance member variables retain their values between transactions. Maintaining state is useful for counting or summarizing records as they’re processed. In the next example, we update contact records in our batch job. We want to keep track of the total records affected so that we can include it in the notification email.
Sample Batch Apex Code
Now that you know how to write a batch Apex class, let’s see a practical example. Let’s say you have a business requirement that states that all contacts for companies in the USA must have their parent company’s billing address as their mailing address. Unfortunately, users are entering new contacts without the correct addresses! Write a batch Apex class that ensures that this requirement is enforced.
The following sample class finds all account records that are passed in by the start() method using a QueryLocator and updates the associated contacts with their account’s mailing address. Finally, it sends an email with the results of the bulk job and, since we are using Database.Stateful to track state, the number of records updated.
public class UpdateContactAddresses implements
Database.Batchable<sObject>, Database.Stateful {
// instance member to retain state across transactions
public Integer recordsProcessed = 0;
public Database.QueryLocator start(Database.BatchableContext bc) {
return Database.getQueryLocator(
'SELECT ID, BillingStreet, BillingCity, BillingState, ' +
'BillingPostalCode, (SELECT ID, MailingStreet, MailingCity, ' +
'MailingState, MailingPostalCode FROM Contacts) FROM Account ' +
'Where BillingCountry = \'USA\''
);
}
public void execute(Database.BatchableContext bc, List<Account> scope){
// process each batch of records
List<Contact> contacts = new List<Contact>();
for (Account account : scope) {
for (Contact contact : account.contacts) {
contact.MailingStreet = account.BillingStreet;
contact.MailingCity = account.BillingCity;
contact.MailingState = account.BillingState;
contact.MailingPostalCode = account.BillingPostalCode;
// add contact to list to be updated
contacts.add(contact);
// increment the instance member counter
recordsProcessed = recordsProcessed + 1;
}
}
update contacts;
}
public void finish(Database.BatchableContext bc){
System.debug(recordsProcessed + ' records processed. Shazam!');
AsyncApexJob job = [SELECT Id, Status, NumberOfErrors,
JobItemsProcessed,
TotalJobItems, CreatedBy.Email
FROM AsyncApexJob
WHERE Id = :bc.getJobId()];
// call some utility to send email
EmailUtils.sendMessage(job, recordsProcessed);
}
}Here’s what’s going on in more detail:
- The
start()method provides the collection of all records that theexecute()method will process in individual batches. It returns the list of records to be processed by callingDatabase.getQueryLocatorwith a SOQL query. In this case we are simply querying for all Account records with a Billing Country of ‘USA’. - Each batch of 200 records is passed in the second parameter of the
execute()method. Theexecute()method sets each contact’s mailing address to the accounts’ billing address and incrementsrecordsProcessedto track the number of records processed. - When the job is complete, the finish method performs a query on the
AsyncApexJobobject (a table that lists information about batch jobs) to get the status of the job, the submitter’s email address, and some other information. It then sends a notification email to the job submitter that includes the job info and number of contacts updated.
Testing Batch Apex
Because Apex development and testing go hand in hand, here’s how we test the above batch class. In a nutshell, we insert some test records, call the batch Apex class, and then assert that the records were updated properly with the correct address.
@isTest
private class UpdateContactAddressesTest {
@testSetup
static void setup() {
List<Account> accounts = new List<Account>();
List<Contact> contacts = new List<Contact>();
// insert 10 accounts
for (Integer i=0;i<10;i++) {
accounts.add(new Account(name='Account '+i,
billingcity='New York', billingcountry='USA'));
}
insert accounts;
// find the account just inserted. add contact for each
for (Account account : [select id from account]) {
contacts.add(new Contact(firstname='first',
lastname='last', accountId=account.id));
}
insert contacts;
}
@isTest static void test() {
Test.startTest();
UpdateContactAddresses uca = new UpdateContactAddresses();
Id batchId = Database.executeBatch(uca);
Test.stopTest();
// after the testing stops, assert records were updated properly
System.assertEquals(10, [select count() from contact where MailingCity = 'New York']);
}
}The setup() method is annotated with @testSetup. This annotation allows you to create common test records that are available for all the methods in the class. Any updates to the test records are rolled back after each test method finishes executing. Although this test class includes only one method (besides the setup() method itself), we included the @testSetup annotation here as an example. In this case, the setup() method inserts 10 account records with the billing city of ‘New York’ and the billing country of ‘USA’. Then for each account, it creates an associated contact record. This data is used by the batch class.
In the test() method, the UpdateContactAddresses batch class is instantiated. Then, this instance of the batch class is passed to Database.executeBatch. The call to Database.executeBatch is included within the Test.startTest and Test.stopTest block, where all of the magic happens. The job executes after the call to Test.stopTest. Any asynchronous code included within Test.startTest and Test.stopTest is executed synchronously after Test.stopTest.
Finally, the test verifies that all contact records were updated correctly by checking that the number of contact records with the mailing city of ‘New York’ matches the number of records inserted (i.e., 10).
Best Practices
As with future methods, there are a few things to keep in mind when using batch Apex. To ensure fast execution of batch jobs, minimize web service callout times and tune queries used in your batch Apex code. The longer the batch job executes, the more likely that other queued jobs are delayed when many jobs are in the queue.
Best practices include:
- Only use batch Apex if you have more than one batch of records. If you don’t have enough records to run more than one batch, you are probably better off using queueable Apex.
- Tune any SOQL query to gather the records to execute as quickly as possible.
- Minimize the number of asynchronous requests created to minimize the chance of delays.
- Use extreme care if you are planning to invoke a batch job from a trigger. You must be able to guarantee that the trigger won’t add more batch jobs than the limit.
Resources
- Apex Developer Guide: Batch Apex
- Apex Developer Guide: Using Batch Apex
- Apex Developer Guide: Custom Iterators
- Apex Developer Guide: TestSetup Annotation
