Explore Optimizations on AWS

Learning Objectives

After completing this unit, you will be able to:

  • Define availability.
  • Design a highly available application infrastructure.
  • Explain how to scale an application.
Note

This module was produced in collaboration with Amazon Web Services (AWS), which owns, supports, and maintains the AWS products, services, and features described here. Use of AWS products, services, and features is governed by privacy policies and service agreements maintained by AWS.

Before you complete this module, make sure you complete Monitoring on AWS. The work you do here builds on the concepts you learn there. In Monitoring on AWS, you answered the following questions with Amazon CloudWatch using metrics and dashboards.

  • How will I know if the application is having performance or availability issues?
  • What happens if my Amazon Elastic Compute Cloud (EC2) instance runs out of capacity?
  • Will I be alerted if my application goes down?

There’s one more question you need to answer: What should you do to prevent these availability, capacity, and reachability issues? 

There are different solutions to these problems and that’s what you learn in this module. First, you learn how to maintain the availability of your website, even if an entire Availability Zone becomes unavailable. Then, you learn about the different ways to solve scalability issues.

What Is Availability?

Your cat photo application users expect the app to be available whenever they want to upload or download a photo. The availability of a system is typically expressed as a percentage of uptime in a given year or as a number of nines. Below, you can see a list of the percentages of availability based on the downtime per year, as well as its notation in nines.

Availability (%)                        Downtime (per year)
90% ("one nine")                        36.53 days
99% ("two nines")                       3.65 days
99.9% ("three nines")                   8.77 hours
99.95% ("three and a half nines")       4.38 hours
99.99% ("four nines")                   52.60 minutes
99.995% ("four and a half nines")       26.30 minutes
99.999% ("five nines")                  5.26 minutes
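
The downtime figures in the table above follow directly from the availability percentage. The following is a minimal sketch of that arithmetic, assuming an average year of 365.25 days (an assumption that happens to reproduce the values listed). The function name `downtime_per_year` is illustrative and not part of any AWS tool.

```python
from datetime import timedelta

# Assumption: an average year of 365.25 days, which reproduces the table above.
YEAR = timedelta(days=365.25)


def downtime_per_year(availability_percent: float) -> timedelta:
    """Return the maximum yearly downtime allowed by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    unavailable_fraction = 1 - availability_percent / 100
    return YEAR * unavailable_fraction


if __name__ == "__main__":
    for target in (90, 99, 99.9, 99.95, 99.99, 99.995, 99.999):
        minutes = downtime_per_year(target).total_seconds() / 60
        print(f"{target}% -> {minutes:,.2f} minutes per year")
```

Each additional nine divides the allowed downtime by ten, which is why every step up the table tends to cost noticeably more to achieve.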

To increase availability, you need redundancy. This typically means more infrastructure: more data centers, more servers, more databases, and more replication of data. You can imagine that adding more of this infrastructure means a higher cost. 

This is where you balance between availability and cost. Customers want the application to always be available, but you need to draw a line where adding redundancy is no longer viable in terms of revenue. That’s where a service level agreement can be established.

The number of nines can guide how the system is designed and provides a level of assurance. For example, in the cat photo application, you can share the availability number with users to increase their trust in the platform and reassure them that their cat pictures are safe and available. 

Improve Application Availability

[Figure: A single EC2 instance connected to the primary copy of an Amazon Relational Database Service (RDS) database, which is synchronously replicated to a standby RDS database in another Availability Zone. An Amazon S3 bucket in the same Region stores the cat photos.]

The current cat photo application uses only one EC2 instance to serve the pictures from Amazon Simple Storage Service (S3), with a Multi-AZ Amazon Relational Database Service (RDS) cluster as the database. That single instance is a single point of failure for the application. 

The database is currently highly available because it’s deployed on a Multi-AZ RDS cluster, which replicates the data across Availability Zones. If one copy fails, you can fail over to the second copy. The cat pictures are also highly available because they are stored in Amazon S3 Standard, which replicates them across at least three Availability Zones and is designed to offer four nines of availability. 

However, even if the database and S3 are highly available, customers have no way to connect if the single instance becomes unavailable. When evaluating the availability of an application, you need to include all of its components and determine whether adding resources makes sense budget-wise. One way to solve this single point of failure issue is to add one more server.
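
To see why every component matters, here is a rough sketch of how availabilities combine. It assumes component failures are independent, which real systems only approximate, and every number in it is hypothetical and chosen for illustration only (none is an AWS figure). Components that must all work are multiplied together, while redundant copies fail only if every copy fails.

```python
from math import prod


def serial_availability(*components: float) -> float:
    """Availability of components that must ALL work (a chain)."""
    return prod(components)


def redundant_availability(single: float, copies: int) -> float:
    """Availability of identical copies where ANY one working is enough."""
    return 1 - (1 - single) ** copies


# Hypothetical figures, chosen only for illustration.
instance = 0.99      # one EC2 instance
database = 0.9995    # Multi-AZ database
storage = 0.9999     # object storage

one_instance = serial_availability(instance, database, storage)
two_instances = serial_availability(
    redundant_availability(instance, 2), database, storage
)
print(f"one instance:  {one_instance:.4%}")
print(f"two instances: {two_instances:.4%}")
```

Under these assumptions, the weakest component in the chain caps the overall result, so adding a second instance raises the total more than improving an already strong component would.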

Use a Second Availability Zone

[Figure: Two EC2 instances in separate Availability Zones for high availability.]

The physical location of that server is important. In addition to software issues at the operating system or application level, there can be a hardware issue. The fault could be in the physical server, the rack, the data center, or even the Availability Zone hosting the virtual machine. An easy way to fix the physical location issue is to deploy a second EC2 instance in a different Availability Zone. 

That would also solve issues with the operating system and the application. However, having more than one instance brings new challenges. 

Manage Replication, Redirection, and High Availability

Create a Process for Replication

The first challenge is that you need a process to replicate the configuration files, software patches, and the application itself across instances. The best method is to automate where you can.

Address Customer Redirection

The second challenge is how to let the clients (the computers sending requests to your server) know about the different servers. There are different tools that can be used here. The most common is the Domain Name System (DNS), where the client uses one record that points to the IP addresses of all available servers. However, it takes time to update that list of IP addresses and for clients to become aware of the change. This delay, sometimes called propagation, is typically the reason why this method isn’t always used. 

Another option is to use a load balancer which takes care of health checks and distributing the load across each server. Being between the client and the server, the load balancer avoids propagation time issues. We discuss load balancers later.
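
As a toy illustration of that idea, the sketch below rotates requests across servers and skips any server marked unhealthy. The class `RoundRobinBalancer` and the server names are hypothetical. This is not how any AWS load balancer is implemented. It only shows why clients don’t need to learn about a failed server when a load balancer sits in front.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer: rotates over servers and skips unhealthy ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._rotation = cycle(self.servers)

    def mark_unhealthy(self, server):
        # In a real load balancer, a failed health check would trigger this.
        self.healthy.discard(server)

    def mark_healthy(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Try each server at most once per call.
        for _ in range(len(self.servers)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["instance-a", "instance-b"])
first_two = [balancer.next_server() for _ in range(2)]
balancer.mark_unhealthy("instance-a")
after_failure = [balancer.next_server() for _ in range(3)]
```

Because the client only ever talks to the balancer, removing a failed server takes effect on the very next request instead of waiting for DNS records to expire.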

Understand the Types of High Availability

The last challenge to address when having more than one server is the type of availability you need: either an active-passive or an active-active system. 

  • Active-Passive: With an active-passive system, only one of the two instances is available at a time. One advantage of this method is that for stateful applications where data about the client’s session is stored on the server, there won’t be any issues as the customers are always sent to the same server where their session is stored.
  • Active-Active: Scalability is the weakness of active-passive and the strength of an active-active system. With both servers available, the second server can take some of the load, allowing the entire system to handle more requests. However, if the application is stateful, there is an issue when the customer’s session isn’t available on both servers. Stateless applications work better for active-active systems.
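
The difference between stateful and stateless servers can be sketched in a few lines. In this hypothetical example, a plain dictionary stands in for an external session store shared by all servers. The class names and the choice of store are assumptions made for illustration.

```python
class StatefulServer:
    """Keeps sessions in its own memory, so only this server knows them."""

    def __init__(self):
        self.sessions = {}

    def login(self, user):
        self.sessions[user] = {"logged_in": True}

    def is_logged_in(self, user):
        return user in self.sessions


class StatelessServer:
    """Keeps no sessions itself; every server reads the same shared store."""

    def __init__(self, shared_store):
        self.shared_store = shared_store

    def login(self, user):
        self.shared_store[user] = {"logged_in": True}

    def is_logged_in(self, user):
        return user in self.shared_store


# Stateful: the session created on server A is unknown to server B.
a, b = StatefulServer(), StatefulServer()
a.login("cat-lover")
stateful_result = b.is_logged_in("cat-lover")

# Stateless: a plain dict stands in for an external session store.
store = {}
c, d = StatelessServer(store), StatelessServer(store)
c.login("cat-lover")
stateless_result = d.is_logged_in("cat-lover")
```

With the stateful servers, a request routed to the second server loses the session. With the stateless servers, either one can answer, which is what makes active-active practical.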

Improve Scalability

Availability and reachability are improved by adding one more server. However, the entire system can again become unavailable if there is a capacity issue. Let’s look at that load issue with both types of systems we discussed, active-passive and active-active.

Vertical Scaling

If too many requests are sent to an active-passive system, the active server becomes unavailable and, ideally, traffic fails over to the passive server. But this doesn’t solve anything: the passive server has the same capacity, so the same load overwhelms it too. 

With active-passive, you need vertical scaling, which means increasing the size of the server. With EC2 instances, you select either a larger instance size or a different instance type. This can only be done while the instance is in a stopped state. 

In this scenario, you follow these steps: 

  1. Stop the passive instance. This doesn’t impact the application since it’s not taking any traffic.
  2. Change the instance size or type, then start the instance again.
  3. Shift the traffic to the passive instance, turning it active.
  4. Stop, resize, and start the previously active instance, because both instances should match.
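
The steps above can be sketched with the AWS SDK for Python (boto3). This is a hedged outline, not a production script. It assumes `ec2` is a boto3 EC2 client (for example, `boto3.client("ec2")`). The `shift_traffic` callback is hypothetical, because how traffic is redirected depends on your setup.

```python
def resize_instance(ec2, instance_id: str, new_type: str) -> None:
    """Stop an instance, change its type, and start it again.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # The instance type can only be changed while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


def scale_pair_vertically(ec2, active_id, passive_id, new_type, shift_traffic):
    """Apply the four steps above to an active-passive pair.

    `shift_traffic` is a hypothetical callback that redirects clients
    to the given instance ID.
    """
    resize_instance(ec2, passive_id, new_type)   # steps 1 and 2
    shift_traffic(passive_id)                    # step 3
    resize_instance(ec2, active_id, new_type)    # step 4
```

Even when scripted, the procedure involves stopping each instance in turn, so the pair runs without a standby while each resize is in progress.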

When the number of requests decreases, the same operation needs to be done in reverse. Even though there aren’t that many steps involved, it’s a lot of manual work. Another disadvantage is that a server can only scale vertically up to a certain limit.

Once that limit is reached, the only option is to create another active-passive system and split the requests and functionalities across them. This could require massive application rewriting.

This is where the active-active system can help. When there are too many requests, this system can be scaled horizontally by adding more servers. 

Horizontal Scaling

As mentioned above, for the application to work in an active-active system, it must already be stateless, meaning it doesn’t store any client session on the server. This means that going from two servers to four wouldn’t require any application changes. It would only be a matter of creating more instances when required and shutting them down when the traffic decreases. The Amazon EC2 Auto Scaling service can take care of that task by automatically creating and removing EC2 instances based on metrics from Amazon CloudWatch. 
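
The kind of decision a metric-driven scaler makes can be sketched as follows. This is a simplified illustration, not how Amazon EC2 Auto Scaling is implemented or configured. The function name, the CPU thresholds, and the instance limits are all hypothetical.

```python
def desired_instance_count(
    current: int,
    average_cpu_percent: float,
    scale_out_above: float = 70.0,
    scale_in_below: float = 30.0,
    minimum: int = 2,
    maximum: int = 6,
) -> int:
    """Return how many instances to run given an average CPU metric."""
    if average_cpu_percent > scale_out_above:
        target = current + 1
    elif average_cpu_percent < scale_in_below:
        target = current - 1
    else:
        target = current
    # Never go below the minimum (for availability) or above the maximum (for cost).
    return max(minimum, min(maximum, target))
```

Keeping the minimum at two in this sketch preserves the second instance added earlier for availability, even when traffic is low.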

You can see that an active-active system has many more advantages than an active-passive one. Modifying your application to become stateless enables scalability.

Wrap Up

Availability can be measured in terms of downtime and can then be tied to a service level agreement. To make the cat photo application highly available at the EC2 instance layer, another EC2 instance was added. However, availability isn’t just about failure; it’s also about load. 

This is where it’s important to create your application in a stateless way so it can easily scale horizontally instead of vertically. In the next unit, you learn how the cat photo application can be automatically scaled horizontally with Amazon EC2 Auto Scaling.
