Store and Retrieve Data with AWS

Learning Objectives

After completing this unit, you’ll be able to:

  • Describe the function of AWS storage services.
  • Identify the differences between Amazon S3 Storage Classes.
  • Explain Object Lifecycle Management.
  • Explain the features of Amazon Elastic Block Store.
  • Explain the features of Amazon Elastic File System.

In the previous unit, you learned about AWS Compute services, but what good is compute without data? Cloud storage holds information used by applications.

Imagine you are the technical manager of an enterprise data center. For years, the company has used a legacy system to back up data, archiving everything from employee records to sales history. You discover that the system is at risk of failure. On top of that, its maintenance and ownership costs have become hard to justify.

With AWS storage services, you can store the company’s short-term and archival data securely and save on operational costs by eliminating the need to replace outdated servers.

Get to Know Amazon Simple Storage Service (Amazon S3)

Amazon S3 is object storage for the internet: you can store and retrieve any amount of data, at any time, from anywhere on the web.

Amazon S3 stores data as objects, which contain a file and metadata. Objects are uploaded and stored in buckets. Each object can be up to 5 terabytes in size, and you can store an unlimited number of objects in a bucket. Each bucket is located in an AWS Region you specify. Once your objects are stored in a bucket, you can access them from anywhere on the web using HTTP or HTTPS endpoints.
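
To make the bucket, Region, and object relationship concrete, here is a minimal sketch that builds the HTTPS address of an object in the common virtual-hosted style. The bucket name, Region, and key are hypothetical, and fetching the URL only works if the object is readable by the caller.

```python
from urllib.parse import quote


def object_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style HTTPS URL for an S3 object."""
    # Percent-encode the key but keep "/" so key prefixes stay readable.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


# Hypothetical bucket, Region, and key, for illustration only.
url = object_url("example-backups", "us-west-2", "sales/2019 history.csv")
print(url)
```

The Region appears in the hostname, which is one reason the choice of Region is visible to every client that reads the object.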

Although you can access your objects from anywhere on the web, the bucket’s Region still matters: choose it to optimize latency, minimize cost, or meet regulatory compliance requirements.

Amazon S3 offers a variety of features.

  • Control access to both the bucket and its objects. For example, control who can create, delete, and retrieve buckets or the objects in them.
  • Support Secure Sockets Layer/Transport Layer Security (SSL/TLS) encryption for data in transit and encryption for data at rest.
  • View access logs for the bucket and its objects.
  • Configure an Amazon S3 bucket to host a stand-alone static website.
  • Use versioning to preserve, retrieve, and restore every version of every object stored in your Amazon S3 bucket.

Note: Amazon S3 is appropriate for a wide variety of use cases, including cloud applications, websites, content distribution, mobile and gaming applications, and big data analytics.
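
Access control on a bucket is commonly expressed as a JSON bucket policy. The sketch below builds one that lets a single principal retrieve objects. The bucket name, account ID, user name, and statement ID are hypothetical, and a real policy would likely need more statements.

```python
import json


def read_only_policy(bucket: str, principal_arn: str) -> dict:
    """Return a bucket policy that lets one principal retrieve objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowObjectReads",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": ["s3:GetObject"],
                # The trailing /* targets objects in the bucket, not the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# Hypothetical bucket and IAM user, for illustration only.
policy = read_only_policy(
    "example-backups", "arn:aws:iam::123456789012:user/example-auditor"
)
print(json.dumps(policy, indent=2))
```

Granting only `s3:GetObject` keeps the principal from creating or deleting anything, which matches the "who can retrieve" case in the feature list.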

What Is Amazon S3 Glacier?

Amazon S3 Glacier is a secure, durable, and low-cost cloud storage service for data archiving and long-term backup.

Unlike Amazon S3, data stored in Amazon S3 Glacier has an extended retrieval time ranging from minutes to hours. Retrieving data from Amazon S3 Glacier has a small cost per GB and per request.

Amazon S3 Glacier offers three options for retrieving data with varying access times and cost.

  • Standard retrievals typically complete within 3 to 5 hours.
  • Bulk retrievals are Amazon S3 Glacier’s lowest-cost retrieval option and typically complete within 5 to 12 hours.
  • Expedited retrievals are typically made available within 1 to 5 minutes and can be provisioned in advance to ensure retrieval capacity will be available.
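
The trade-off between the three retrieval options can be sketched as a small helper that picks the cheapest option likely to meet a deadline. The time limits are the typical upper bounds quoted above, not guarantees, and the cost ordering (Bulk, then Standard, then Expedited) is an assumption for illustration.

```python
from typing import Optional

# Typical upper-bound completion times in hours, ordered from lowest to
# highest retrieval cost. The cost ordering is an assumption.
TIERS_BY_COST = [
    ("Bulk", 12.0),
    ("Standard", 5.0),
    ("Expedited", 5 / 60),
]


def choose_tier(deadline_hours: float) -> Optional[str]:
    """Return the cheapest option whose typical time fits the deadline."""
    for name, typical_max_hours in TIERS_BY_COST:
        if typical_max_hours <= deadline_hours:
            return name
    return None  # no option typically completes this quickly


print(choose_tier(24))  # a next-day deadline
print(choose_tier(1))   # data needed within the hour
```

A deadline shorter than a few minutes returns `None`, which reflects the point that archived data is not available for immediate access.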

Use Amazon S3 Glacier if low storage cost is paramount and you don’t require millisecond access to your data.

Use cases include:

  • Media asset workflows
  • Healthcare information archiving
  • Regulatory and compliance archiving
  • Scientific data storage
  • Digital preservation
  • Magnetic tape replacement

Note: The Amazon S3 Glacier Deep Archive storage class is an even more cost-effective way to store important, infrequently accessed data. Data stored in Amazon S3 Glacier Deep Archive can be retrieved within 12 hours.

Understand Object Lifecycle Management and Amazon S3 Storage Classes

With lifecycle configuration rules, you can tell Amazon S3 to automatically transition objects to less expensive storage classes, archive, or delete them after a set period of time. For example, you can set an Amazon S3 bucket to archive objects to Amazon S3 Glacier after 30 days, and then delete them after 5 years.

Diagram showing an object being archived from an Amazon S3 bucket to Amazon S3 Glacier after 30 days and then being deleted after 5 years
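
The 30-day archive and 5-year deletion example can be sketched as a lifecycle configuration built in Python. The rule ID is hypothetical, and 5 years is approximated as 5 × 365 days. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument, but the sketch only builds the configuration and does not call AWS.

```python
def archive_then_delete_rule(archive_after_days: int, delete_after_days: int) -> dict:
    """Build a lifecycle configuration: archive to S3 Glacier, then expire."""
    if delete_after_days <= archive_after_days:
        raise ValueError("objects must be archived before they expire")
    return {
        "Rules": [
            {
                "ID": "archive-then-delete",  # hypothetical rule name
                "Filter": {"Prefix": ""},     # empty prefix: apply to every object
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": delete_after_days},
            }
        ]
    }


# Archive after 30 days, delete after roughly 5 years (leap days ignored).
config = archive_then_delete_rule(30, 5 * 365)
print(config["Rules"][0]["Transitions"], config["Rules"][0]["Expiration"])
```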

Amazon S3 offers a range of storage classes designed for different use cases. Storage classes can be configured at the object level, and a single bucket can contain objects stored across S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA. More on these classes later. Before that, let’s discuss some terms.

Durability is the likelihood that an object you store remains intact over time, without being lost or corrupted. All storage classes are designed for high durability.

All storage classes are designed for 99.999999999% (11 nines) durability of objects over a given year. Most classes achieve this by storing objects redundantly across multiple Availability Zones; Amazon S3 One Zone-Infrequent Access is the exception, because it keeps objects in a single Availability Zone. This durability corresponds to an average annual expected loss of 0.000000001% of objects. For example, if you store 10,000,000 objects in Amazon S3, you can on average expect to lose a single object once every 10,000 years.
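
The arithmetic behind the 10,000-year figure can be checked with a short calculation. The sketch uses exact fractions to avoid floating-point rounding on such small numbers.

```python
from fractions import Fraction

durability_percent = Fraction("99.999999999")
# Fraction of stored objects expected to be lost in one year.
annual_loss_rate = 1 - durability_percent / 100
objects_stored = 10_000_000

expected_losses_per_year = objects_stored * annual_loss_rate
years_per_lost_object = 1 / expected_losses_per_year

print(expected_losses_per_year)  # 1/10000
print(years_per_lost_object)     # 10000
```

An expected loss of one ten-thousandth of an object per year is the same as one object every 10,000 years on average.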

In addition, Amazon S3 Standard, S3 Standard-IA, and S3 Glacier are all designed to sustain data in the event of an entire S3 Availability Zone loss.

Availability is the percentage of time (“uptime”) during which you can retrieve an object at the moment you request it.

S3 Storage Class and Description

Amazon S3 Standard (S3 Standard)

  • General purpose for frequently accessed data
  • Designed for 99.99% availability over a given year

Amazon S3 Intelligent-Tiering (S3 Intelligent-Tiering)

  • Ideal for long-lived data with access patterns that are unknown or unpredictable
  • Automatically moves objects between two access tiers based on changing access patterns
  • Designed for 99.9% availability over a given year
  • Small monthly monitoring and auto-tiering fee

Amazon S3 Standard-Infrequent Access (S3 Standard-IA)

  • For data that is accessed less frequently but requires rapid access when needed
  • Low per GB storage price and per GB retrieval fee
  • Designed for 99.9% availability over a given year

Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA)

  • For data that is infrequently accessed and requires rapid access when needed but does not require the availability and resilience of S3 Standard or S3 Standard-IA
  • Stores data in a single Availability Zone (AZ), so data will be lost if that AZ is destroyed (unlike other S3 storage classes, which store data in at least three AZs)
  • Ideal for backup copies or easily re-creatable data
  • Costs 20% less than S3 Standard-IA
  • Designed for 99.5% availability over a given year

Amazon S3 Glacier

  • Low-cost storage class ideal for data archiving
  • Data is not available for real-time access
  • Three retrieval options ranging from minutes to hours

Amazon S3 Glacier Deep Archive

  • Lowest-cost storage class designed for long-term data retention (7 to 10 years or longer)
  • Retrieval time within 12 hours

Note: In addition to the multi-AZ resilience described earlier, best practices for safeguarding data in Amazon S3 include secure access permissions, Cross-Region Replication, versioning, and a functioning, regularly tested backup.
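
To put the availability figures in perspective, the sketch below converts each design target into the downtime it would allow over a year. It assumes a 365-day year, and the percentages are design targets from the list above rather than measured uptime.

```python
from fractions import Fraction

HOURS_PER_YEAR = 365 * 24  # assumes a 365-day year

# Availability design targets quoted in the storage class list.
AVAILABILITY_PERCENT = {
    "S3 Standard": "99.99",
    "S3 Intelligent-Tiering": "99.9",
    "S3 Standard-IA": "99.9",
    "S3 One Zone-IA": "99.5",
}


def max_downtime_hours(availability_percent: str) -> float:
    """Hours per year an object could be unretrievable at this availability."""
    unavailable_fraction = 1 - Fraction(availability_percent) / 100
    return float(unavailable_fraction * HOURS_PER_YEAR)


for storage_class, percent in AVAILABILITY_PERCENT.items():
    print(f"{storage_class}: up to {max_downtime_hours(percent):.2f} hours per year")
```

The gap between 99.99% and 99.5% looks small as a percentage but spans from under an hour to almost two days of potential unavailability per year.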

Get Persistent Block Storage with Amazon Elastic Block Store (Amazon EBS)

Amazon EBS lets you create persistent block storage volumes and attach them to Amazon EC2 instances. Without an EBS volume attached, any storage local to an EC2 instance (its instance store) is ephemeral: the data lasts only until the instance stops or terminates, or the underlying disk drive fails.

Each Amazon EBS volume is automatically replicated within its Availability Zone to protect you from component failure, offering high availability and durability. The Elastic Volumes feature lets you increase capacity, tune performance, and change the type of an existing volume with no downtime or performance impact. Note that an existing volume can grow but cannot be shrunk.

Different drive types are available, depending upon your specific needs.

Solid State Drives (SSD)

  • Provisioned IOPS SSD (io1, io2, io2 Block Express) volumes
  • General Purpose SSD (gp2, gp3) volumes

Hard Disk Drives (HDD)

  • Throughput Optimized HDD (st1) volumes
  • Cold HDD (sc1) volumes
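
As a rough illustration of how the volume types above are used, the sketch below validates a volume type and assembles the parameters for creating a volume. The key names follow boto3's EC2 `create_volume` call (`AvailabilityZone`, `Size` in GiB, `VolumeType`), but the helper itself is hypothetical and does not call AWS. io2 Block Express is omitted from the lookup on the assumption that it is requested through the `io2` type.

```python
# Volume type identifiers from the lists above, mapped to their family.
VOLUME_TYPES = {
    "io1": "Provisioned IOPS SSD",
    "io2": "Provisioned IOPS SSD",
    "gp2": "General Purpose SSD",
    "gp3": "General Purpose SSD",
    "st1": "Throughput Optimized HDD",
    "sc1": "Cold HDD",
}


def volume_request(availability_zone: str, size_gib: int, volume_type: str) -> dict:
    """Assemble create-volume parameters after basic validation."""
    if volume_type not in VOLUME_TYPES:
        raise ValueError(f"unknown EBS volume type: {volume_type!r}")
    if size_gib <= 0:
        raise ValueError("size must be a positive number of GiB")
    return {
        "AvailabilityZone": availability_zone,
        "Size": size_gib,
        "VolumeType": volume_type,
    }


# Hypothetical Availability Zone and size, for illustration only.
request = volume_request("us-west-2a", 100, "gp3")
print(request, "->", VOLUME_TYPES[request["VolumeType"]])
```

The Availability Zone is part of the request because a volume is replicated within one zone and can attach only to instances in that zone.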

Store Files with Amazon Elastic File System (Amazon EFS)

Amazon EFS provides a simple, scalable, fully managed, elastic file system. It automatically scales as you add or remove files, so you don’t need to worry about provisioning capacity to accommodate growth.

Amazon EFS makes it easy to migrate existing enterprise applications to the AWS Cloud. Use Amazon EFS for analytics, web serving and content management, application development and testing, media and entertainment workflows, database backups, and container storage.
