Manage Databases with AWS
Learning Objectives
After completing this unit, you’ll be able to:
- Describe the function of databases with AWS.
- Identify and explain the benefits of Amazon RDS, Amazon Aurora, Amazon Neptune, Amazon Redshift, and Amazon DocumentDB.
- Describe Amazon ElastiCache and its benefits.
- Explain the differences between Memcached and Redis.
Imagine your company has grown from a small startup to a large-scale organization with many product offerings. To keep up with demand, you need a scalable infrastructure that is easy to administer and cost-effective. With AWS database services, you can support your growth while maintaining flexibility by paying only for what you use.
Customize Your Database with AWS Database Services
Databases allow you to store, manage, and query structured data. Although you can host database software on Amazon EC2, AWS offers a variety of purpose-built database services that are easy to set up, manage, and maintain.
AWS database services provide several advantages.
- Pick the best database to solve specific problems.
- Start small and scale as your applications grow.
- Pay as little as 1/10th the cost of commercial databases.
- Stop worrying about server provisioning, patching, setup, configuration, backups, or recovery of your database.
- Benefit from the high availability, reliability, and security that fully managed AWS database services offer.
Choose from Relational and Key-Value Databases
The most common types of databases are relational (SQL) and key-value or non-relational (NoSQL).
SQL databases store data in rows and columns, like a spreadsheet. Each row holds all the information about one entry, and each column is an attribute of that entry. A SQL database schema is fixed: columns must be defined before data is entered. Changing the schema later means altering the database as a whole, which can require taking it offline.
Data is queried using Structured Query Language (SQL), which allows for complex queries. SQL databases scale vertically by increasing hardware power. Relational databases are commonly used for traditional applications, ERP, CRM, and ecommerce.
NoSQL databases store data using one of many storage models, including key-value pairs, documents, and graphs. NoSQL schemas are dynamic, and information can be added rapidly. Each row doesn’t have to contain data for each column.
Data in NoSQL databases is queried by focusing on collections of documents. NoSQL databases scale horizontally by adding servers. Key-value databases are commonly used for Internet-scale applications, real-time bidding, shopping carts, and customer preferences.
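To make the contrast concrete, here is a minimal sketch using only Python's standard library. The in-memory SQLite database stands in for a relational engine, and a plain dictionary stands in for a key-value store. The table name, keys, and attributes are hypothetical, and neither stand-in is an AWS service.

```python
import sqlite3

# Relational: the schema is declared up front, and every row has the same columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Grace", "New York")],
)

# SQL describes the question; the engine works out how to answer it.
sql_result = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("London",)
).fetchall()

# Key-value / document: each item is fetched by its key and can carry its own attributes.
kv_store = {}
kv_store["customer#1"] = {"name": "Ada", "city": "London"}
kv_store["customer#2"] = {"name": "Grace", "loyalty_tier": "gold"}  # no "city", one extra field

kv_result = kv_store["customer#2"]
```

Notice that the second item in the dictionary skips `city` and adds `loyalty_tier` without any schema change, while the SQL table would need a new column first.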
| | Relational (SQL) | Key-value (NoSQL) |
|---|---|---|
| Data storage | Rows and columns | Key-value, document, graph |
| Schemas | Fixed | Dynamic |
| Querying | Using SQL | Focused on collection of documents |
| Scalability | Vertical | Horizontal |
Meet Amazon Relational Database Service (Amazon RDS)
Amazon RDS is available on several database instance types—optimized for memory, performance, or input/output (I/O)—and provides you with six familiar database engines to choose from.
- Amazon Aurora
- PostgreSQL
- MySQL
- MariaDB
- Oracle
- Microsoft SQL Server
You can use the AWS Database Migration Service to easily migrate or replicate your existing databases to Amazon RDS.
Meet Amazon Aurora
Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database for any enterprise application that can use a relational database.
Amazon Aurora offers a broad array of features.
- Enterprise-class relational database.
- MySQL- or PostgreSQL-compatible.
- Up to 5X faster than standard MySQL databases.
- Up to 3X faster than standard PostgreSQL databases.
- A distributed, fault-tolerant, self-healing storage system.
- Automatically scales up to 64 TB per database instance.
- Nearly continuous backup to Amazon S3.
- Replication across three Availability Zones.
- Up to 15 low-latency read replicas.
- Point-in-time recovery.
Meet Amazon DynamoDB
Amazon DynamoDB is a fast and flexible non-relational database service for all applications that need consistent, single-digit millisecond latency at any scale. It supports both document and key-value store models and has several additional features.
- Fully managed
- Low-latency queries
- Fine-grained access control
- Regional and global options
Amazon DynamoDB is ideal for many use cases, including serverless web applications, microservices data store, mobile backends, gaming, and more.
Explore Other Database Services
| Database type | Use cases | AWS service |
|---|---|---|
| In-memory | Caching, session management, gaming leaderboards, geospatial applications | Amazon ElastiCache |
| Document | Content management, catalogs, user profiles | Amazon DocumentDB |
| Wide column | High scale industrial apps for equipment maintenance, fleet management, and route optimization | Amazon Managed Apache Cassandra Service |
| Graph | Fraud detection, social networking, recommendation engines | Amazon Neptune |
| Time series | IoT applications, DevOps, industrial telemetry | Amazon Timestream |
| Ledger | Systems of record, supply chain, registrations, banking transactions | Amazon Quantum Ledger Database |
What Is Amazon ElastiCache?
Amazon ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory data store or cache in the cloud. The service improves application performance by enabling developers to retrieve information from fast, managed, in-memory data stores instead of relying on slower disk-based databases.
When a read request is sent, the caching layer checks to determine whether it has the answer. If it doesn’t, the request is sent to the database. Answering read requests through the caching layer is more efficient and delivers higher performance than a traditional database alone. It’s also more cost-effective.
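The read path described above can be sketched in a few lines of Python. This is an illustration only: the dictionaries stand in for the cache and the database, the `query_database` and `get` names are hypothetical, and storing the answer in the cache after a miss is an assumption about how the caching layer is populated.

```python
import time

# Hypothetical stand-in for a disk-based database.
DATABASE = {"user:1": "Ada", "user:2": "Grace"}
stats = {"db_reads": 0}

def query_database(key):
    """Simulate a slower database read and count how often it happens."""
    stats["db_reads"] += 1
    time.sleep(0.01)  # pretend latency
    return DATABASE.get(key)

cache = {}  # stand-in for an in-memory cache

def get(key):
    """Check the cache first; on a miss, read the database and keep the answer."""
    if key in cache:
        return cache[key]
    value = query_database(key)
    if value is not None:
        cache[key] = value
    return value

first = get("user:1")   # cache miss: the request goes to the database
second = get("user:1")  # cache hit: the database is not touched
```

After both calls, `stats["db_reads"]` is 1, because the second read is answered from memory.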
Amazon ElastiCache supports two open-source in-memory engines.
- Memcached
- Redis
Amazon ElastiCache for Memcached
The primary use case for Memcached is caching; it is easy to use and scale. ElastiCache is protocol-compliant with Memcached, so tools used with existing Memcached environments work seamlessly with ElastiCache.
Memcached is well-suited for caching relatively small and static data, where the primary concern is fast read performance.
Amazon ElastiCache for Redis
Redis is an in-memory NoSQL data store that supports persistence, availability, and scripting. It comes with a set of in-memory data structures that make it easy to create a variety of custom applications.
Redis is often used for:
- Caching
- Session management
- Pub/sub
- Leaderboards
Because of its speed, ease of use, and proven performance, Redis is a popular choice for:
- Web
- Mobile
- Gaming
- Adtech
- Internet of Things (IoT) applications
Redis has a broader set of features than Memcached and performs well for both reads and writes.
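To see why richer data structures matter for a use case like leaderboards, consider this toy sketch. It is plain Python, not a Redis client: the `Leaderboard` class is a hypothetical stand-in that mimics the idea of a structure that keeps members ranked by score, which a plain key-to-value cache doesn't give you on its own.

```python
class Leaderboard:
    """Toy ranked structure for illustration; not a Redis or ElastiCache API."""

    def __init__(self):
        self._scores = {}

    def add_points(self, player, points):
        # Increase a player's score, creating the entry if it is new.
        self._scores[player] = self._scores.get(player, 0) + points
        return self._scores[player]

    def top(self, n):
        # Highest score first; ties are broken by player name for a stable order.
        ranked = sorted(self._scores.items(), key=lambda item: (-item[1], item[0]))
        return ranked[:n]

board = Leaderboard()
board.add_points("ada", 120)
board.add_points("grace", 200)
board.add_points("linus", 90)
board.add_points("ada", 100)  # ada's total is now 220

top_two = board.top(2)
```

With a simple cache, the application would have to fetch every score and sort them itself. A data store with built-in ranked structures can do that work on the server side.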
What Is Amazon Neptune?
Amazon Neptune is a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. Its purpose-built, high-performance graph database engine can store billions of relationships and query the graph with millisecond latency.
Amazon Neptune supports popular graph models, such as Property Graph and W3C’s RDF, and their respective query languages, Apache TinkerPop Gremlin and SPARQL. Easily build queries that efficiently navigate highly connected datasets.
Amazon Neptune is:
- Highly available, with read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across Availability Zones.
- Secure with support for encryption at rest.
- Fully managed, so you don’t need to worry about database management tasks such as hardware provisioning, software patching, setup, configuration, or backups.
Use cases include:
- Social networking
- Recommendation engines
- Fraud detection
- Knowledge graphs
- Life sciences
- Network/IT operations
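To get a feel for what a query over a highly connected dataset looks like, here is a small sketch of a friends-of-friends recommendation, one of the use cases listed above. It uses a plain Python dictionary as the graph, not Neptune or a graph query language, and the names and the `recommend` function are hypothetical.

```python
from collections import Counter

# Hypothetical social graph: each person maps to the set of their friends.
friends = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "dev", "eli"},
    "cara": {"ana", "dev"},
    "dev": {"ben", "cara"},
    "eli": {"ben"},
}

def recommend(person):
    """Rank friends-of-friends by how many mutual friends they share with person."""
    direct = friends[person]
    counts = Counter()
    for friend in direct:
        for candidate in friends[friend]:
            # Skip the person themselves and anyone they already know.
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    # Most mutual friends first; names break ties.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

suggestions = recommend("ana")
```

Each recommendation requires hopping across two relationships. A graph database is built to make this kind of traversal efficient as the number of relationships grows.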
Meet Amazon Redshift
Amazon Redshift is a fast, scalable data warehouse that makes it simple and cost-effective to analyze all your data across your data warehouse and data lake. Amazon Redshift delivers up to 10 times faster performance than other data warehouses by using machine learning, massively parallel query execution, and columnar storage on high-performance disks.
You can set up and deploy a new data warehouse in minutes. You can also run queries across petabytes of data in your Amazon Redshift data warehouse and exabytes of data in your data lake built on Amazon S3.
Amazon Redshift is an online analytical processing (OLAP) system, as opposed to Amazon RDS databases, which are online transaction processing (OLTP) systems.
- OLTP databases usually process a large number of small transactions and are often used to provide source data to data warehouses.
- OLAP systems usually process a small number of complex queries that help analyze data.
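The difference between the two workload shapes can be illustrated with Python's built-in SQLite module. This is only a sketch of the query patterns: SQLite is neither Amazon RDS nor Amazon Redshift, and the `orders` table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP-style work: many small writes, each committed as its own transaction.
new_orders = [("east", 20.0), ("west", 35.0), ("east", 15.0), ("west", 5.0)]
for region, amount in new_orders:
    with conn:  # commits on success, rolls back if an error is raised
        conn.execute(
            "INSERT INTO orders (region, amount) VALUES (?, ?)", (region, amount)
        )

# OLAP-style work: one query that scans and aggregates many rows.
report = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
```

In practice, the transactional rows would be loaded from the OLTP database into the data warehouse, and the analytical query would run there instead of on the system that serves customers.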
Meet Amazon DocumentDB
Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB (a cross-platform, document-oriented, NoSQL database) workloads.
Amazon DocumentDB is designed to give you the performance, scalability, and availability you need when operating mission-critical MongoDB workloads at scale. Amazon DocumentDB implements the Apache 2.0 open source MongoDB 3.6 API by emulating the responses that a MongoDB client expects from a MongoDB server. This capability enables you to use your existing MongoDB drivers and tools with Amazon DocumentDB.
Resources
- Whitepaper: Database
- External Site: Amazon Relational Database Service (RDS)
- External Site: Amazon Aurora
- External Site: Amazon DynamoDB
- External Site: Amazon ElastiCache
- External Site: Amazon Neptune
- External Site: Amazon Redshift
- External Site: What is a data lake?
- External Site: Amazon DocumentDB (with MongoDB compatibility)