Much of the world runs on relational databases. If you want to run one in AWS, there are two popular choices: Amazon RDS and Amazon Aurora. They’re both managed services, where you pay Amazon to manage and administer your database. They both let you spin up databases with a few clicks in the console. They both scale to dizzying heights, with terabytes of storage per database. These similarities can make it hard to tell them apart.

Which begs the question: When should you use RDS and when should you use Aurora?

Four approaches to database management

If you run a database in your own data center, you’ve probably hired database administrators. Your DBAs run databases on servers that you own. They install software, apply maintenance and security patches, make regular backups, and so on. It’s a labor-intensive process, but without the cloud, it’s how you do things.

As a first step toward the cloud, you can run a database on EC2. This is like running a database in your data center but with EC2 instances instead of servers. This is less and less common, but it’s useful if you need very fine-grained control over your database.

Amazon created RDS to reduce the management overhead of running a database in EC2. You click a few buttons, and you get a database ready to store data. Amazon handles all the fiddly bits: provisioning instances, replication and backups, maintenance and updates. It runs a variety of database engines (including MySQL, PostgreSQL, MariaDB, and Oracle), so it works with your existing application code. If you have databases you want to migrate to the cloud, RDS is a great start. (If you are migrating, talk to your AWS account manager first. Amazon wants to help you move your on-prem databases into RDS, and your account manager may be able to provide credits or professional services to help you along the way. They also have the Database Migration Service, an underrated tool that does exactly what it says.)

But RDS still looks like a database running in a data center. What if you didn’t have to imitate an existing architecture?

Amazon created Aurora to be a cloud-first database. Externally, it behaves like any other RDS database. It’s API-compatible with MySQL and PostgreSQL, and it’s meant to be a drop-in replacement. Amazon still handles all the fiddly work of managing the database, but under the hood, it’s quite different. Rather than running the entire database on a fleet of EC2 instances, Aurora splits the compute and storage into different pieces. Storage is handled by a custom data layer, designed to take advantage of Amazon’s cloud infrastructure. This “secret sauce” has a number of benefits, so let’s dive in a bit deeper.

Aurora’s cloud-first architecture

Although Aurora is closed-source, Amazon is pretty open about how it works. You can find a variety of re:Invent sessions, documentation pages, and white papers about the Aurora architecture; if you’re interested, do read further! Here, I’m just going to present a high-level overview.

In a traditional database cluster, you have one or more nodes (servers or EC2 instances): a read-write primary/writer (W), and read-only replicas (R). If you write to the primary node, that write gets synchronously replicated to the replica nodes. Different database engines use different consensus protocols for replication to ensure all the nodes have a consistent view of the data.

Each node is responsible for both compute and storage, so the entire database is contained in these nodes. If you want more durability or scale, you need to add more replicas, and that comes at a cost. More replicas means more replication traffic, and at some point network I/O becomes a bottleneck. It also takes time to modify the cluster, because new nodes have to replicate all the existing data before they can serve queries. To keep replication manageable, RDS limits you to five replicas.

In an Aurora cluster, there are different nodes for compute and storage.

Data is stored in a shared “cluster volume”, which spans six storage nodes and three availability zones. This gives you multi-AZ resilience, and it uses the same “pay for what you use” model as S3. Rather than provisioning storage upfront, the volume grows or shrinks to match your data, up to 128TB, and your bill grows or shrinks to match. Queries are handled by compute nodes, which all talk to the same cluster volume. You have the same primary/replica split, but replication is handled entirely within the storage nodes.
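
To make the storage billing difference described above a little more concrete, here is a rough Python sketch that compares paying for storage provisioned upfront with paying only for what is stored each month. Everything in it is an assumption for illustration: the price per GB, the function names, and the sample workload are all made up, and real AWS pricing has more moving parts (I/O, backups, instance hours) that this ignores.

```python
# Hypothetical price, chosen only to make the arithmetic easy to follow.
# It is NOT a real AWS rate.
PRICE_PER_GB_MONTH = 0.10

# The 128TB ceiling on an Aurora cluster volume, treated here as
# 128 * 1024 GB for simplicity.
AURORA_MAX_VOLUME_GB = 128 * 1024


def provisioned_cost(provisioned_gb, months):
    """Storage provisioned upfront: pay for the full allocation every month."""
    return provisioned_gb * PRICE_PER_GB_MONTH * months


def usage_based_cost(used_gb_per_month):
    """Volume that grows and shrinks with the data: pay for what is stored."""
    for used in used_gb_per_month:
        if used > AURORA_MAX_VOLUME_GB:
            raise ValueError("data exceeds the maximum volume size")
    return sum(used * PRICE_PER_GB_MONTH for used in used_gb_per_month)


# Hypothetical workload: data grows for three months, then shrinks
# after a cleanup.
usage = [200, 400, 800, 300]

# With upfront provisioning you have to size for the peak.
upfront = provisioned_cost(max(usage), len(usage))
pay_as_you_go = usage_based_cost(usage)

print(f"provisioned for peak: ${upfront:.2f}")
print(f"pay for what you use: ${pay_as_you_go:.2f}")
```

In this made-up workload, the usage-based bill is lower because you are not paying for the peak allocation during the quieter months. If your data size is flat, the two models come out the same.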