Wednesday, February 6, 2013

Large Reporting Jobs with Amazon RDS Read Replicas (MySQL)

  I recently came across a project where a monthly report was overwhelming a small multi-AZ MySQL RDS instance. For 29 days of the month the small instance was sufficient, but running the monthly report would not only take days to complete, but it would bog down the entire website while it was running.

  First, a little background. The website in question is a public internet facing website. It consists of two small web server instances behind an Elastic Load Balancer, a small multi-AZ RDS instance, and a small instance used for running administrative processes and reports.

   In a non-cloud environment the only answer would be to either upgrade the database server, or add a second database server dedicated for reporting jobs. Luckily this website was running in AWS.

  With AWS we could easily upgrade the database to a larger instance to support the production load and the monthly report, but the extra capacity provided by the larger instance would be expensive and go unused for all but one day a month. We could manually create a read replica, allowing the reporting job to run against another small instance, but that the job would still take days to complete, and again we'd be paying for unused resources for most of the month. 

  The answer we came up with was simple - we modified the reporting job to take advantage of the AWS API and launch a huge read replica of the database. The job flow went like this:
  1. Launch High Memory Double Extra Large read replica of the RDS instance*
  2. Once the read replica was online execute our monthly report
  3. Destroy read replica once the job completes**
  4. Email report results to interested parties
  This solution is the best of both worlds - The monthly report gets to take advantage of the resources available in the larger database instance to complete in hours instead of days, and the customer only pays for the resources they actually need. Instead of paying several hundred dollars per month to upgrade the multi-AZ instance to support both workloads, the customer ended up paying about $10 a month for better results.

  The moral of the story is that while it's totally possible to treat cloud resources just like traditional servers in a data center, you're usually doing yourself a disservice by doing so. By taking advantage of the extra flexibility and features built into most cloud platforms you can not only save money, but also provide an all around better experience for yourself and your end users.