Recently I developed some statistics logic that queries a MySQL database and collects the results. Every single query is optimized, but because of the logic I need to issue a lot of queries concurrently: roughly 353,820,000 SQL queries against the MySQL database within 24 hours, after which the database is freed up. To handle all of these queries I would need a very large RDS instance, e.g. db.r5.4xlarge, but the problem is that I only need it for about 24 hours each week; it sits idle the rest of the time. Using a reserved RDS instance would waste a lot of money, and that's when I discovered Aurora Serverless.


Amazon Aurora Serverless is an on-demand, auto-scaling configuration for Amazon Aurora (MySQL-compatible and PostgreSQL-compatible editions), where the database will automatically start up, shut down, and scale capacity up or down based on your application's needs. It enables you to run your database in the cloud without managing any database instances. It's a simple, cost-effective option for infrequent, intermittent, or unpredictable workloads.

This is exactly what I need, but keep in mind that there is no silver bullet: Aurora Serverless has its own limitations. At the time of writing this article it only supports MySQL version 5.6.10a, while we are using MySQL v5.7 in production. This means I could not create a read replica from the master node directly; I had to write a script to export the data and import it into Aurora Serverless. I use mysqldump to export only the tables I need for the queries, ignoring unrelated tables to save export/import time.

#export
mysqldump -h prod-mysql-biz-master.vipmind.me -u xxx -pxxxxxx \
	--opt \
	--set-gtid-purged=OFF \
	--skip-triggers --skip-comments \
	--ignore-table=xxxxdb.xxx_table1 \
	--ignore-table=xxxxdb.xxx_table2 \
	--single-transaction \
	--databases xxxxdb > xxxxdb.sql
#import
mysql -h prod-mysql-biz-stats.vipmind.me -u xxxx -pxxx < xxxxdb.sql

After the data is loaded into Aurora Serverless, we can configure how it scales. It has a concept called the Aurora Capacity Unit (ACU), which represents how much capacity is available to the database server. Aurora Serverless autoscales based on the server load, and if there are no connections or activity for a configured period of time, the capacity is scaled down to 0, which means no cost at all for database compute resources; we only pay for database storage.
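To give an idea of how that looks, the ACU range and the auto-pause behavior live in the cluster's scaling configuration. Below is a minimal sketch using the AWS SDK for Java v2; the cluster identifier and the capacity/pause values are just illustrative, not the exact settings I used.

// sketch: configure the ACU range and auto-pause for the serverless cluster
import software.amazon.awssdk.services.rds.RdsClient;
import software.amazon.awssdk.services.rds.model.ModifyDbClusterRequest;
import software.amazon.awssdk.services.rds.model.ScalingConfiguration;

public class ConfigureServerlessScaling {
	public static void main(String[] args) {
		try (RdsClient rds = RdsClient.create()) {
			rds.modifyDBCluster(ModifyDbClusterRequest.builder()
				.dbClusterIdentifier("stats-aurora-serverless") // illustrative cluster name
				.scalingConfiguration(ScalingConfiguration.builder()
					.minCapacity(1)             // smallest ACU while the cluster is quiet
					.maxCapacity(256)           // headroom for the heavy query phase
					.autoPause(true)            // pause (0 ACU) when there is no activity
					.secondsUntilAutoPause(300) // pause after 5 idle minutes
					.build())
				.applyImmediately(true)
				.build());
		}
	}
}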

Since I'm using HikariCP to manage the database connection pool, I need to make sure the pool does not hold any connections when there is no activity, so that Aurora Serverless can scale down to 0 ACU. With HikariCP this is easy to do; the configuration below does what I want:

// do not keep any idle connections in the pool unless they are needed
hikariConfig.setMinimumIdle(0);

// maximum time for a connection to sit idle in the pool - 30 seconds
hikariConfig.setIdleTimeout(30000);
// maximum wait time for a connection from the pool - 5 minutes
hikariConfig.setConnectionTimeout(300000);
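For completeness, these settings sit inside an otherwise normal HikariCP setup pointing at the Aurora Serverless cluster endpoint. A rough sketch is below; the JDBC URL, credentials, and pool size are placeholders, not our real values.

// sketch: full HikariCP pool setup against the Aurora Serverless endpoint
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class StatsDataSource {
	public static HikariDataSource create() {
		HikariConfig hikariConfig = new HikariConfig();
		hikariConfig.setJdbcUrl("jdbc:mysql://stats-aurora-serverless.cluster-xxxx.us-east-1.rds.amazonaws.com:3306/xxxxdb");
		hikariConfig.setUsername("xxxx");
		hikariConfig.setPassword("xxxx");

		hikariConfig.setMaximumPoolSize(100);      // placeholder, sized for the concurrent query workers
		hikariConfig.setMinimumIdle(0);            // pool may shrink to zero connections
		hikariConfig.setIdleTimeout(30000);        // drop idle connections after 30 seconds
		hikariConfig.setConnectionTimeout(300000); // wait up to 5 minutes for a connection

		return new HikariDataSource(hikariConfig);
	}
}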

The end result of this change works very well. I can complete the data flow job within 24 hours with Aurora Serverless scaling up to 64 ACU for most of the run; the peak I've seen is 128 ACU, but only for a short period of time. After the flow job completes, Aurora Serverless simply scales back down to 0 ACU. It really helps me save a lot of cost while still achieving very high analytics performance.

[server_load graph]