AWS DynamoDB, the internet-scale NoSQL database with built-in high availability and scalability, features powerful horizontal scaling capabilities. It also has a unique pricing model, so optimizing its cost works differently than for other AWS services. Because that optimization is tied to DynamoDB's architecture, there are a few points to take care of while auto-scaling AWS DynamoDB.
A NoSQL database is now a common component of the technology stack of ecommerce apps, which require high scalability. Amazon's DynamoDB, built by leveraging best practices from amazon.com, offers ultra-fast access rates. The reason: the data in its tables is stored on solid-state drives (SSDs) managed internally by the platform itself. DynamoDB stores tables internally as partitions of storage blocks, with automatic data replication across multiple availability zones.
Auto-scaling AWS DynamoDB: How to go about it?
To optimize and auto-scale AWS DynamoDB, users need a thorough understanding of its internal data-partitioning strategy. Here's a rule of thumb for computing the approximate number of partitions:
Approximate number of internal DynamoDB partitions = (R + 3W) / 3000

R = provisioned read IOPS for a table
W = provisioned write IOPS for a table
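The rule of thumb above can be sketched in a few lines of Python. This is only an estimate of behavior that is internal to DynamoDB; the 10 GB-per-partition size limit (mentioned further below) is also folded in, since a large table can force extra splits regardless of throughput.

```python
import math

def approx_partition_count(read_iops, write_iops, table_size_gb=0.0):
    """Estimate internal DynamoDB partitions.

    Reads count once and writes count three times against a roughly
    3000-unit-per-partition throughput budget; a partition also holds
    about 10 GB, so table size can force additional splits.
    """
    by_throughput = (read_iops + 3 * write_iops) / 3000
    by_size = table_size_gb / 10  # ~10 GB per partition
    return max(1, math.ceil(max(by_throughput, by_size)))

print(approx_partition_count(6000, 1000))                    # (6000 + 3000) / 3000 = 3
print(approx_partition_count(100, 100, table_size_gb=35))    # size-bound: 4 partitions
```

Knowing the approximate partition count matters because, as discussed next, provisioned throughput is divided across partitions.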
Data tables are automatically split into multiple partitions when the data crosses the partition block size. Tables should be designed so that the table's partition key, which determines which partition a data element lands in, distributes key values uniformly across all partitions. It is the architect's responsibility to ensure the generated partition keys are spread evenly across the range between the highest and lowest values. If the keys frequently map to a narrow range, one partition receives far more accesses than the others, and the table's cost is driven by the access rate within that single hot partition rather than the average access rate across partitions.
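The hot-partition effect is easy to see in a small simulation. DynamoDB's actual hash function is internal, so MD5 here merely stands in for hash-based placement; the point is how uniform versus skewed key choices spread requests.

```python
import hashlib
from collections import Counter

def partition_of(key, num_partitions):
    """Stand-in for DynamoDB's internal hash-based placement:
    hash the partition key and map it to a partition slot."""
    digest = int(hashlib.md5(str(key).encode()).hexdigest(), 16)
    return digest % num_partitions

# Uniformly distributed keys (e.g. unique order IDs) spread load evenly...
uniform = Counter(partition_of(k, 4) for k in range(10_000))
# ...while a workload fixated on one hot key hammers a single partition.
skewed = Counter(partition_of("hot-product-id", 4) for _ in range(10_000))

print(sorted(uniform.values()))  # roughly equal counts per partition
print(skewed)                    # all 10,000 requests land on one partition
```

In the skewed case you would pay for enough provisioned throughput to satisfy the hot partition, even though the other partitions sit idle.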
Along with a basic understanding of the partition model, users should also ensure the following conditions are met before attempting to set up auto-scaling for DynamoDB:
- the read (R) and write (W) access patterns over time are uniform
- the partition keys distribute values uniformly
- the average partition size is < 10 GB
If the above conditions are met, users can confidently try auto-scaling their DynamoDB database. Here are five tips to help you set up auto-scaling for AWS DynamoDB.
- Analyze Data Traffic Patterns to Predict and Adjust Limits
Analytics on data traffic can be as simple as a moving average of the data volume over a period of time. Check whether traffic spikes are above or below the moving average. If a spike is above it, increase the provisioned throughput parameters; if traffic stays below it, you can confidently lower them.
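A minimal sketch of that moving-average check is below. In practice you would feed it from CloudWatch's consumed-capacity metrics for the table; the data source is left abstract here, and the class name and window size are illustrative.

```python
from collections import deque

class TrafficAnalyzer:
    """Track a moving average of consumed capacity and flag spikes."""

    def __init__(self, window=12):
        # Keep only the most recent `window` samples.
        self.samples = deque(maxlen=window)

    def observe(self, consumed_units):
        self.samples.append(consumed_units)

    def moving_average(self):
        return sum(self.samples) / len(self.samples)

    def should_raise_limit(self, latest):
        # A spike above the moving average suggests raising throughput.
        return latest > self.moving_average()

analyzer = TrafficAnalyzer(window=5)
for units in [100, 110, 95, 105, 100]:
    analyzer.observe(units)

print(analyzer.moving_average())         # 102.0
print(analyzer.should_raise_limit(160))  # True: spike above the average
```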
- Standard-Deviation Factors as Risk Mitigation
One important factor to consider is risk management. Since the app will throw errors when the provisioned throughput limits are exceeded, we have to account for unpredictable surges in data traffic. The risk measure to adopt here is a standard technique from financial trading algorithms: add an extra quantity equal to one or two standard deviations of the traffic to the mean traffic value before setting up the thresholds. This standard-deviation buffer absorbs unexpected data traffic and avoids errors.
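The mean-plus-k-standard-deviations rule translates directly into code. The sample values and the choice of k below are illustrative; the technique itself is exactly what the paragraph above describes.

```python
import statistics

def provisioned_threshold(traffic_samples, k=2):
    """Set the throughput threshold at mean + k standard deviations,
    so an unpredictable surge does not immediately hit the limit.
    k = 1 or 2 mirrors the financial-risk technique described above."""
    mean = statistics.mean(traffic_samples)
    stdev = statistics.stdev(traffic_samples)
    return mean + k * stdev

samples = [100, 120, 90, 110, 105, 95]
print(round(provisioned_threshold(samples, k=2), 1))  # mean 103.3 + 2 stdev
```

The wider the historical variance, the larger the buffer, which is the behavior you want: volatile traffic earns more headroom.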
- Ensure Uniform Key Distribution and Adjust Threshold Limits after Partitions are Created
At the design level, the architect should ensure that the partition keys generate values that are evenly distributed, without skewing toward any particular value in the key range.
- Cache Popular Items in Memory
The architect should also identify the most frequently accessed key-value pairs and move them to an in-memory cache outside DynamoDB, which can reduce cost drastically.
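In production this caching role is usually played by a managed service such as ElastiCache or DAX; the in-process LRU sketch below just illustrates the cost-saving idea of serving hot reads from memory instead of consuming table read capacity. All names here are hypothetical.

```python
from collections import OrderedDict

class HotItemCache:
    """A tiny LRU cache for frequently read key-value pairs."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key, fetch_from_table):
        if key in self.items:
            self.items.move_to_end(key)    # hit: mark as recently used
            return self.items[key]
        value = fetch_from_table(key)      # miss: one real table read
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False) # evict least recently used
        return value

reads = []  # record every time we actually touch the table
cache = HotItemCache(capacity=2)
fetch = lambda k: reads.append(k) or f"item-{k}"  # stand-in for a DynamoDB read

cache.get("sku-1", fetch)
cache.get("sku-1", fetch)  # served from memory, no table read
print(reads)               # ['sku-1'] -> only one billable read
```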
- Scale Up Faster and Scale Down Slower
While setting up auto-scaling, it is important to understand that AWS restricts scale-downs to just four per day. Hence we should scale up quickly when a threshold is breached, but scale down slowly, only after three or four additional lower-level thresholds are breached, to ensure that we do not exhaust the limit of four scale-downs per day.
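The asymmetric policy above can be sketched as a small decision function. The utilization percentages, scaling factors, and breach count are illustrative choices, not AWS-mandated values; the shape of the logic is what matters.

```python
def next_provisioned_units(current, consumed, high_pct=0.8, low_pct=0.3,
                           low_breaches=0, required_breaches=3):
    """Scale up immediately on a high-threshold breach, but scale down
    only after several consecutive low-threshold breaches, to conserve
    the limited number of daily scale-downs.

    Returns (new_units, updated_low_breach_count)."""
    utilization = consumed / current
    if utilization > high_pct:
        return int(current * 1.5), 0       # scale up fast
    if utilization < low_pct:
        low_breaches += 1
        if low_breaches >= required_breaches:
            return int(current * 0.8), 0   # scale down, slowly earned
        return current, low_breaches       # wait for more evidence
    return current, 0                      # normal range: reset counter

units, breaches = next_provisioned_units(1000, 900)  # 90% > 80%
print(units)  # 1500: scaled up on the first breach
for consumed in [200, 200, 200]:                     # three low breaches
    units, breaches = next_provisioned_units(units, consumed,
                                             low_breaches=breaches)
print(units)  # 1200: scaled down only after repeated low usage
```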
To Wrap Up:
AWS DynamoDB is a critical component for any large enterprise, especially an ecommerce player, because the product catalog is the most accessed information resource in the system. Managing it optimally, without overpaying, is a task for DevOps. If you are in DevOps, take a quick look at Botmetric to see how it can ease your day-to-day operations.