Question

I am processing a large set of files as follows: I first download all of them to an EC2 instance, and then I run a script that sequentially processes each file and indexes it into a DynamoDB table (each item in the table corresponds to a single line in a file). Then I download the next batch of files, and so on.

Right now I adjust provisioned throughput via AWS GUI console and it ends up being the same during both the download phase and the indexing phase. Clearly, this is suboptimal since while downloading I do not talk to the database at all so my required write throughput during this phase is essentially 0.

So what I want to do is programmatically adjust provisioned throughput way up when I start indexing and then way down when I stop indexing and start downloading.

Are there any limits on how much I can increase write throughput (t/p) in a single request? For instance, can I change it from 5 to 120? If not, how do I calculate the number of requests and the time required to adjust t/p from value X to value Y (where X << Y)? Do I have to do a similar calculation for decreasing t/p?

I am using Python boto.

Thanks

Solution

When scaling up, you can at most double your write provision in a single request. So if you are at 5, you can go up to 10, wait the 20-30 seconds for the change to kick in, then double again to 20, and so on. You can scale down as low as you like.
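To answer the "how many requests and how long" part, here is a rough sketch of the arithmetic in Python. The function names are made up for illustration, and the 20-30 second figure is just the estimate mentioned above, not a guaranteed value.

```python
def requests_needed(current, target):
    """Count UpdateTable calls, assuming each call can at most double capacity."""
    if current < 1 or target < 1:
        raise ValueError("capacity units must be positive")
    if target == current:
        return 0
    if target < current:
        return 1  # decreases are not subject to the doubling limit
    calls = 0
    while current < target:
        current *= 2  # best case: every call doubles the capacity
        calls += 1
    return calls


def estimated_wait_seconds(current, target, per_call=(20, 30)):
    """Rough (low, high) total wait, assuming 20-30 s per change as quoted above."""
    calls = requests_needed(current, target)
    return calls * per_call[0], calls * per_call[1]


print(requests_needed(5, 120))         # 5 (5 -> 10 -> 20 -> 40 -> 80 -> 120)
print(estimated_wait_seconds(5, 120))  # (100, 150)
```

So going from 5 to 120 takes five requests and, under that assumption, roughly two minutes of waiting.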

As per the AWS FAQ:

Q: Is there any limit on how much I can change my provisioned throughput with a single request?

Yes. Amazon DynamoDB allows you to change your provisioned throughput level by up to 100% with a single UpdateTable API call. If you wish to increase your throughput by more than 100%, you can simply call UpdateTable again.

For example, if your table has 1,000 units of write capacity provisioned, you could not update your table to 3,000 with a single API call as that is more than the maximum allowed change for a single UpdateTable operation. To increase your throughput from 1,000 to 3,000 units of write capacity, simply call UpdateTable to first double your throughput to 2,000, then call UpdateTable a second time to reach 3,000 writes/second.
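A minimal sketch of how the repeated UpdateTable calls could be driven from Python, assuming the 100% limit described in the FAQ. The names throughput_steps and scale_writes are hypothetical, and update and wait_until_active are placeholder callables that you would implement with whatever boto calls you use to update the table and to poll until it is ACTIVE again; they are not part of boto itself.

```python
def throughput_steps(current, target):
    """List the capacity value to request on each call, at most doubling each time."""
    if current < 1 or target < 1:
        raise ValueError("capacity units must be positive")
    if target < current:
        return [target]  # scaling down can be done in one call
    steps = []
    while current < target:
        current = min(current * 2, target)  # never overshoot the target
        steps.append(current)
    return steps


def scale_writes(current, target, update, wait_until_active):
    """Apply each step in order, waiting for the table between calls.

    `update(units)` and `wait_until_active()` are hypothetical hooks
    supplied by the caller (for example, thin wrappers around boto).
    """
    applied = []
    for units in throughput_steps(current, target):
        update(units)
        wait_until_active()  # the next update must wait until this one has taken effect
        applied.append(units)
    return applied


print(throughput_steps(1000, 3000))  # [2000, 3000]
print(throughput_steps(5, 120))      # [10, 20, 40, 80, 120]
```

Passing the two hooks in as arguments keeps the stepping logic easy to test without talking to DynamoDB.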

Licensed under: CC-BY-SA with attribution