GAE: how to quantify Frontend Instance Hours usage?

https://stackoverflow.com/questions/23610748

20-07-2023
|

Question

We are developing a Python server on Google App Engine that should be capable of handling incoming HTTP POST requests (around 1,000 to 3,000 per minute in total). Each of the requests will trigger some datastore writing operations. In addition we will write a web-client as a human-usable interface for displaying and analyse stored data.

First we are trying to estimate usage for GAE to have at least an approximation about the costs we would have to cover in future based on the number of requests. As for datastore write operations and data storage size it is fairly easy to come up with an approximate number, though it is not so obvious for the frontend and backend instance hours.

As far as I understood each time a request is coming in, an instance is being started which then is running for 15 minutes. If a request is coming in within these 15 minutes, the same instance would have been used. And now it is getting a bit tricky I think: if two requests are coming in at the very same time (which is not so odd with 3,000 requests per minute), is Google firing up another instance, hence Google would count an addition of (at least) 0.15 instance hours? Also I am not quite sure how a web-client that is constantly performing read operations on the datastore in order to display and analyse data would increase the instance hours.

Does anyone know a reliable way of counting instance hours and creating meaningful estimations? We would use that information to know how expensive it would be to run an application on GAE in comparison to just ordering a web server.

Solution

There's no 100% sure way to assess the number of frontend instance hours. An instance can serve more than one request at a time. In addition, the algorithm of the scheduler (the system that starts the instances) is not documented by Google.

Depending on how demanding your code is, I think you can expect a standard F1 instance to hold up to 5 requests in parallel, that's a maximum. 2 is a safer bet.

My recommendation, if possible, would be to simulate standard interaction on your website with limited number of users, and see how the number of instances grow, then extrapolate.

For example, let's say you simulate 100 requests per minute during 2 hours, and you see that GAE spawns 5 instances for that, then you can extrapolate that a continuous load of 3000 requests per minute would require 150 instances during the same 2 hours. Then I would double this number for safety, and end up with an estimate of 300 instances.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow