Question

Is there a size limit on models deployed as AWS SageMaker endpoints? I first deployed a simple TensorFlow/Keras Iris classification model by converting it to protobuf, tarring the model, and deploying the archive. The tarred file was around 10 KB, and the endpoint came up successfully. However, when I tried the same process with a NASNet model, the tarred file ended up at around 350 MB, and I got the following error:

The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint.

Could it be that the model is too large to deploy? I tried moving from an 'ml.m4.xlarge' instance to a higher tier, but that did not help either.
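
For reference, the export and packaging step looked roughly like this (the NASNet constructor, paths, and the export/Servo/1 layout are illustrative; the exact directory structure the serving container expects depends on the TensorFlow Serving container version):

    import tarfile
    import tensorflow as tf

    # Export the trained Keras model as a SavedModel (protobuf).
    # NASNetLarge stands in here for the actual trained model; the
    # export/Servo/1 layout is what the older SageMaker TensorFlow
    # Serving containers expect, newer containers use a different layout.
    model = tf.keras.applications.NASNetLarge(weights="imagenet")
    tf.saved_model.save(model, "export/Servo/1")

    # Package the export directory into the model.tar.gz that gets uploaded to S3.
    with tarfile.open("model.tar.gz", "w:gz") as tar:
        tar.add("export", arcname="export")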

Solution

It doesn't seem to be about the size of the model. I am no SageMaker expert, but the error message suggests that the model artifact was deployed and that something went wrong when the health check ran.
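
Since the error itself points at CloudWatch, the first thing I would do is pull the endpoint's container logs; the container's startup traceback usually says exactly why the ping failed. A minimal sketch with boto3 (the endpoint name is a placeholder):

    import boto3

    logs = boto3.client("logs")
    group = "/aws/sagemaker/Endpoints/my-nasnet-endpoint"  # replace with your endpoint name

    # Grab the most recent container log streams for the endpoint.
    streams = logs.describe_log_streams(
        logGroupName=group, orderBy="LastEventTime", descending=True
    )

    # Print recent events; look for an import error or an out-of-memory
    # kill during model loading.
    for stream in streams["logStreams"][:3]:
        events = logs.get_log_events(
            logGroupName=group,
            logStreamName=stream["logStreamName"],
            limit=50,
        )
        for event in events["events"]:
            print(event["message"])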

This could be caused by many different things, but the most likely cause is a bug in the code. Please check the following (there is a quick local sanity-check sketch after the list):

  • Can the model be loaded properly?
  • Can the model make a prediction?
  • Can the model make multiple predictions?
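
A quick way to run those checks is to untar the artifact locally and load the SavedModel in the same TensorFlow version the container uses; a minimal sketch, assuming a TF 2.x-style Keras export under an export/Servo/1 path (adjust the path and input shape for your NASNet variant):

    import numpy as np
    import tensorflow as tf

    # Load the extracted SavedModel directory (path is illustrative).
    model = tf.keras.models.load_model("export/Servo/1")
    model.summary()

    # One prediction with a dummy NASNet-sized input
    # (331x331 for NASNetLarge, 224x224 for NASNetMobile).
    one = np.random.rand(1, 331, 331, 3).astype(np.float32)
    print(model.predict(one).shape)

    # Several predictions in a row, to mimic repeated endpoint invocations.
    batch = np.random.rand(8, 331, 331, 3).astype(np.float32)
    print(model.predict(batch).shape)

If the model loads and predicts fine locally but the endpoint still fails its ping, the problem is more likely in the container itself (memory pressure while loading, missing custom inference code, or a mismatched framework version) than in the model artifact.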