I deployed a large 3D model to AWS SageMaker. Inference takes two minutes or more, and calling the predictor from Python fails with the following error:
An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from model with message "Your invocation timed out while waiting for a response from container model. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again."
In CloudWatch I also see ping timeouts while the container is processing:
2020-10-07T16:02:39.718+02:00 2020/10/07 14:02:39 [error] 106#106: *251 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.32.0.2, server: , request: "GET /ping HTTP/1.1", upstream: "http://unix:/tmp/gunicorn.sock/ping", host: "model.aws.local:8080"
How do I increase the invocation timeout?
Or is there a way to make asynchronous invocations to a SageMaker endpoint?
It’s currently not possible to increase the timeout; this is tracked in an open GitHub issue. Looking through that issue and similar questions on Stack Overflow, it seems a Batch Transform job is the usual workaround for long-running inference instead of a real-time endpoint.
Related answer: https://stackoverflow.com/a/55642675/806876
SageMaker Python SDK timeout issue: https://github.com/aws/sagemaker-python-sdk/issues/1119
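A rough sketch of the Batch Transform route with boto3 (bucket, model, job name, and instance type below are assumptions, not from your setup). Instead of holding an HTTP connection open, the job reads records from S3 and writes predictions back to S3, so the real-time invocation timeout never applies:

```python
def build_transform_job_args(job_name, model_name, input_s3, output_s3):
    """Assemble keyword arguments for the CreateTransformJob API call.

    All names passed in are placeholders; the model must already be
    registered in SageMaker (the same model behind your endpoint works).
    """
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3}
            },
            "ContentType": "application/json",
            "SplitType": "Line",  # one JSON record per line of input
        },
        "TransformOutput": {"S3OutputPath": output_s3},
        "TransformResources": {"InstanceType": "ml.m5.xlarge", "InstanceCount": 1},
    }

args = build_transform_job_args(
    "my-3d-model-job",            # placeholder job name
    "my-3d-model",                # placeholder model name
    "s3://my-bucket/input/",      # placeholder input prefix
    "s3://my-bucket/output/",     # placeholder output path
)

# import boto3
# sm = boto3.client("sagemaker", region_name="eu-central-1")
# sm.create_transform_job(**args)
# Poll progress afterwards with sm.describe_transform_job(TransformJobName=...)
```

Results land as files under the output S3 path, one per input object, so the "async invocation" you asked about is effectively modeled as submit-then-poll.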