
Memory error in Amazon SageMaker

A memory error occurs in Amazon SageMaker when preprocessing 2 GB of data stored in S3. Loading the data is not a problem. The data has 7 million rows and 64 columns. One-hot encoding is also not possible; attempting it results in a memory error. The notebook instance is an ml.t2.medium. How can this issue be solved?
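For a sense of scale, here is a rough back-of-the-envelope estimate of the in-memory size, assuming the 64 columns load as 8-byte numeric values (an assumption, not stated in the question):

```python
# Rough memory estimate for the raw data once loaded, assuming
# every one of the 64 columns is held as an 8-byte (float64/int64) value.
rows = 7_000_000
cols = 64
bytes_per_value = 8  # assumption: 64-bit numeric columns

raw_gib = rows * cols * bytes_per_value / 1024**3
print(f"Raw data in memory: ~{raw_gib:.1f} GiB")  # ~3.3 GiB

# A dense one-hot encoding multiplies the column count by the number of
# distinct categories, so the encoded result grows well beyond this.
```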

VaRun Sabu, asked Jul 23 '18


People also ask

What are the limitations of SageMaker?

Maximum number of feature definitions per feature group: 2,500. Maximum transactions per second (TPS) per API per AWS account: soft limit of 10,000 TPS per API, excluding the BatchGetRecord API call, which has a soft limit of 500 TPS. Maximum size of a record: 350 KB. Maximum size of a record identifier: 2 KB.

Does SageMaker need ECR?

Amazon SageMaker currently requires Docker images to reside in Amazon ECR. To push an image to ECR, and not the central Docker registry, you must tag it with the registry hostname.

Where are SageMaker notebooks stored?

In Amazon SageMaker Studio, your SageMaker Studio notebooks and data can be stored in the following locations: An S3 bucket – When you onboard to Studio and enable shareable notebook resources, SageMaker shares notebook snapshots and metadata in an Amazon Simple Storage Service (Amazon S3) bucket.


1 Answer

I assume you're processing the data on the notebook instance, right? An ml.t2.medium has only 4 GB of RAM, so it's quite possible you're simply running out of memory.

Have you tried a larger instance? The specs are here: https://aws.amazon.com/sagemaker/pricing/instance-types/
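If a larger instance isn't an option, one thing worth trying is keeping the one-hot encoded result sparse rather than dense. A minimal sketch, assuming the data is already loaded into a pandas DataFrame; the file path and column names below are placeholders, not from the question:

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Placeholder path and column names -- substitute the real ones.
df = pd.read_csv("train.csv")        # loading works fine, per the question
cat_cols = ["col_a", "col_b"]        # categorical columns to one-hot encode

# scikit-learn's OneHotEncoder returns a scipy.sparse matrix by default,
# so only non-zero entries are stored instead of a dense rows x categories array.
encoder = OneHotEncoder(handle_unknown="ignore")
encoded = encoder.fit_transform(df[cat_cols])

print(type(encoded), encoded.shape)
```

Whether that fits in 4 GB still depends on how many distinct categories each column has, so a bigger instance remains the simplest fix.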

Julien Simon, answered Sep 21 '22