I have some data in S3 and I want to create a Lambda function that runs predictions against my deployed AWS SageMaker endpoint and then writes the outputs back to S3. Is it necessary in this case to create an API Gateway as described in this link? And what do I have to put in the Lambda function? I expect it to cover where to find the data, how to invoke the endpoint, and where to put the output.
import boto3
import csv
import io
import json
import os

client = boto3.client('s3')      # low-level functional API
resource = boto3.resource('s3')  # high-level object-oriented API
my_bucket = resource.Bucket('demo-scikit-byo-iris')  # substitute your S3 bucket name

# read the input CSV from S3
obj = client.get_object(Bucket='demo-scikit-byo-iris', Key='foo.csv')
body = obj['Body'].read().decode('utf-8')
lines = body.splitlines()
reader = csv.reader(lines)
file = io.StringIO(body)

# invoke the deployed SageMaker endpoint with the CSV payload
runtime = boto3.client('runtime.sagemaker')
response = runtime.invoke_endpoint(
    EndpointName='nilm2',
    Body=file.getvalue(),
    ContentType='*/*',
    Accept='Accept')
output = response['Body'].read().decode('utf-8')
My data is a CSV file with 2 columns of floats and no headers. The problem is that lines is a list of strings (each row is an element of the list: ['11.55,65.23', '55.68,69.56', ...]). The invoke works fine, but the response is also a string: output = '65.23\n,65.23\n,22.56\n,...'
So how do I save this output to S3 as a CSV file?
Thanks
If your Lambda function is scheduled, then you won't need an API Gateway. But if the predict action will be triggered by a user or by an application, for example, then you will need one.
When you call invoke_endpoint, you are actually calling a SageMaker endpoint, which is not the same thing as an API Gateway endpoint.
A common architecture with SageMaker is: API Gateway → Lambda → SageMaker endpoint.
From the situation you describe, I can't tell whether your task is academic or a production one.
So, how can you save the data as a CSV file from your Lambda?
I believe you can just parse the output and then upload the file to S3. You can do the parsing manually or with a library, and upload the file with boto3. The output format of your model depends on how you implemented it in your SageMaker image, so if you need the response data in another format, you may need a custom image. I normally use a custom image, which lets me define how I want to handle my data on requests and responses.
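For example, a minimal sketch along those lines (the bucket and key names are placeholders, and the parsing assumes the response looks like the string you pasted):

import boto3

s3 = boto3.client('s3')

def save_predictions(output, bucket='demo-scikit-byo-iris', key='predictions/foo_output.csv'):
    """Parse the endpoint response string and upload it to S3 as a CSV file."""
    # the response is a single string like '65.23\n,65.23\n,22.56\n,...'
    # split it into individual values, dropping empty fragments
    values = [v.strip() for v in output.replace('\n', ',').split(',') if v.strip()]

    # build the CSV body, one prediction per line
    csv_body = '\n'.join(values) + '\n'

    # upload the CSV directly from memory, no temp file needed
    s3.put_object(Bucket=bucket, Key=key, Body=csv_body.encode('utf-8'))

save_predictions(output)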
For a production task, I certainly recommend you look at Batch transform jobs in SageMaker. You provide an input file (an S3 path) and also a destination (another S3 path), and SageMaker runs the batch predictions and persists a file with the results. Also, you won't need to keep your model deployed to an endpoint: when the job runs, it creates an instance for your model, downloads the data to predict on, runs the predictions, uploads the output, and shuts the instance down. You only need a trained model.
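A rough sketch of launching such a job with boto3 (the job name, model name, S3 paths and instance type below are placeholders you would swap for your own):

import boto3

sm = boto3.client('sagemaker')

sm.create_transform_job(
    TransformJobName='nilm2-batch-predictions',
    ModelName='nilm2',  # the trained model registered in SageMaker
    TransformInput={
        'DataSource': {
            'S3DataSource': {
                'S3DataType': 'S3Prefix',
                'S3Uri': 's3://demo-scikit-byo-iris/foo.csv',
            }
        },
        'ContentType': 'text/csv',
        'SplitType': 'Line',  # send one CSV row per request
    },
    TransformOutput={
        'S3OutputPath': 's3://demo-scikit-byo-iris/predictions/',
        'AssembleWith': 'Line',  # join the per-row predictions back into one file
    },
    TransformResources={
        'InstanceType': 'ml.m5.large',
        'InstanceCount': 1,
    },
)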
Here is some info about Batch transform jobs:
https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-batch.html
https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-batch-transform.html
I hope this helps, let me know if you need more info.
Regards.