 

How does S3 Select pricing work? What do "data returned" and "data scanned" mean in S3 Select?

I have a CSV file with 1M rows. If I select 10 rows, will I be billed for only those 10 rows? What do "data returned" and "data scanned" mean in S3 Select?

There is little documentation on these S3 Select terms.

bharath reddy asked Oct 26 '18 04:10



1 Answer

To keep things simple, let's forget for the moment that S3 reads in a columnar way. Suppose you have the following data:

| City       | Last Updated Date   |
|------------|---------------------|
| London     | 1st Jan             |
| London     | 2nd Jan             |
| New Delhi  | 2nd Jan             |

A query that fetches the rows with the latest update date

  • forces S3 to scan all 3 records
  • but returns only 2 records (those whose last updated date is 2nd Jan).

A query that selects the city where the last updated date is 1st Jan

  • will scan all 3 rows
  • but return only 1 string - "London".

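Hence, depending on your query, S3 Select may scan more data (all 3 rows) than it returns (2 rows). Both amounts are billed, and they are measured in bytes rather than rows, so selecting 10 rows out of 1M can still mean scanning the whole file.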
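The difference can be reproduced locally with a small simulation. This is only a rough sketch of the idea, not how S3 Select is implemented: `run_query` is a hypothetical helper that reads the whole CSV (the "scanned" part) and writes out only the matching rows and columns (the "returned" part), counting bytes for each. The latest date is hard-coded as "2nd Jan" to keep the example short.

```python
import csv
import io

# The three-row table from above, as CSV text.
CSV_DATA = (
    "City,Last Updated Date\n"
    "London,1st Jan\n"
    "London,2nd Jan\n"
    "New Delhi,2nd Jan\n"
)


def run_query(data, predicate, columns):
    """Toy stand-in for S3 Select (hypothetical helper, not an AWS API).

    Every row is read to evaluate the predicate, so the whole input counts
    as scanned; only the matching rows and requested columns are returned.
    """
    bytes_scanned = len(data.encode("utf-8"))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.DictReader(io.StringIO(data)):
        if predicate(row):
            writer.writerow([row[c] for c in columns])
    result = out.getvalue()
    bytes_returned = len(result.encode("utf-8"))
    return result, bytes_scanned, bytes_returned


# Query 1: all columns for rows with the latest date (assumed "2nd Jan").
latest, scanned_1, returned_1 = run_query(
    CSV_DATA,
    lambda r: r["Last Updated Date"] == "2nd Jan",
    ["City", "Last Updated Date"],
)

# Query 2: only the city where the date is "1st Jan".
city, scanned_2, returned_2 = run_query(
    CSV_DATA,
    lambda r: r["Last Updated Date"] == "1st Jan",
    ["City"],
)

print(f"query 1: scanned={scanned_1} bytes, returned={returned_1} bytes")
print(f"query 2: scanned={scanned_2} bytes, returned={returned_2} bytes")
```

In both queries the scanned size is the same (the full input), while the returned size shrinks with the number of rows and columns selected.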

I hope you understand the difference between Data Scanned and Data Returned now.

Pulkit Agarwal answered Sep 27 '22 18:09
