Log delay in Amazon S3

I recently started hosting files on Amazon S3, and I need the log files to compute statistics for the "get", "put", and "list" operations on the objects.

I've noticed that the log files are organized oddly. I don't know when a log will appear (not immediately; at least 20 minutes after the operation) or how many lines a single log file will contain.

After that, I need to download these log files and analyse them, but I can't figure out how often I should do this.

Can somebody help? Thanks.

Lulu asked May 02 '13 13:05





1 Answer

What you describe (log files becoming available with a delay and in unpredictable order) is exactly the behaviour AWS documents as expected. It follows from the nature of the distributed system behind S3: the same request may be served by a different server each time. I have seen 5 different IP addresses being used for publishing.

So the only solution is to accept the delay: measure the delay you actually experience, add some extra time, and learn to live with the total (I would expect something like 30 to 60 minutes, but your own statistics will tell you more).
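One way to "live with the delay" is to finalize statistics only for time windows that ended longer ago than your chosen margin. The following is a minimal sketch of that idea, not a definitive implementation. The function names, the hourly windows, and the 30-minute extra are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

def suggest_margin(observed_delays, extra=timedelta(minutes=30)):
    """Worst delay seen so far plus some extra time (both are assumptions)."""
    return max(observed_delays) + extra

def settled_hours(hour_starts, now, margin):
    """Return the hourly windows whose end lies at least `margin` in the past.

    Only these windows are treated as complete; later ones may still
    receive log records that have not been delivered yet.
    """
    one_hour = timedelta(hours=1)
    return [h for h in hour_starts if now - (h + one_hour) >= margin]

# Usage with made-up numbers: delays of 20, 45 and 60 minutes were observed.
delays = [timedelta(minutes=m) for m in (20, 45, 60)]
margin = suggest_margin(delays)  # 60 min + 30 min extra
now = datetime(2013, 5, 2, 15, 0, tzinfo=timezone.utc)
hours = [datetime(2013, 5, 2, h, 0, tzinfo=timezone.utc) for h in (12, 13, 14)]
ready = settled_hours(hours, now, margin)
```

With these numbers only the 12:00 window counts as settled, so a job that runs once an hour would always report on data that is between 90 minutes and a few hours old.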

If you need the log records in order, you have to either sort them yourself or look for a log processing solution; I have seen some applications offered for exactly this purpose.
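Sorting the records yourself can be sketched as follows. This assumes the space-separated S3 server access log layout, in which the third field is a bracketed timestamp and the seventh is the operation (for example `REST.GET.OBJECT`). The function names and the sample lines in the usage part are made up for illustration, and timestamp parsing assumes an English locale for month names.

```python
import re
from collections import Counter
from datetime import datetime

# Leading fields of an access log line: owner, bucket, [time], remote IP,
# requester, request ID, operation, key. Later fields are ignored here.
LOG_PREFIX = re.compile(
    r'^(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+)'
)

def parse_line(line):
    """Return (timestamp, operation, key), or None if the line does not match."""
    m = LOG_PREFIX.match(line)
    if m is None:
        return None
    when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return when, m.group("operation"), m.group("key")

def sorted_records(lines):
    """Parse lines gathered from any number of log files and sort them by time."""
    records = [r for r in map(parse_line, lines) if r is not None]
    return sorted(records, key=lambda r: r[0])

def count_operations(records):
    """Count records per operation string."""
    return Counter(operation for _, operation, _ in records)

# Usage with made-up lines (real lines carry more trailing fields).
sample = [
    'owner1 mybucket [02/May/2013:13:10:00 +0000] 192.0.2.1 - REQ2 '
    'REST.PUT.OBJECT b.txt "PUT /b.txt HTTP/1.1" 200 -',
    'owner1 mybucket [02/May/2013:13:05:00 +0000] 192.0.2.1 - REQ1 '
    'REST.GET.OBJECT a.txt "GET /a.txt HTTP/1.1" 200 -',
    'not a log line',
]
records = sorted_records(sample)
stats = count_operations(records)
```

Because the records are merged and sorted after download, it does not matter how many lines each log file holds or in which order the files arrive.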

If you really need your logs with a very short delay, you have to produce them yourself. That means writing and running a frontend that provides access to your files on S3 and logs requests as needed at the same time.

I run such a solution. Users get a user name, a password, and the URL of my frontend. When they send a request, I check whether they provided proper credentials and whether they are allowed to see the given resource. If so, I create a temporary URL for that resource, valid for a few minutes, and redirect the request to it.
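The request flow described above might look roughly like the sketch below. It is not the actual frontend; the user table, the prefix-based permission check, and the `sign_url` callable are all hypothetical. In a real deployment `sign_url` would probably wrap something like boto3's `generate_presigned_url("get_object", ...)` with an `ExpiresIn` of a few minutes, and passwords would not be stored in plain text.

```python
import hmac
import logging

log = logging.getLogger("s3_frontend")

# Hypothetical in-memory data; a real frontend would use a proper user store.
USERS = {"alice": "s3cret"}
PERMISSIONS = {"alice": ("reports/",)}  # key prefixes each user may read

def handle_request(user, password, key, sign_url, expires_in=300):
    """Return (HTTP status, redirect URL or None) and log every decision.

    `sign_url(key, expires_in)` is an injected callable (an assumption)
    that produces a temporary URL for the object.
    """
    expected = USERS.get(user)
    if expected is None or not hmac.compare_digest(
        expected.encode(), password.encode()
    ):
        log.info("denied user=%s key=%s reason=bad-credentials", user, key)
        return 401, None
    if not any(key.startswith(prefix) for prefix in PERMISSIONS.get(user, ())):
        log.info("denied user=%s key=%s reason=not-allowed", user, key)
        return 403, None
    url = sign_url(key, expires_in)
    log.info("granted user=%s key=%s ttl=%ss", user, key, expires_in)
    return 302, url

# Usage with a stand-in signer that only builds a fake URL.
def fake_sign(key, ttl):
    return f"https://example.invalid/{key}?ttl={ttl}"

granted = handle_request("alice", "s3cret", "reports/may.csv", fake_sign)
```

Since every request passes through `handle_request`, the log entries are written immediately, which is what removes the delay of the S3-delivered logs.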

But such a frontend costs money (you have to run it somewhere) and is less robust than accessing AWS S3 directly.

Good luck, Lulu.

Jan Vlcinsky answered Oct 27 '22 06:10
