
AmazonS3 connection management

Is there a recommended way to manage the connection to AmazonS3 when working with AWS?

Typical Amazon S3 code (taken from an official Amazon sample) usually looks like this:

AmazonS3 s3 = new AmazonS3Client(...);
...
s3.putObject(new PutObjectRequest(bucketName, project.getName() + "/" + imageFile.getName(), imageFile));

My questions are:

  • Is it a good idea to maintain a single AmazonS3Client shared by all the code, or is it better to create one on every call?

  • Is there a concept of a connection pool, as there is when working with MySQL, for example?

  • Are issues like disconnection (MySQL analogy: MySQL was restarted) relevant, such that the AmazonS3Client would become invalid and require re-creation? If so, what would be the right way to handle a disconnection?

  • Does anyone know what features are provided by the Spring integration with AWS at: https://github.com/spring-projects/spring-integration-extensions/tree/master/spring-integration-aws

Thx.

isaac.hazan asked May 01 '14 12:05




2 Answers

I'll repeat the questions to be clear:

Is it a good idea to maintain a single AmazonS3Client shared by all the code, or is it better to create one on every call?

All client classes in the AWS SDK for Java are thread-safe, so it is usually better to reuse a single client than to instantiate new ones. You may need a few clients if you operate concurrently on multiple regions or with multiple sets of credentials.
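
One way to arrange this is to create the client once and hand the same instance to every caller. The sketch below uses the initialization-on-demand holder idiom and only the JDK. `StorageClient`, `SharedStorageClient`, and `SharedClientDemo` are hypothetical stand-ins invented for illustration; in real code the held object would be your AmazonS3 client, built with your own credentials and region.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a thread-safe SDK client such as AmazonS3.
interface StorageClient {
    void putObject(String bucket, String key, String body);
}

final class SharedStorageClient {
    private SharedStorageClient() {}

    // The JVM initializes Holder lazily and exactly once, so no explicit locking is needed.
    private static final class Holder {
        static final StorageClient INSTANCE = createClient();
    }

    // Assumption: replace this body with the real client construction.
    private static StorageClient createClient() {
        return (bucket, key, body) -> System.out.println("PUT " + bucket + "/" + key);
    }

    static StorageClient get() {
        return Holder.INSTANCE;
    }
}

final class SharedClientDemo {
    // Runs `tasks` uploads on a small thread pool and returns the set of
    // distinct client instances the worker threads observed.
    static Set<StorageClient> runUploads(int tasks) {
        Set<StorageClient> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < tasks; i++) {
            final int n = i;
            pool.execute(() -> {
                StorageClient client = SharedStorageClient.get();
                seen.add(client);
                client.putObject("example-bucket", "project/image-" + n + ".png", "data");
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for uploads", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("distinct clients: " + runUploads(8).size());
    }
}
```

In a Spring application the same effect is usually achieved by declaring the client as a singleton-scoped bean and injecting it wherever it is needed.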

Is there a concept of a connection pool, as there is when working with MySQL, for example?

Yes, there is connection management in the client, especially if you use the TransferManager class instead of the AmazonS3Client directly.

See: http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html

Are issues like disconnection (MySQL analogy: MySQL was restarted) relevant, such that the AmazonS3Client would become invalid and require re-creation? If so, what would be the right way to handle a disconnection?

By default, the client retries recoverable errors with exponential backoff. If a request still fails or the connection is lost, you need to handle the exception as appropriate for your app. See: http://docs.aws.amazon.com/general/latest/gr/api-retries.html
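
If you want a retry layer of your own on top of the SDK's built-in one, a generic helper along these lines may be enough. This is a minimal JDK-only sketch, not part of the AWS SDK: the `Retry.withBackoff` name, its parameters, and the decision to treat only `UncheckedIOException` as recoverable are all assumptions made for illustration.

```java
import java.io.UncheckedIOException;
import java.util.function.Supplier;

final class Retry {
    private Retry() {}

    // Calls `action` up to maxAttempts times. After each recoverable failure it
    // sleeps, doubling the delay every time (baseDelayMillis, 2x, 4x, ...).
    // Assumption: only UncheckedIOException counts as recoverable; any other
    // exception propagates immediately.
    static <T> T withBackoff(Supplier<T> action, int maxAttempts, long baseDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = baseDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (UncheckedIOException e) {
                if (attempt == maxAttempts) {
                    throw e; // out of attempts: let the caller decide what to do
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted during backoff", ie);
                }
                delay *= 2;
            }
        }
    }
}
```

A caller would wrap the S3 operation in the supplier, translate the failures it considers transient into the recoverable exception type, and keep using the same client instance for later calls.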

Does anyone know what features are provided by the Spring integration with AWS at: https://github.com/spring-projects/spring-integration-extensions/tree/master/spring-integration-aws

It provides declarative instantiation, injection, and utility classes for easier integration into Spring projects, similar to the helpers that exist for JDBC, JMS, etc.

For more AWS SDK tips and tricks, see: http://aws.amazon.com/articles/3604?_encoding=UTF8&jiveRedirect=1

Julio Faerman answered Oct 19 '22 19:10


There are important things to note on the following two questions:

Is it a good idea to maintain a single AmazonS3Client shared by all the code, or is it better to create one on every call?

Create just one. The AmazonS3Client has a misfeature that when garbage collected, it cleans up resources that are shared by other AmazonS3Client instances, causing those instances to become invalid, even if those other instances are in the middle of handling an upload or download. We had this problem when we were creating an AmazonS3Client for each request. Amazon apparently does not consider this to be a bug. This misfeature can be avoided by creating just one AmazonS3Client, keeping it around for the life of the application, and using it in all threads in your code.

Are issues like disconnection (MySQL analogy: MySQL was restarted) relevant, such that the AmazonS3Client would become invalid and require re-creation? If so, what would be the right way to handle a disconnection?

Uploads and downloads can fail, but they will not invalidate the AmazonS3Client, which can still be used. The right way to handle a disconnection that is not successfully retried by the AmazonS3Client is to retry yourself or report the failure, as appropriate for your application, and to continue to use the AmazonS3Client for any additional S3 interactions you need to do.

Warren Dew answered Oct 19 '22 19:10