I am trying to read files from AWS S3 and process them with Spring Batch.
Can a Spring ItemReader handle this task? If so, how do I pass the credentials to the S3 client and configure my Spring XML to read one file or multiple files?
<bean id="itemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="${aws.file.name}" />
</bean>
Update: With Spring Cloud AWS you still use the FlatFileItemReader, but you no longer need to write a custom Resource.
Instead, you set up an aws-context resource loader and give it your S3 client bean:
<aws-context:context-resource-loader amazon-s3="amazonS3Client"/>
The reader is set up like any other reader; the only difference is that you now autowire your ResourceLoader:
@Autowired
private ResourceLoader resourceLoader;
and then use that ResourceLoader to set the reader's resource:
@Bean
public FlatFileItemReader<Map<String, Object>> awsItemReader() {
    FlatFileItemReader<Map<String, Object>> reader = new FlatFileItemReader<>();
    // Each line of the file is parsed as one JSON object
    reader.setLineMapper(new JsonLineMapper());
    reader.setRecordSeparatorPolicy(new JsonRecordSeparatorPolicy());
    // The s3:// prefix is resolved by the S3-aware ResourceLoader
    reader.setResource(resourceLoader.getResource("s3://" + amazonS3Bucket + "/" + file));
    return reader;
}
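The `"s3://" + amazonS3Bucket + "/" + file` concatenation above breaks quietly if the bucket is blank or the key starts with a slash. Below is a minimal sketch of a helper that builds the location string defensively. `S3Locations` is a hypothetical name, not part of Spring or the AWS SDK, and the sketch assumes your object keys never intentionally begin with `/`.

```java
import java.util.Objects;

/** Hypothetical helper (not part of Spring or the AWS SDK) that builds "s3://bucket/key" locations. */
final class S3Locations {

    private S3Locations() {
    }

    /** Joins a bucket name and an object key into an "s3://" location string. */
    static String of(String bucket, String key) {
        Objects.requireNonNull(bucket, "bucket");
        Objects.requireNonNull(key, "key");
        String b = bucket.trim();
        // Assumption: keys do not intentionally start with "/", so leading slashes are dropped
        // to avoid producing "s3://bucket//key".
        String k = key.trim().replaceFirst("^/+", "");
        if (b.isEmpty() || b.contains("/")) {
            throw new IllegalArgumentException("Invalid bucket name: '" + bucket + "'");
        }
        if (k.isEmpty()) {
            throw new IllegalArgumentException("Object key must not be empty");
        }
        return "s3://" + b + "/" + k;
    }
}
```

With this in place the reader line would become `reader.setResource(resourceLoader.getResource(S3Locations.of(amazonS3Bucket, file)));`.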
I would use the FlatFileItemReader; the only customization needed is your own S3 Resource object. Extend Spring's AbstractResource to create an AWS resource that holds the AmazonS3 client, the bucket, the file path, and so on.
For getInputStream, use the Java SDK:
S3Object object = s3Client.getObject(new GetObjectRequest(bucket, awsFilePath));
return object.getObjectContent();
Then, for contentLength:
return s3Client.getObjectMetadata(bucket, awsFilePath).getContentLength();
and for lastModified:
return s3Client.getObjectMetadata(bucket, awsFilePath).getLastModified().getTime();
The Resource you create holds the AmazonS3Client, which contains everything your Spring Batch app needs to communicate with S3. Here's what it could look like with Java config:
reader.setResource(new AmazonS3Resource(amazonS3Client, amazonS3Bucket, inputFile));
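To show how the pieces above fit together, here is a dependency-free sketch of the delegation pattern. It is not the real class: `ObjectStore` is a hypothetical stand-in for the AmazonS3 client so the snippet compiles without the AWS SDK, and `StoreBackedResource` does not extend AbstractResource. In a real AmazonS3Resource you would extend AbstractResource, keep the same method names, and replace each `store` call with the SDK call noted in the comments.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for the AmazonS3 client; not an AWS SDK type. */
interface ObjectStore {
    // Real client: s3Client.getObject(new GetObjectRequest(bucket, key)).getObjectContent()
    InputStream open(String bucket, String key) throws IOException;

    // Real client: s3Client.getObjectMetadata(bucket, key).getContentLength()
    long contentLength(String bucket, String key) throws IOException;

    // Real client: s3Client.getObjectMetadata(bucket, key).getLastModified().getTime()
    long lastModified(String bucket, String key) throws IOException;
}

/** Holds the client, bucket and path, and delegates every call to the client. */
final class StoreBackedResource {

    private final ObjectStore store;
    private final String bucket;
    private final String path;

    StoreBackedResource(ObjectStore store, String bucket, String path) {
        this.store = store;
        this.bucket = bucket;
        this.path = path;
    }

    /** Opens a fresh stream on each call, so the reader can reopen the file on restart. */
    public InputStream getInputStream() throws IOException {
        return store.open(bucket, path);
    }

    public long contentLength() throws IOException {
        return store.contentLength(bucket, path);
    }

    public long lastModified() throws IOException {
        return store.lastModified(bucket, path);
    }

    /** AbstractResource requires a description; it shows up in error messages. */
    public String getDescription() {
        return "S3 resource [bucket='" + bucket + "', path='" + path + "']";
    }
}
```

Keeping the resource as a thin wrapper means credentials and region live only in the client configuration, and the reader never needs to know about them.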
A simpler approach:
First, create the AmazonS3 client and the ResourceLoader bean post-processor in your AWS configuration class, like this:
@Configuration
@EnableContextResourceLoader
public class AWSConfiguration {

    @Bean
    @Primary
    public AmazonS3 getAmazonS3Client() {
        ClientConfiguration config = new ClientConfiguration();
        // Both timeouts are in milliseconds (50 seconds here)
        config.setConnectionTimeout(5000 * 10);
        config.setSocketTimeout(5000 * 10);
        return AmazonS3ClientBuilder.standard()
                .withClientConfiguration(config).build();
    }

    @Bean
    @Autowired
    public static ResourceLoaderBeanPostProcessor resourceLoaderBeanPostProcessor(
            AmazonS3 amazonS3EncryptionClient) {
        return new ResourceLoaderBeanPostProcessor(amazonS3EncryptionClient);
    }
}
Then use the ResourceLoader bean in your ItemReader to set the S3 resource:
@Autowired
private ResourceLoader resourceLoader;
@Bean
public FlatFileItemReader<Map<String, Object>> fileItemReader() {
    // JsonLineMapper produces Map<String, Object> items, so the reader is typed to match
    FlatFileItemReader<Map<String, Object>> reader = new FlatFileItemReader<>();
    reader.setLineMapper(new JsonLineMapper()); // Change the line mapper (and item type) as needed
    reader.setResource(resourceLoader.getResource("s3://" + amazonS3Bucket + "/" + file));
    return reader;
}