Access files in s3n://elasticmapreduce/samples/wordcount/input

How can I access the files in the following S3 folder, which is owned by someone else?

s3n://elasticmapreduce/samples/wordcount/input

iCode asked Jun 07 '12 01:06


3 Answers

In Amazon S3 there is no concept of folders; a bucket is just a flat collection of objects. You can, however, list all the objects you are interested in by opening the following URL in a browser: s3.amazonaws.com/elasticmapreduce?prefix=samples/wordcount/input/

You can then download each object by specifying its full key, e.g. s3.amazonaws.com/elasticmapreduce/samples/wordcount/input/0001
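
If you would rather script this than use a browser, here is a minimal Python sketch of the same idea. It assumes the bucket still allows anonymous listing and that the endpoint answers over https (the URLs above give no scheme). The function names are my own, not part of any S3 SDK; the XML namespace is the one S3 uses in its ListBucketResult responses.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: path-style access over https, as in the URLs above.
S3_ENDPOINT = "https://s3.amazonaws.com"
# Namespace used by S3 in its ListBucketResult XML documents.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def list_url(bucket, prefix):
    """Build the URL that lists every key starting with `prefix`."""
    query = urllib.parse.urlencode({"prefix": prefix}, safe="/")
    return f"{S3_ENDPOINT}/{bucket}?{query}"


def parse_keys(listing_xml):
    """Extract the object keys from a ListBucketResult document."""
    root = ET.fromstring(listing_xml)
    return [el.text for el in root.findall(f"{S3_NS}Contents/{S3_NS}Key")]


def object_url(bucket, key):
    """Build the download URL for a single object."""
    return f"{S3_ENDPOINT}/{bucket}/{urllib.parse.quote(key)}"


def fetch_keys(bucket, prefix):
    """Fetch the listing over HTTP and return the keys (needs network access)."""
    with urllib.request.urlopen(list_url(bucket, prefix)) as resp:
        return parse_keys(resp.read())
```

Calling fetch_keys("elasticmapreduce", "samples/wordcount/input/") and passing each key to object_url should give you the download links, provided the listing is still publicly readable.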

matthiash answered Sep 27 '22 20:09


The files in s3n://elasticmapreduce/samples/wordcount/input are public; Amazon makes them available as input for its sample word count Hadoop program. The best way to fetch them is to:

  1. Start a new Amazon Elastic MapReduce Job Flow (it doesn't matter which one) from the Amazon Web Services console, and make sure that you keep the job alive with the Keep Alive option
  2. Once the EC2 machines have started, find the instances on EC2 from the Amazon Web Services console
  3. ssh into one of the running EC2 instances, using the hadoop user, for example ssh -i keypair.pem hadoop@<public DNS name of the instance>
  4. Obtain the files you need, using hadoop dfs -copyToLocal s3://elasticmapreduce/samples/wordcount/input/0002 .
  5. sftp the files to your local system
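
For anyone scripting steps 3 to 5, here is a rough Python sketch that only assembles the command lines and runs nothing. The host value is a hypothetical placeholder for your instance's public DNS name, the helper names are mine, and the sftp invocation is an assumption about how you would start the transfer from your local machine.

```python
import shlex


def ssh_command(key_file, host, user="hadoop"):
    """Step 3: log in to a running instance as the hadoop user."""
    return ["ssh", "-i", key_file, f"{user}@{host}"]


def copy_to_local_command(s3_path, dest="."):
    """Step 4: copy one object from S3 to the instance's local disk."""
    return ["hadoop", "dfs", "-copyToLocal", s3_path, dest]


def sftp_command(key_file, host, user="hadoop"):
    """Step 5 (assumption): open an sftp session from your local machine."""
    return ["sftp", "-i", key_file, f"{user}@{host}"]


def show(command):
    """Render an argument list as a single shell-safe command line."""
    return shlex.join(command)
```

For example, show(copy_to_local_command("s3://elasticmapreduce/samples/wordcount/input/0002")) gives back the command from step 4, and the same argument lists can be handed to subprocess.run if you want to execute them.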

Sualeh Fatehi answered Sep 27 '22 19:09


You can access wordSplitter.py here:

https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/wordSplitter.py

You can access the input files here:

https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0012
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0011
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0010
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0009
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0008
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0007
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0006
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0005
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0004
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0003
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0002
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0001
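
Since the input files are numbered 0001 through 0012, you could also generate the links and fetch them in a loop. A minimal Python sketch, assuming the URLs above are still publicly readable; the function names and the wordcount-input directory are just illustrative choices of mine.

```python
import os
import urllib.request

BASE_URL = "https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/"


def input_urls(first=1, last=12):
    """Return the URLs of the zero-padded input files, 0001 through 0012 by default."""
    return [f"{BASE_URL}{n:04d}" for n in range(first, last + 1)]


def download_all(dest_dir="wordcount-input"):
    """Download every input file into dest_dir (needs network access)."""
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for url in input_urls():
        # Name the local file after the last URL segment, e.g. "0001".
        path = os.path.join(dest_dir, url.rsplit("/", 1)[-1])
        urllib.request.urlretrieve(url, path)
        paths.append(path)
    return paths
```

Calling download_all() should leave twelve files in the wordcount-input directory if the links still work.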

circuitry answered Sep 27 '22 20:09