
Read a text file from HDFS line by line in mapper

Tags: java, hadoop, hdfs

Is the following code correct for reading a text file from HDFS inside a Mapper? And if it is:

  1. What happens if two mappers in different nodes try to open the file at almost the same time?
  2. Isn't there a need to close the InputStreamReader? If so, how to do it without closing the filesystem?

My code is:

Path pt=new Path("hdfs://pathTofile");
FileSystem fs = FileSystem.get(context.getConfiguration());
BufferedReader br=new BufferedReader(new InputStreamReader(fs.open(pt)));
String line;
line=br.readLine();
while (line != null){
System.out.println(line);
asked Jan 28 '13 by nik686



1 Answer

This will work, with some amendments; I assume the code you've pasted is just truncated:

Path pt=new Path("hdfs://pathTofile");
FileSystem fs = FileSystem.get(context.getConfiguration());
BufferedReader br=new BufferedReader(new InputStreamReader(fs.open(pt)));
try {
  String line;
  line=br.readLine();
  while (line != null){
    System.out.println(line);

    // be sure to read the next line otherwise you'll get an infinite loop
    line = br.readLine();
  }
} finally {
  // you should close out the BufferedReader
  br.close();
}
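
As a side note (not part of the original answer), if Java 7 or later is available, the same read can be written with try-with-resources so the reader is closed automatically even when an exception is thrown. The FileSystem returned by FileSystem.get() is a cached instance shared across the JVM, which is why you close the reader but not the filesystem:

Path pt = new Path("hdfs://pathTofile");
FileSystem fs = FileSystem.get(context.getConfiguration());

// try-with-resources closes the BufferedReader (and the underlying
// HDFS input stream) automatically, even if an exception is thrown
try (BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(pt)))) {
  String line;
  while ((line = br.readLine()) != null) {
    System.out.println(line);
  }
}
// deliberately no fs.close(): FileSystem.get() returns a shared,
// cached instance that the rest of the task may still be using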

You can have more than one mapper reading the same file, but beyond a certain point it makes more sense to use the Distributed Cache: it not only reduces the load on the data nodes that host the file's blocks, it is also more efficient when a job has many more tasks than task nodes. A sketch of that approach follows.
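
For illustration, here is a minimal sketch of the Distributed Cache approach using the Hadoop 2.x mapreduce API (Job.addCacheFile on the driver side and context.getCacheFiles() in the mapper; the 1.x releases used the DistributedCache class instead). The class name and the setup() handling are placeholders, not part of the original answer:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// driver side, when configuring the job:
//   job.addCacheFile(new URI("hdfs://pathTofile"));

public class CacheFileMapper extends Mapper<LongWritable, Text, Text, Text> {

  @Override
  protected void setup(Context context) throws IOException, InterruptedException {
    URI[] cacheFiles = context.getCacheFiles();
    // the framework copies the file to each task node and symlinks it
    // into the task's working directory under its base name
    String localName = new Path(cacheFiles[0].getPath()).getName();
    try (BufferedReader br = new BufferedReader(new FileReader(localName))) {
      String line;
      while ((line = br.readLine()) != null) {
        // parse the line and keep it in a field for use in map()
      }
    }
  }
}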

answered Nov 11 '22 by Chris White