FileInputStream for a generic file System

I have a file that contains serialized Java objects such as Vector. The file is stored on the Hadoop Distributed File System (HDFS), and I want to read it (using readObject) in one of my map tasks. I suppose

FileInputStream in = new FileInputStream("hdfs/path/to/file");

won't work, because the file is stored on HDFS. So I thought of using the org.apache.hadoop.fs.FileSystem class. Unfortunately, it has no method that returns a FileInputStream. All it has is a method that returns an FSDataInputStream, but I want an input stream that can read serialized Java objects such as Vector from a file, rather than just the primitive data types that FSDataInputStream handles.

Please help!

Akhil asked May 15 '10 11:05



1 Answer

FileInputStream doesn't give you a facility to read serialized objects directly; you need to wrap it in an ObjectInputStream. You can do the same with FSDataInputStream, since it is also an InputStream: wrap it in an ObjectInputStream and then read your objects from it.

In other words, if you have a fileSystem of type org.apache.hadoop.fs.FileSystem, just use:

ObjectInputStream in = new ObjectInputStream(fileSystem.open(path));
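
To make the wrapping step concrete, here is a minimal sketch that uses only the JDK. The class name ObjectStreamSketch and the helpers serialize and readVector are hypothetical names chosen for illustration, and a ByteArrayInputStream stands in for the stream that fileSystem.open(path) would return on a real cluster. The point is that readVector accepts any InputStream, so the same method should work unchanged with an FSDataInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;
import java.util.Vector;

// Hypothetical helper class; not part of Hadoop or the JDK.
class ObjectStreamSketch {

    // Serializes an object to bytes. Stands in for the file previously written to HDFS.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Wraps any InputStream in an ObjectInputStream and reads one Vector from it.
    // In a map task, the argument would be fileSystem.open(path) (assumption:
    // the file holds a single serialized Vector).
    static Vector<?> readVector(InputStream raw) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(raw)) {
            return (Vector<?>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Vector<String> original = new Vector<>(Arrays.asList("alpha", "beta"));
        byte[] bytes = serialize(original);

        // ByteArrayInputStream is only a local stand-in for the HDFS stream.
        Vector<?> restored = readVector(new ByteArrayInputStream(bytes));
        System.out.println(restored.equals(original)); // prints "true"
    }
}
```

The try-with-resources block closes the ObjectInputStream, which also closes the underlying stream, so the HDFS stream would not need a separate close() call.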
Peter Štibraný answered Oct 20 '22 02:10