
MalformedURLException on reading file from HDFS

Tags: java, hadoop

I have the following test program to read a file from HDFS.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;

public class FileReader {
    public static final String NAMENODE_IP = "172.32.17.209";
    public static final String FILE_PATH = "/notice.html";

    public static void main(String[] args) throws MalformedURLException,
            IOException {
        String url = "hdfs://" + NAMENODE_IP + FILE_PATH;

        InputStream is = new URL(url).openStream();
        InputStreamReader isr = new InputStreamReader(is);
        BufferedReader br = new BufferedReader(isr);
        String line = br.readLine();
        while (line != null) {
            System.out.println(line);
            line = br.readLine();
        }
        br.close();
    }
}

It throws a java.net.MalformedURLException:

Exception in thread "main" java.net.MalformedURLException: unknown protocol: hdfs
    at java.net.URL.<init>(URL.java:592)
    at java.net.URL.<init>(URL.java:482)
    at java.net.URL.<init>(URL.java:431)
    at in.ksharma.hdfs.FileReader.main(FileReader.java:29)
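The failure is reproducible on a plain JVM with no Hadoop involved at all: `java.net.URL` rejects any scheme it has no registered stream handler for, before any network I/O happens. A minimal sketch (the IP and path are just the values from the question):

```java
import java.net.MalformedURLException;
import java.net.URL;

public class HdfsUrlRepro {
    public static void main(String[] args) {
        try {
            // A stock JVM only ships handlers for http, https, file, jar, ...
            // so constructing an hdfs:// URL fails immediately.
            new URL("hdfs://172.32.17.209/notice.html");
            System.out.println("parsed");
        } catch (MalformedURLException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This is why the fix below is about registering a handler, not about connectivity to the NameNode.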
Asked Sep 22 '14 by Kshitiz Sharma


3 Answers

Register Hadoop's URL stream handler factory. The standard URL handler does not know how to handle the hdfs:// scheme.

Try this:

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;

public static void main(String[] args) throws MalformedURLException,
        IOException {
    // Register Hadoop's handler once, before the first hdfs:// URL is built.
    URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());

    String url = "hdfs://" + NAMENODE_IP + FILE_PATH;

    InputStream is = new URL(url).openStream();
    InputStreamReader isr = new InputStreamReader(is);
    BufferedReader br = new BufferedReader(isr);
    String line = br.readLine();
    while (line != null) {
        System.out.println(line);
        line = br.readLine();
    }
    br.close();
}
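One caveat worth knowing with this approach (documented JDK behaviour, not specific to Hadoop): `URL.setURLStreamHandlerFactory` may be called at most once per JVM, so if another library has already registered a factory, or your code runs twice, the second call throws an `Error`. A minimal Hadoop-free sketch:

```java
import java.net.URL;
import java.net.URLStreamHandlerFactory;

public class FactoryOnce {
    public static void main(String[] args) {
        // Any factory demonstrates the restriction; this one handles nothing.
        URLStreamHandlerFactory noop = protocol -> null;
        URL.setURLStreamHandlerFactory(noop);     // first call: succeeds
        try {
            URL.setURLStreamHandlerFactory(noop); // second call: rejected
            System.out.println("second call succeeded");
        } catch (Error e) {
            System.out.println("second call threw " + e.getClass().getName());
        }
    }
}
```

In practice this means the registration should live in one well-defined place in application startup code.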
Answered Oct 01 '22 by Kshitiz Sharma


I got the same issue while writing a Java application that reads from HDFS on Hadoop 2.6. My solution was to add

    hadoop-2.X/share/hadoop/hdfs/hadoop-hdfs-2.X.jar

to the classpath.
Answered Sep 30 '22 by Jason


In our case we had to combine it with another answer:
https://stackoverflow.com/a/21118824/1549135

So, firstly, in our HDFS setup class (Scala code):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.LocalFileSystem
import org.apache.hadoop.hdfs.DistributedFileSystem

val hadoopConfig: Configuration = new Configuration()
hadoopConfig.set("fs.hdfs.impl", classOf[DistributedFileSystem].getName)
hadoopConfig.set("fs.file.impl", classOf[LocalFileSystem].getName)

And later, like in accepted answer:
https://stackoverflow.com/a/25971334/1549135

import java.net.URL
import scala.util.Try
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory

URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory)
Try(new URL(path))

Side note:

We already had "org.apache.hadoop" % "hadoop-hdfs" % "2.8.0" in our dependencies, and that alone did not help.

Answered Sep 30 '22 by Atais