I wrote a MapReduce job in Java and set the configuration as follows:
Configuration configuration = new Configuration();
configuration.set("fs.defaultFS", "hdfs://127.0.0.1:9000");
configuration.set("mapreduce.job.tracker", "localhost:54311");
configuration.set("mapreduce.framework.name", "yarn");
configuration.set("yarn.resourcemanager.address", "localhost:8032");
I ran it in several different ways:

Case 1: using the hadoop or yarn command: works fine.
Case 2: from Eclipse: works fine.
Case 3: using java -jar after removing all the configuration.set() calls:
Configuration configuration = new Configuration();
The job runs successfully, but its status does not show up in the YARN web UI (default port 8088).
Case 4: using java -jar: error.
Stack trace:
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1255)
at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1251)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:1250)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1279)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
at com.my.cache.run.MyTool.run(MyTool.java:38)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.my.main.Main.main(Main.java:45)
Please tell me how to run a MapReduce job with the java -jar command while still being able to check its status and logs in the YARN web UI (default port 8088).
Why I need this: I want to create a web service that submits MapReduce jobs, without using the Java Runtime API to execute the yarn or hadoop commands.
In my opinion, it's quite difficult to run a Hadoop application without the hadoop command. You are better off using hadoop jar than java -jar.
It sounds like you don't have a Hadoop environment set up on your machine. First, make sure Hadoop is running properly on your machine.
Personally, I prefer to set the configuration in mapred-site.xml, core-site.xml, yarn-site.xml, and hdfs-site.xml. There is a clear tutorial for installing a Hadoop cluster here.
At this point you can monitor HDFS on port 50070, the YARN cluster on port 8088, and the MapReduce job history on port 19888.
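As a sketch, minimal entries in mapred-site.xml and yarn-site.xml for a single-node setup might look like this (the hostname and port shown are common defaults taken from the question's own configuration; adjust them to your cluster):

```xml
<!-- mapred-site.xml: tell MapReduce clients to submit jobs to YARN -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

<!-- yarn-site.xml: point clients at the ResourceManager -->
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:8032</value>
  </property>
</configuration>
```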
Then, verify that your HDFS and YARN environments are working. For HDFS, try simple commands such as mkdir, copyToLocal, and copyFromLocal; for YARN, try the sample wordcount project.
Once you have a working Hadoop environment, you can create your own MapReduce application (you can use any IDE). This tutorial may help. Compile it and package it as a jar.
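The sanity checks above could look something like this (the file names are illustrative, and the examples-jar path varies by Hadoop distribution and version):

```shell
# HDFS sanity check: create a directory, upload a file, download it back
hdfs dfs -mkdir -p /user/test
hdfs dfs -copyFromLocal ./input.txt /user/test/input.txt
hdfs dfs -copyToLocal /user/test/input.txt ./roundtrip.txt

# YARN sanity check: run the bundled wordcount example
# (jar path is an assumption; check your installation)
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    wordcount /user/test/input.txt /user/test/wc-out
```

If the wordcount job succeeds, it should also appear in the YARN web UI on port 8088.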
Then open your terminal and run this command:
hadoop jar <path to jar> <arg1> <arg2> ... <arg n>
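If you really do need to launch with plain java (for example, from a web service), one common workaround, not guaranteed for every setup, is to put the Hadoop jars and configuration directories on the classpath using the hadoop classpath command, so the JVM picks up the same mapred-site.xml and yarn-site.xml the cluster uses (myjob.jar and com.my.main.Main are placeholders for your own jar and main class):

```shell
# `hadoop classpath` prints the jars and conf dirs the hadoop command itself uses;
# reusing it lets a plain JVM resolve mapreduce.framework.name and the RM address
java -cp "myjob.jar:$(hadoop classpath)" com.my.main.Main <arg1> <arg2>
```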
Hope this is helpful.