Hadoop profile output - where and what?

I'm trying to profile my application to see if I can reproduce this blog post. I added -D mapred.task.profile=true to the command line and confirmed in the job configuration that the setting took effect.

Hadoop: The Definitive Guide says the profile info will appear in the Unix directory I ran the job from. That directory has a file named attempt_201305011806_0042_m_000002_0.profile, which has the correct job ID, but there was no mapper #2 (the job had only one mapper, and it didn't fail). The profile file contains only the header info; there is no actual profiling data in it.

The Hadoop docs say the output will be in the user log directory, but I can't find anything there. If I go into the task logs for the mapper, there is legitimate profiling info under "profile.out logs". My HDFS output directory doesn't have the profiling info at all. Shouldn't the profiling output be in HDFS somewhere?

Also, the log only contains text-based output, but all of the tools I've found for visualizing a profile assume the binary hprof format. Any ideas for how I could get a binary profile, or else load a text-based profile into an hprof tool?

Keith asked May 07 '13 16:05


1 Answer

I noticed there's a space after -D in

-D mapred.task.profile=true

Is that a typo? If so, remove it and see what happens. You should also be able to see the profiler files under the user log directory, which is usually where you ran the job from. Finally, hprof is Hadoop's default profiler, so check that you are not overriding it with

-Dmapred.task.profile.params
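
To make the two suggestions concrete, here is a minimal Java sketch that only assembles the command-line options as strings; it does not call Hadoop itself. The class name ProfileArgs and the method build are hypothetical. The property names mapred.task.profile.maps and mapred.task.profile.params, and the %s placeholder for the output file, are as I recall them from Hadoop 1.x, so check them against your version's mapred-default.xml. The hprof option format=b is my assumption about how to request binary output, which may help with the binary-format question.

```java
import java.util.ArrayList;
import java.util.List;

public class ProfileArgs {
    // Builds generic options in the joined form -Dkey=value (no space after -D).
    // mapRange:     which map tasks to profile, e.g. "0-0" for the first mapper only
    // hprofOptions: options passed to the hprof agent, e.g. "heap=dump,format=b"
    static List<String> build(String mapRange, String hprofOptions) {
        List<String> args = new ArrayList<String>();
        args.add("-Dmapred.task.profile=true");
        args.add("-Dmapred.task.profile.maps=" + mapRange);
        // Assumption: Hadoop substitutes %s with the per-attempt profile output file.
        args.add("-Dmapred.task.profile.params=-agentlib:hprof="
                + hprofOptions + ",file=%s");
        return args;
    }

    public static void main(String[] unused) {
        // Print one option per line so they can be pasted into a command line.
        for (String arg : build("0-0", "heap=dump,format=b")) {
            System.out.println(arg);
        }
    }
}
```

The printed options would go before the job's own arguments, assuming the job's driver parses generic options (for example through ToolRunner). Setting mapred.task.profile.params replaces the default hprof settings entirely, which is why it is worth checking that you are not setting it by accident.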
chaos answered Sep 21 '22 14:09