I am not looking for the so-called "debugging" solutions that rely on println. I want to attach a real debugger to a running Hadoop instance and debug it from a different machine.
Is this possible? How? jdb?
On the remote computer, find and start the Remote Debugger from the Start menu. If you don't have administrative permissions on the remote computer, right-click the Remote Debugger app and select Run as administrator. Otherwise, just start it normally.
Starting the Application With Remote Debugging Enabled: the JVM starts the jar and opens a server socket at port 8998, publishing the debugging messages there using the Java Debug Wire Protocol (JDWP). Besides address, server, and transport, the -Xrunjdwp option accepts other sub-options, for example suspend.
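To make the sub-options concrete, here is a minimal Java sketch that splits a JDWP option string into its key/value pairs. The class name JdwpOptions and the helper parse are hypothetical and exist only for illustration; the option string itself reuses port 8998 from the text above. The comments in main summarize what each sub-option means for the HotSpot JDWP agent.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JdwpOptions {
    // Hypothetical helper: splits "-Xrunjdwp:k1=v1,k2=v2" (or the newer
    // "-agentlib:jdwp=k1=v1,k2=v2" spelling) into its sub-options.
    static Map<String, String> parse(String option) {
        String body;
        if (option.startsWith("-Xrunjdwp:")) {
            body = option.substring("-Xrunjdwp:".length());
        } else if (option.startsWith("-agentlib:jdwp=")) {
            body = option.substring("-agentlib:jdwp=".length());
        } else {
            throw new IllegalArgumentException("not a JDWP option: " + option);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : body.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("malformed sub-option: " + pair);
            }
            result.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> opts =
            parse("-Xrunjdwp:transport=dt_socket,address=8998,server=y,suspend=n");
        // transport=dt_socket : speak JDWP over a TCP socket
        // address=8998        : the port the socket is bound to
        // server=y            : the JVM listens and waits for a debugger to attach
        // suspend=n           : the JVM keeps running instead of pausing at startup
        opts.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

With suspend=y the JVM would instead pause before running the application until a debugger connects, which is useful when the code of interest runs during startup.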
You can attach the Visual Studio debugger to a running process on a local or remote computer. After the process is running, select Debug > Attach to Process or press Ctrl+Alt+P in Visual Studio, and use the Attach to Process dialog to attach the debugger to the process.
This is nicely explained at LINK
To debug the TaskTracker, follow these steps.
Edit conf/hadoop-env.sh so that it contains the following:
export HADOOP_TASKTRACKER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5000,server=y,suspend=n"
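Start Hadoop (bin/start-dfs.sh and bin/start-mapred.sh). Because of server=y, the TaskTracker JVM now listens on port 5000, so you can attach a debugger to that port from another machine.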
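Before pointing an IDE at the TaskTracker, it can help to check from the other machine that the port really answers as a JDWP agent. The following is a minimal sketch, not part of Hadoop: the class JdwpProbe and its method speaksJdwp are hypothetical names, and the default host and port are assumptions taken from the configuration above. It relies on the JDWP handshake, in which the debugger sends the 14 ASCII bytes "JDWP-Handshake" and the VM replies with the same bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class JdwpProbe {
    // Handshake defined by the JDWP specification: sent by the debugger, echoed by the VM.
    static final byte[] HANDSHAKE = "JDWP-Handshake".getBytes(StandardCharsets.US_ASCII);

    // Returns true only if host:port accepts a connection and echoes the handshake.
    static boolean speaksJdwp(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis);

            OutputStream out = socket.getOutputStream();
            out.write(HANDSHAKE);
            out.flush();

            InputStream in = socket.getInputStream();
            byte[] reply = new byte[HANDSHAKE.length];
            int read = in.readNBytes(reply, 0, reply.length); // requires Java 9+
            return read == reply.length && Arrays.equals(reply, HANDSHAKE);
        } catch (IOException e) {
            // Connection refused, timeout, or reset: treat all as "no JDWP agent here".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: a local TaskTracker on the port from hadoop-env.sh.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 5000;
        boolean ok = speaksJdwp(host, port, 2000);
        System.out.println(host + ":" + port
            + (ok ? " answers the JDWP handshake" : " does not answer the JDWP handshake"));
    }
}
```

A JDWP agent serves one debugger at a time, so run the probe before attaching rather than during a session. Once the port answers, jdb can attach with jdb -connect com.sun.jdi.SocketAttach:hostname=HOST,port=5000, where HOST stands for the TaskTracker machine; an IDE's remote Java debugging configuration does the same thing.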
I've never done it that way, as I'd rather my "real" jobs run unhindered by debug overhead (which can, under some circumstances, change the environment conditions anyway). Instead, I debug "locally" against a pseudo-instance (normal debugging in Eclipse is absolutely no problem), copying specific files from the live environment once I've isolated where the problem lies (e.g. by using counters).