I am new to Hadoop/PIG. I have a basic question.
Is there a logging facility for Pig UDFs? I have written a UDF that I need to verify, so I want to log certain statements to check the flow. If logging is available, where are the Pig logs written?
These commands are useful for debugging a Pig script. DUMP: use the DUMP operator to run (execute) Pig Latin statements and display the results on your screen. ILLUSTRATE: use the ILLUSTRATE operator to review how data is transformed through a sequence of Pig Latin statements.
The DUMP operator runs the Pig Latin statements and displays the results on the screen. It is generally used for debugging.
PigUnit is a simple xUnit framework that enables you to easily test your Pig scripts. With PigUnit you can perform unit testing, regression testing, and rapid prototyping. No cluster set up is required if you run Pig in local mode.
Pig provides extensive support for user defined functions (UDFs) as a way to specify custom processing. Pig UDFs can currently be implemented in six languages: Java, Jython, Python, JavaScript, Ruby and Groovy. The most extensive support is provided for Java functions.
Assuming your UDF extends EvalFunc, you can use the Logger returned from EvalFunc.getLogger(). The log output should be visible in the logs of the associated Map/Reduce task that Pig executes (if the job runs in more than one stage, you will have to pick through each stage's task logs to find the relevant entries).
Perhaps obvious, but I advise debugging your UDF in local mode before deploying it on a cluster or pseudo-cluster. That way you can debug it right inside your IDE (Eclipse in my case), which is easier than debugging through logs.
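One way to make local debugging easier is to keep the UDF's actual logic in a plain static method that exec() delegates to, so you can step through it or call it from a main method without Pig on the classpath. The following is only a minimal sketch under that assumption: the class name UdfLogic and the method normalize are hypothetical, and java.util.logging stands in for the logger you would get from getLogger() inside a real EvalFunc (which, as far as I know, is a commons-logging Log).

```java
import java.util.Locale;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper holding the logic that a UDF's exec() would delegate to.
// Inside a real Pig EvalFunc you would log through getLogger() instead;
// java.util.logging is used here only so the sketch runs without Pig.
class UdfLogic {
    private static final Logger LOG = Logger.getLogger(UdfLogic.class.getName());

    // Trims and lower-cases a field; returns null for null or blank input.
    static String normalize(String field) {
        if (field == null) {
            LOG.log(Level.WARNING, "normalize: null input, returning null");
            return null;
        }
        String trimmed = field.trim();
        if (trimmed.isEmpty()) {
            LOG.log(Level.WARNING, "normalize: blank input, returning null");
            return null;
        }
        String result = trimmed.toLowerCase(Locale.ROOT);
        // FINE is below the default console threshold, so this line only
        // shows up if you lower the logging level while debugging.
        LOG.log(Level.FINE, "normalize: {0} -> {1}", new Object[] {field, result});
        return result;
    }

    // Quick manual check that can be run or stepped through in an IDE.
    public static void main(String[] args) {
        System.out.println(normalize("  Hadoop "));
        System.out.println(normalize("   "));
    }
}
```

With this split, the EvalFunc subclass only has to pull the field out of the input tuple and pass it to the helper, so most bugs can be found before the script ever runs on a cluster.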