I have a simple Pig script. I was able to read the data and dump it, but I failed to dump a string or a (string, int) tuple. What am I missing here? Thanks a lot!
dataset = LOAD '/Users/me/input' USING PigStorage() AS (id:chararray,data:chararray);
dataset_GROUP = GROUP dataset ALL;
dataset_COUNT = FOREACH dataset_GROUP GENERATE COUNT(dataset);
DUMP "record_count = "; <-- this does not work
DUMP dataset_COUNT; <-- this works
DUMP "record_count = ", dataset_COUNT; <-- this does not work
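DUMP only accepts the alias of a relation, so it cannot print a string literal or a comma-separated list of values. Instead, you can use Apache Pig's CONCAT() function to attach your string to the result, as follows: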
dataset = LOAD '/Users/me/input' USING PigStorage() AS (id:chararray,data:chararray);
dataset_GROUP = GROUP dataset ALL;
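-- COUNT returns a long; CONCAT needs arguments of the same type, so cast it to chararray
dataset_COUNT = FOREACH dataset_GROUP GENERATE CONCAT('record_count = ', (chararray)COUNT(dataset));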
DUMP dataset_COUNT;
For more details on CONCAT() in Apache Pig 0.13.0, see the Pig documentation on built-in functions.
If you are using an older Pig version, you can write your own User Defined Function (UDF) that performs the concatenation and returns the result. For more details, refer to the Pig documentation on UDFs.
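As a rough sketch of the UDF route, here is what such a function could look like as a Jython UDF. It assumes your Pig version supports Jython UDFs; the file name `label_udf.py`, the namespace `myfuncs`, and the function name `label_count` are made up for illustration. Pig normally supplies the `outputSchema` decorator when it loads the script, so the fallback definition below is only there to let the file run as plain Python too.

```python
# label_udf.py (hypothetical file name)
#
# Possible usage from Pig, assuming Jython UDF support:
#   REGISTER 'label_udf.py' USING jython AS myfuncs;
#   dataset_COUNT = FOREACH dataset_GROUP
#       GENERATE myfuncs.label_count('record_count = ', COUNT(dataset));
#   DUMP dataset_COUNT;

# Pig is expected to inject outputSchema when it loads the script.
# Define a stand-in only if it is missing (e.g. when testing locally).
if "outputSchema" not in globals():
    def outputSchema(schema):
        def decorator(func):
            # Record the declared schema as an attribute on the function.
            func.outputSchema = schema
            return func
        return decorator


@outputSchema("labeled:chararray")
def label_count(label, count):
    """Return label followed by count as a single string.

    A null from Pig arrives as None; in that case None is returned
    so the result stays null instead of raising an error.
    """
    if label is None or count is None:
        return None
    return label + str(count)
```

Because the conversion to a string happens inside the function, no cast is needed on the Pig side with this approach.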