
Dump a string or a (string, integer) tuple in Pig

I have a simple Pig script. I can load the data and dump it, but I cannot dump a string literal or a (string, int) tuple. What am I missing here? Thanks a lot!


dataset = LOAD '/Users/me/input' USING PigStorage() AS (id:chararray,data:chararray);

dataset_GROUP = GROUP dataset ALL;
dataset_COUNT = FOREACH dataset_GROUP GENERATE COUNT(dataset);

DUMP "record_count = ";                 -- this does not work
DUMP dataset_COUNT;                     -- this works
DUMP "record_count = ", dataset_COUNT;  -- this does not work
Edamame asked Apr 15 '26 17:04



1 Answer

You can use Apache Pig's CONCAT() function to prepend your string to the result, as follows:

dataset = LOAD '/Users/me/input' USING PigStorage() AS (id:chararray,data:chararray);

dataset_GROUP = GROUP dataset ALL;
-- CONCAT expects chararray arguments, so the long returned by COUNT is cast explicitly
dataset_COUNT = FOREACH dataset_GROUP GENERATE CONCAT('record_count = ', (chararray)COUNT(dataset));

DUMP dataset_COUNT;

For more details on CONCAT() in Apache Pig 0.13.0, see the Pig documentation.

If you are using an older Pig version, you can write your own User Defined Function (UDF) that performs the concatenation and returns the result. For more details, refer to the Pig documentation on UDFs.
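As for why the original attempts fail: DUMP takes a relation alias, not a string literal or a comma-separated list, so the label has to become a field of a relation first. If a (string, int) tuple is what you want rather than one concatenated string, a minimal sketch is to generate the label as a constant field next to the count. The alias dataset_LABELED and the field names label and record_count are made up for this illustration:

```pig
dataset = LOAD '/Users/me/input' USING PigStorage() AS (id:chararray, data:chararray);

dataset_GROUP = GROUP dataset ALL;

-- Emit a constant chararray next to the count, producing a (chararray, long) tuple.
-- dataset_LABELED, label and record_count are illustrative names only.
dataset_LABELED = FOREACH dataset_GROUP GENERATE 'record_count = ' AS label, COUNT(dataset) AS record_count;

DUMP dataset_LABELED;
```

This should print a single tuple of the form (record_count = ,N), where N is the number of records counted. No cast is needed here because the two fields keep their own types.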

Prasad Khode answered Apr 19 '26 10:04



