I want to sum the values of a field across all rows in an alias. This must be simple, but somehow I can't find the answer, probably because what I want is a scalar value while Pig works with datasets. I guess I could create a row with a field that holds the sum? Please advise!
This can be achieved by using GROUP ALL to bring every row into a single group, and then applying the SUM function to add up the values of the field across that group:
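DESCRIBE a;
-- a: (name, age, height)
-- Put every row of a into one group; b has a single tuple: (all, {bag of a's rows})
b = GROUP a ALL;
-- Sum the age field over the bag; c is a relation with one tuple holding the total
c = FOREACH b GENERATE SUM(a.age);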
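On the scalar question: the result c is still a relation, just one with a single tuple. A fuller sketch follows, assuming a hypothetical comma-separated input file people.csv and a guessed schema (neither is from the question). It also shows how a single-row relation can be used like a scalar inside another FOREACH, which Pig allows as long as the relation has exactly one tuple at runtime.

```pig
-- Hypothetical input path and field types; adjust both to your data.
a = LOAD 'people.csv' USING PigStorage(',')
    AS (name:chararray, age:int, height:int);

-- Collect all rows into a single group.
b = GROUP a ALL;

-- One output tuple with one named field: the sum of age over all rows.
c = FOREACH b GENERATE SUM(a.age) AS total_age;
DUMP c;

-- Use the single-row relation c as a scalar, here to compute each
-- person's share of the total age. This fails at runtime if c ever
-- holds more than one tuple.
d = FOREACH a GENERATE name, age / (double) c.total_age AS age_share;
DUMP d;
```

Naming the field with AS total_age is optional, but it makes the later scalar reference c.total_age readable.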