After loading and grouping records, how can I store those grouped records into several files, one per group (=userid)?
records = LOAD 'input' AS (userid:int, ...);
grouped_records = GROUP records BY userid;
I'm using Apache Pig version 0.8.1-cdh3u3 (rexported)
Indeed, there is a MultiStorage class at Piggybank which does exactly what I want - it splits the records by a specified attribute (at index '0' in my example):
STORE records INTO 'output' USING org.apache.pig.piggybank.storage.MultiStorage('output', '0', 'none', ',');
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With