I'm loading a TSV file with a datetime column and a long column with:
A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:datetime, userid:long);
DUMP A;
An example line of input:
Tue Feb 11 05:02:10 +0000 2014 205291417
The corresponding line of output:
, 205291417
How do I do this properly?
You'd want to load date as a chararray (date:chararray) and then convert it to a datetime using FOREACH ... GENERATE along with the ToDate Pig built-in function. The format string is based on SimpleDateFormat:
A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:chararray, userid:long);
B = FOREACH A GENERATE ToDate(date, '<some format string>') AS date, userid;
DUMP B;
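For the sample line in the question, a format string along these lines should work. This is only a sketch: it assumes every timestamp follows the layout shown in the example input, with English day and month names. The pattern itself and the aliases raw_date and ts are illustrative and not part of the original answer.

```pig
-- Load the timestamp as plain text so it can be parsed explicitly.
A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (raw_date:chararray, userid:long);

-- Assumed pattern for "Tue Feb 11 05:02:10 +0000 2014":
--   EEE = day name, MMM = month name, dd = day of month,
--   HH:mm:ss = 24-hour time, Z = UTC offset, yyyy = year.
B = FOREACH A GENERATE ToDate(raw_date, 'EEE MMM dd HH:mm:ss Z yyyy') AS ts, userid;

DUMP B;
```

If the date still comes out empty, check that the timestamp is separated from the userid by a real tab and that the pattern accounts for every character in the field.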