
Concatenate a string to a field in Pig

I would like to concatenate a string to every value in a field.

For example, the dataset mydata contains the fields (id, name, email), and I would like to add the prefix test to every value in the name field.

I tried

a = load 'mydata.csv' as (id, name, email);
b = foreach a generate id, concat('test', chararray(name)); 

I'm getting empty results from this.

Any thoughts?

asked Jan 30 '15 00:01

suzil


1 Answer

  1. In Pig, function names are case-sensitive, so CONCAT must be written in capital letters. Change concat to CONCAT.
  2. You are loading the CSV file with the default delimiter, which is tab. Are you sure the fields in your CSV file are tab-separated? If they are not, you will get unexpected results. If the file is comma-separated, pass the comma explicitly to PigStorage.
  3. It is always safer to specify the schema during load; this avoids unnecessary explicit typecasts.

Example:

input.csv

1,aaa,[email protected]
2,bbb,[email protected]
3,ccc,[email protected]

Pig script:

a = load 'input.csv' using PigStorage(',') as (id:int, name:chararray, email:chararray);
b = foreach a generate id, CONCAT('test', name);
DUMP b;

Output:

(1,testaaa)
(2,testbbb)
(3,testccc)
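
If the goal is to keep every column and only prefix the name field, a variant along these lines should work. This is a sketch that assumes the same comma-separated input.csv; the AS name alias is an addition for illustration so that the prefixed field keeps its original name.

```pig
-- Sketch: assumes the same comma-separated input.csv as in the example.
a = LOAD 'input.csv' USING PigStorage(',') AS (id:int, name:chararray, email:chararray);
-- Prefix name with 'test' and carry email through unchanged.
-- The alias (AS name) is an illustrative choice, not required.
b = FOREACH a GENERATE id, CONCAT('test', name) AS name, email;
DUMP b;
```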

If your CSV file is already tab-separated, you only need to fix the CONCAT issue.
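
For the tab-separated case, a minimal sketch of the corrected script could look like this. It assumes mydata.csv really is tab-separated and keeps the schema untyped as in the question, so name is cast explicitly. Note that Pig's cast syntax is (chararray)name rather than chararray(name).

```pig
-- Sketch: assumes mydata.csv is tab-separated (PigStorage's default delimiter).
a = LOAD 'mydata.csv' AS (id, name, email);
-- Fields loaded without a declared type default to bytearray,
-- so cast name to chararray before passing it to CONCAT.
b = FOREACH a GENERATE id, CONCAT('test', (chararray)name);
DUMP b;
```

Declaring name:chararray in the load schema makes the cast unnecessary.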

answered Sep 28 '22 15:09

Sivasakthi Jayaraman