
Define tuple data in a Pig script

I am currently debugging a Pig script. I'd like to define a tuple directly in the Pig file (instead of using the basic "Load" function).

Is there a way to do it?

I am looking for something like this:

A = ('name#bob','age#29';'name#paul','age#12')

The dump would return:

('bob',29)
('paul',12)
romain-nio asked Sep 14 '12 11:09


People also ask

What is the tuple data type in Pig?

A tuple is an ordered list of data. It has fields, numbered 0 through (number of fields - 1). A field can hold any data type, or it can be null.

What are data types available in Pig?

Pig has three complex data types: maps, tuples, and bags. All of these types can contain data of any type, including other complex types. So it is possible to have a map where the value field is a bag, which contains a tuple where one of the fields is a map.

How does Pig display data?

Load the data from the file student_data.txt into Pig by executing the following Pig Latin statement in the Grunt shell:

grunt> student = LOAD 'hdfs://localhost:9000/pig_data/student_data.txt' USING PigStorage(',') as ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray );

How do you load map data in Pig?

Currently, Pig map lookups need the key to be a chararray (string) constant that you supply, not a variable that contains a string. So in map#key, the key has to be a constant string (e.g. map#'keyvalue').
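A minimal sketch of a constant-key lookup. It assumes a hypothetical file people.txt with one map literal per line, such as [name#bob,age#29]; the path, alias, and field names are made up for illustration.

```pig
-- Hypothetical input: one map literal per line, e.g. [name#bob,age#29]
people = LOAD 'people.txt' AS (info:map[]);

-- The key after # must be a quoted string constant, not a field or a variable.
names = FOREACH people GENERATE info#'name' AS name, info#'age' AS age;

DUMP names;
```

Because the map is declared without a value type, the looked-up values come back as bytearray and may need an explicit cast before further use.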


2 Answers

It is in fact impossible to do this in Pig as it currently stands. If you just want to debug, create a file in Hadoop and load that. Write the data you want into the file (whatever you would have created manually had it been possible) and upload it. Then load it using Pig.
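The file-based approach could look like the sketch below. It assumes a hypothetical comma-separated file debug_people.csv containing the two lines bob,29 and paul,12, already uploaded to HDFS (for example with hadoop fs -put); the path and alias are illustrative only.

```pig
-- Hypothetical path; the file holds two lines: bob,29 and paul,12
A = LOAD '/user/toto/debug_people.csv' USING PigStorage(',')
    AS (name:chararray, age:int);

-- Print the relation to the console, one tuple per record.
DUMP A;
```

Declaring the schema in the AS clause gives the fields names and types, so the rest of the script can be debugged against the same aliases it would see with real data.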

Davis Broda answered Oct 29 '22 19:10


The following (dirty) trick does the job:
- Create a file with one empty row and store it in your HDFS.
- Load it: line = load '/user/toto/onelinefile' USING ..
- Create your own data: foreach line generate 'bob' as name, 22 as age;
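The trick can be extended to several rows. The sketch below assumes the one-line file already exists at /user/toto/onelinefile and that your Pig version ships the built-in TOBAG and TOTUPLE functions. The default loader and the dummy field name are assumptions, not part of the original answer.

```pig
-- One-record relation used only as a seed; the loaded field is ignored.
line = LOAD '/user/toto/onelinefile' AS (dummy:chararray);

-- Build a bag of literal tuples, then flatten it into one record per tuple.
A = FOREACH line GENERATE
        FLATTEN(TOBAG(TOTUPLE('bob', 29), TOTUPLE('paul', 12)))
        AS (name:chararray, age:int);

DUMP A;
```

If the empty line produces no record in your setup, putting any single character on that line should be enough, since the loaded value is never used.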

romain-nio answered Oct 29 '22 21:10