I've just started with Pig and am trying to load data from a file and then dump it. The load seems to work; no error is thrown. Here is the query:
NYSE = LOAD '/root/Desktop/Works/NYSE-2000-2001.tsv' USING PigStorage() AS (exchange:chararray, stock_symbol:chararray, date:chararray, stock_price_open:float, stock_price_high:float, stock_price_low:float, stock_price_close:float, stock_volume:int, stock_price_adj_close:float);
When I try to do the Dump, it throws the following error:
ERROR 1066: Unable to open iterator for alias NYSE
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias NYSE
    at org.apache.pig.PigServer.openIterator(PigServer.java:857)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:682)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:490)
    at org.apache.pig.Main.main(Main.java:111)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
    at org.apache.pig.PigServer.openIterator(PigServer.java:849)
Any idea what's causing the issue?
Are you running a Pig 0.12.0 (or earlier) jar against Hadoop 2.2? If so, I managed to get around this error by recompiling the Pig jar from source. Here is a summary of the steps on a Debian-type box:
Download pig-0.12.0.tar.gz.
Unpack the archive and set permissions.
Inside the unpacked directory, compile the source with 'ant clean jar -Dhadoopversion=23'.
Then get the jar onto your classpath. With Maven, for example, run this in the same directory:
mvn install:install-file -Dfile=pig.jar -DgroupId={set a groupId} -DartifactId={set an artifactId} -Dversion=1.0 -Dpackaging=jar
Or, if you are in Eclipse, add the jar as an external library/dependency.
I was getting your exact trace when trying to run Pig 0.12 on Hadoop 2.2.0, and the above steps worked for me.
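For reference, the rebuild steps can be sketched as a small shell function. This is only a sketch under assumptions: pig-0.12.0.tar.gz is already downloaded into the current directory, it unpacks to a directory named pig-0.12.0, and ant and mvn are on the PATH. The names run, rebuild_pig_for_hadoop2, GROUP_ID and ARTIFACT_ID and their default values are hypothetical and not part of Pig or Maven. The "set permissions" step is left out because it depends on your box.

```shell
#!/bin/sh
# Sketch only: rebuild Pig 0.12.0 against Hadoop 2.x and install the jar
# into the local Maven repository.

# run: execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# The body runs in a subshell, so the cd does not affect the caller.
rebuild_pig_for_hadoop2() (
    # Hypothetical placeholder coordinates; choose your own.
    group_id="${GROUP_ID:-local.pig}"
    artifact_id="${ARTIFACT_ID:-pig-hadoop2}"

    # Unpack, enter the (assumed) unpacked directory, build, then install.
    run tar -xzf pig-0.12.0.tar.gz &&
    run cd pig-0.12.0 &&
    run ant clean jar -Dhadoopversion=23 &&
    run mvn install:install-file -Dfile=pig.jar \
        "-DgroupId=$group_id" "-DartifactId=$artifact_id" \
        -Dversion=1.0 -Dpackaging=jar
)
```

Calling it with DRY_RUN=1 only prints the commands, which is a cheap way to check them before running the real build.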
UPDATE
I posted my issue on the Pig JIRA and they responded. There is a Pig jar already compiled for Hadoop 2, pig-h2.jar, available here: http://search.maven.org/#artifactdetails|org.apache.pig|pig|0.12.0|jar
A Maven dependency tag for this jar is:
<dependency>
    <groupId>org.apache.pig</groupId>
    <artifactId>pig</artifactId>
    <classifier>h2</classifier>
    <version>0.12.0</version>
    <scope>provided</scope>
</dependency>
This could be due to a change in Pig starting with version 0.12. Pig used to be permissive: it would automatically ignore the first line of the data file, or interpret that line as column names. In the new version this permissiveness was removed. The workaround is to delete the column names from the input file, which should solve the problem.
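If the file does start with a line of column names, one way to drop it without editing the original is to write a copy that skips the first line. A minimal sketch, where strip_header is a hypothetical helper name and the output file name in the usage note is only an example:

```shell
#!/bin/sh
# Hypothetical helper: copy a delimited file without its first (header) line.
# Usage: strip_header INPUT OUTPUT
strip_header() {
    # tail -n +2 prints the file starting from its second line.
    tail -n +2 "$1" > "$2"
}
```

For example, running strip_header NYSE-2000-2001.tsv NYSE-noheader.tsv and then pointing the LOAD statement at the new file would test whether the header line is the cause.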
Kapil