Unable to get parquet-tools working from the command-line

Tags: parquet

I'm trying to get the newest version of parquet-tools running, but org.apache.hadoop.conf.Configuration is missing from the shaded jar. I see the same problem with v1.6.0.

Is there something beyond mvn package or mvn install that I should be doing? The actual mvn invocation I'm using is mvn install -DskipTests -pl \!parquet-thrift,\!parquet-cascading,\!parquet-pig-bundle,\!parquet-pig,\!parquet-scrooge,\!parquet-hive,\!parquet-protobuf. The build succeeds, and the tests pass if I choose to run them.

The error I get is below. As the command shows, I tried adding a hadoop jar to the classpath, taken from an old parquet version that seemed to bundle it. I get the same result with or without it.

> java -classpath /path/to/hadoop-core-1.1.0.jar -jar parquet-tools-1.7.0-incubating-SNAPSHOT.jar meta --debug part-r-00000.gz.parquet

java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
    at parquet.tools.command.ShowMetaCommand.execute(ShowMetaCommand.java:59)
    at parquet.tools.Main.main(Main.java:222)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
    at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 2 more
org/apache/hadoop/conf/Configuration
Asked by Isaac, Nov 28 '22 07:11


1 Answer

On macOS with Homebrew, the easiest way to get started is to install the packaged version:

$ brew install parquet-tools
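
If you would rather run a jar you built yourself, note that java ignores -classpath whenever -jar is given, which would explain why adding the hadoop jar made no difference in the question. The sketch below builds an invocation that puts both jars on the classpath and names the main class explicitly (parquet.tools.Main, taken from the stack trace). The jar paths are the ones from the question and are placeholders, the helper name parquet_tools_cmd is made up for illustration, and it is an assumption that hadoop-core-1.1.0.jar alone provides every Hadoop class the tool needs.

```shell
#!/bin/sh
# Placeholder jar locations (assumptions); override via the environment.
TOOLS_JAR="${TOOLS_JAR:-parquet-tools-1.7.0-incubating-SNAPSHOT.jar}"
HADOOP_JAR="${HADOOP_JAR:-/path/to/hadoop-core-1.1.0.jar}"

# Hypothetical helper: prints a java command that lists both jars on the
# classpath and names the main class, instead of using -jar (which would
# make java ignore -classpath).
parquet_tools_cmd() {
    printf 'java -classpath %s:%s parquet.tools.Main %s\n' \
        "$HADOOP_JAR" "$TOOLS_JAR" "$*"
}

# Print the command for inspection; copy and run it once the paths are right.
parquet_tools_cmd meta --debug part-r-00000.gz.parquet
```

The script only prints the command, so it is safe to run before the jar paths point at real files.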
Answered by Jan Kronquist, Nov 30 '22 21:11