
Invalid arguments running parquet-tools jar

Tags: java, jar, parquet

I'm trying to print one column from a parquet file using parquet-tools.jar (https://github.com/Parquet/parquet-mr/tree/master/parquet-tools). I'm using this command:

java -jar parquet-tools-1.6.1-SNAPSHOT.jar dump -c COLUMNNAME someParquet.parquet

But I get:

Invalid arguments: missing required arguments

usage: parquet-dump [option...] <input>
where option is one of:
    -c,--column <arg>  Dump only the given column, can be specified more than
                       once
    -d,--disable-data  Do not dump column data
       --debug         Enable debug output
    -h,--help          Show this help string
    -m,--disable-meta  Do not dump row group and page metadata
       --no-color      Disable color output even if supported
where <input> is the parquet file to print to stdout

Not sure where I'm getting the syntax wrong.

asked Nov 18 '16 at 18:11 by covfefe


1 Answer

The -c,--column option can take multiple values, so the parser assumes every argument that follows it is another column name for the "dump" command and consumes them all, including the input file. No <input> argument is left, which is why you see the "missing required arguments" error.

As a workaround, add one more option immediately after the -c value. That stops the CLI parser from consuming further arguments as values of -c.

With the command below (the --debug option added), you should be able to run the program:

java -jar parquet-tools-1.6.1-SNAPSHOT.jar dump -c COLUMNNAME --debug someParquet.parquet

You can also try --no-color instead of --debug.
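To make the "eating arguments" behaviour concrete, here is a minimal sketch of a greedy multi-value option. It is a toy model written for illustration only, not the actual parser code used by parquet-tools; the class name GreedyOptionDemo and its parse method are hypothetical. It assumes, as described above, that -c keeps taking values until it meets another token that starts with "-".

```java
import java.util.ArrayList;
import java.util.List;

public class GreedyOptionDemo {

    /** Parse result: values collected for -c, plus leftover positional arguments. */
    static final class Parsed {
        final List<String> columns = new ArrayList<>();
        final List<String> positional = new ArrayList<>();
    }

    /**
     * Toy parser (an assumption, not the real parquet-tools code):
     * -c/--column accepts an unlimited number of values, so it keeps
     * consuming tokens until it meets another token starting with "-".
     */
    static Parsed parse(String... args) {
        Parsed result = new Parsed();
        int i = 0;
        while (i < args.length) {
            String token = args[i++];
            if (token.equals("-c") || token.equals("--column")) {
                // Greedy: swallow every following token that is not an option.
                while (i < args.length && !args[i].startsWith("-")) {
                    result.columns.add(args[i++]);
                }
            } else if (token.startsWith("-")) {
                // Any other option is treated as a plain flag in this sketch.
            } else {
                // Whatever is left over is the positional <input> argument.
                result.positional.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The file name is swallowed as a second column, so no <input> remains.
        Parsed failing = parse("-c", "COLUMNNAME", "someParquet.parquet");
        System.out.println("columns=" + failing.columns + " input=" + failing.positional);
        // prints: columns=[COLUMNNAME, someParquet.parquet] input=[]

        // The extra flag ends the -c value list, so the file is seen as <input>.
        Parsed working = parse("-c", "COLUMNNAME", "--debug", "someParquet.parquet");
        System.out.println("columns=" + working.columns + " input=" + working.positional);
        // prints: columns=[COLUMNNAME] input=[someParquet.parquet]
    }
}
```

In the first call the positional list is empty, which corresponds to the "missing required arguments" error; in the second, --debug acts as the terminator for the -c values.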

Hope this helps.

answered Nov 11 '22 at 00:11 by skadya