
Reading a Parquet file gives java.net.URISyntaxException: Relative path in absolute URI

I have Parquet files that were uploaded to S3, and I downloaded them to a local folder. The following code throws an error:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.hadoop.ParquetFileReader;

Configuration conf = new Configuration();
Path path = new Path("/Users/mustafa/pqs/2018-02-16T08:30:23.570-d629f5af-23b8-44c7-bc41-ce6ad98b16cd.parquet");
ParquetFileReader file = ParquetFileReader.open(conf, path);
System.out.println(file.getFileMetaData().getSchema());

It tries to locate the CRC files, which are not present, but the error it gives is weird:

Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: .2018-02-16T08:30:23.570-d629f5af-23b8-44c7-bc41-ce6ad98b16cd.parquet.crc
    at org.apache.hadoop.fs.Path.initialize(Path.java:205)
    at org.apache.hadoop.fs.Path.<init>(Path.java:171)
    at org.apache.hadoop.fs.Path.<init>(Path.java:93)
    at org.apache.hadoop.fs.ChecksumFileSystem.getChecksumFile(ChecksumFileSystem.java:90)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:145)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:346)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
    at org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:589)
    at org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:575)
    at org.apache.parquet.hadoop.ParquetFileReader.open(ParquetFileReader.java:506)
    at com.opsgenie.sre.logmanagement.merge.ParquetMerger.main(ParquetMerger.java:11)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: .2018-02-16T08:30:23.570-d629f5af-23b8-44c7-bc41-ce6ad98b16cd.parquet.crc
    at java.net.URI.checkPath(URI.java:1823)
    at java.net.URI.<init>(URI.java:745)
    at org.apache.hadoop.fs.Path.initialize(Path.java:202)
    ... 10 more

But if I rename the file to something simple like osman.parquet, it works. Why does the file name 2018-02-16T08:30:23.570-d629f5af-23b8-44c7-bc41-ce6ad98b16cd.parquet make it fail?

Mustafa asked Sep 13 '25 12:09



1 Answer

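Hadoop's Path class doesn't allow a colon (:) in a file name. The stack trace shows the failure happens while ChecksumFileSystem builds the name of the checksum file (.<name>.crc): when that bare name is turned into a Path, the text before the first colon is taken as a URI scheme, and what follows is then rejected as a relative path in an absolute URI. Your original absolute path gets through only because a slash comes before the colon. Obviously this could be handled better.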
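The same exception can be reproduced with java.net.URI alone. The sketch below is only an illustration of that reading of the stack trace: ColonPathDemo, parse, and parses are hypothetical names, and the scheme detection merely imitates what a scheme-aware path parser might do. It is not Hadoop's actual code.

```java
import java.net.URI;
import java.net.URISyntaxException;

class ColonPathDemo {
    // Hypothetical helper (not Hadoop code): treats the text before the first
    // colon as a URI scheme unless a slash comes earlier in the string.
    static URI parse(String pathString) throws URISyntaxException {
        String scheme = null;
        int start = 0;
        int colon = pathString.indexOf(':');
        int slash = pathString.indexOf('/');
        if (colon != -1 && (slash == -1 || colon < slash)) {
            scheme = pathString.substring(0, colon);
            start = colon + 1;
        }
        // java.net.URI rejects a scheme combined with a path that lacks a leading '/'.
        return new URI(scheme, null, pathString.substring(start), null, null);
    }

    // Returns true when the string parses, false when URI construction fails.
    static boolean parses(String pathString) {
        try {
            parse(pathString);
            return true;
        } catch (URISyntaxException e) {
            System.out.println(e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "2018-02-16T08:30:23.570-d629f5af-23b8-44c7-bc41-ce6ad98b16cd.parquet";
        System.out.println(parses("/Users/mustafa/pqs/" + name)); // slash precedes the colon
        System.out.println(parses(".osman.parquet.crc"));          // no colon at all
        System.out.println(parses("." + name + ".crc"));           // colon comes first: rejected
    }
}
```

This would also explain why osman.parquet works: its checksum name, .osman.parquet.crc, contains no colon, so nothing is mistaken for a scheme.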

https://issues.apache.org/jira/browse/HDFS-13
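As the question already found, the read works once the file has a simple name. A minimal sketch of that workaround, assuming it is acceptable to rename the downloaded copies before opening them. ColonFreeRename and the choice of '_' as the replacement character are illustrative only, and Path here is java.nio.file.Path, not Hadoop's Path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ColonFreeRename {
    // Replaces every ':' in a file name with '_' (an arbitrary, illustrative choice).
    static String sanitize(String fileName) {
        return fileName.replace(':', '_');
    }

    // Renames the file in place if its name contains a colon and returns the
    // resulting path. Files.move throws if the target name already exists.
    static Path renameWithoutColons(Path file) throws IOException {
        String name = file.getFileName().toString();
        String safe = sanitize(name);
        if (safe.equals(name)) {
            return file; // nothing to do
        }
        return Files.move(file, file.resolveSibling(safe));
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ColonFreeRename.java /path/to/file1.parquet /path/to/file2.parquet
        for (String arg : args) {
            System.out.println(renameWithoutColons(Paths.get(arg)));
        }
    }
}
```

The renamed path can then be passed to new Path(...) as in the question.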

user133831 answered Sep 15 '25 01:09
