I have a Spark project running on a Cloudera VM. The project loads data from a Parquet file and then processes it. Everything works fine locally, but I need to run this project on a school cluster, and there I get an error while reading the Parquet file at this line of code:
DataFrame schemaRDF = sqlContext.parquetFile("/var/tmp/graphs/sib200.parquet");
I get the following error:
Could not read footer: java.io.IOException: Could not read footer for file FileStatus{path=file:/var/tmp/graphs/sib200.parquet/_common_metadata; isDirectory=false; length=413; replication=0; blocksize=0; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false}
    at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:248)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$28.apply(ParquetRelation.scala:750)
Based on what I found online, it seems to be a Parquet version mismatch.
How can I find the Parquet version installed on a machine, so I can check whether both machines have the same version? If you know the exact fix for this error, that would be even better!
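One quick way to check the Parquet version is to look at the jar file names bundled with the Spark install, since the version is encoded in them. A minimal sketch (the `SPARK_HOME` default below is an assumption; adjust it for your install, e.g. the Cloudera parcel directory):

```shell
# List Parquet jars under a Spark install so you can compare versions
# on both machines. Spark 1.x ships an assembly under lib/, Spark 2.x
# ships individual jars under jars/; we check both.
list_parquet_jars() {
  ls "$1"/lib "$1"/jars 2>/dev/null | grep -i parquet
}

# /usr/lib/spark is an assumed default location, not guaranteed.
list_parquet_jars "${SPARK_HOME:-/usr/lib/spark}" || echo "no Parquet jars found at that path"
```

Running this on both the VM and the cluster and comparing the version numbers in the output (e.g. `parquet-hadoop-1.5.0.jar` vs `parquet-hadoop-1.7.0.jar`) will confirm or rule out a version mismatch.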
I got the same problem trying to read a Parquet file from S3. In my case the issue was that the required libraries were not available to all workers in the cluster.
There are 2 ways to fix that:
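As a sketch of fixes commonly used when worker nodes lack the needed jars (the paths, class names, and package coordinates below are illustrative assumptions, not taken from this answer):

```shell
# Option A (assumed paths): ship the jars with the job, so Spark
# distributes them to every executor.
spark-submit \
  --jars /path/to/aws-java-sdk-1.7.4.jar,/path/to/hadoop-aws-2.7.1.jar \
  --class com.example.Main my-app.jar

# Option B (assumed coordinates): let Spark resolve the dependencies
# from Maven Central on each node at submit time.
spark-submit \
  --packages com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.1 \
  --class com.example.Main my-app.jar
```

With either option the libraries end up on the classpath of every worker, not just the driver, which is what matters for distributed reads.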