I am using Spark SQL to read in a CSV file, and I get a lot of messages like this:
...some.csv, range: 20971520-24311915, partition values: [empty row]
Why does it say [empty row]? Is the partition really empty?
Neither the file nor the Spark partition with data read from the file is empty.
The log message may be a bit confusing because of two things:
1. "partition values" refers to Hive-style partition columns, which are encoded in the directory path. For /path/to/partition/a=1/b=hello/c=3.14, the partition columns would be a, b and c, and their values 1, hello and 3.14. They can also come from the Hive Metastore in the case of partitioned external tables.
2. The partition values are kept in an InternalRow, not in a collection, so when there are none the log prints [empty row].

In your case, the directory structure is flat or does not contain partition names (e.g. /path/to/partition/1/hello/3.14), so there are no Hive-style partitions and you see [empty row] in the message as a result.
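To make the difference between the two directory layouts concrete, here is a minimal plain-Python sketch of how name=value path segments map to partition values. It is only an illustration: hive_partition_values is a hypothetical helper, not Spark's actual partition discovery code, and some.csv is just a placeholder file name.

```python
from pathlib import PurePosixPath
from typing import Dict


def hive_partition_values(path: str) -> Dict[str, str]:
    """Collect name=value directory segments from a path.

    Hypothetical helper for illustration only; Spark's own partition
    discovery also infers types and validates the layout.
    """
    values = {}
    for segment in PurePosixPath(path).parts:
        name, sep, value = segment.partition("=")
        # Only segments of the form name=value count as partition columns.
        if sep and name:
            values[name] = value
    return values


# Hive-style layout: column names are encoded in the directory names.
partitioned = hive_partition_values("/path/to/partition/a=1/b=hello/c=3.14/some.csv")

# Flat layout: the same values, but without column names.
flat = hive_partition_values("/path/to/partition/1/hello/3.14/some.csv")

print(partitioned)  # {'a': '1', 'b': 'hello', 'c': '3.14'}
print(flat)         # {}
```

The second result is empty, which corresponds to the situation where Spark has no partition values to report and logs [empty row].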