I apologize if this is a noob question, but I couldn't find any relevant reference:
what is the difference between these two?
If I'd like to read Parquet files from HDFS using pyarrow, which one should I use?
The HdfsClient API was deprecated; use pyarrow.hdfs.connect to connect instead: http://arrow.apache.org/docs/python/filesystems.html#hadoop-file-system-hdfs
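For the Parquet part of the question, here is a minimal sketch, assuming a pyarrow release that still ships the legacy pyarrow.hdfs module and a working libhdfs/Hadoop client setup on the machine. The helper name read_parquet_from_hdfs and its connect parameter are made up for illustration (the parameter only exists so the function can be exercised without a cluster); the host, port, and user defaults mirror those of pyarrow.hdfs.connect.

```python
def read_parquet_from_hdfs(path, host="default", port=0, user=None, connect=None):
    """Read a Parquet file (or directory of files) from HDFS into a pyarrow Table.

    `connect` is a hypothetical injection point for testing; by default the
    legacy pyarrow.hdfs.connect function is used.
    """
    if connect is None:
        # Imported lazily so this module still loads where pyarrow is absent.
        from pyarrow.hdfs import connect

    # host="default" and port=0 let libhdfs pick up the NameNode address
    # from the Hadoop configuration (core-site.xml).
    fs = connect(host=host, port=port, user=user)

    # The legacy filesystem object exposes read_parquet(), which returns
    # a pyarrow.Table.
    return fs.read_parquet(path)
```

You would call it as something like read_parquet_from_hdfs("/some/dir/data.parquet") and then use .to_pandas() on the result if you want a DataFrame. The path here is only a placeholder.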