On the Spark download page we can choose between releases 3.0.0-preview and 2.4.4.
For release 3.0.0-preview, the available package types are
For release 2.4.4, the available package types are
Since there isn't a "Pre-built for Apache Hadoop 3.1.2" option, should I download the "Pre-built with user-provided Apache Hadoop" package, or should I download the source code?
If you are comfortable building from source, that is your best option.
Otherwise, since you already have a Hadoop cluster, pick the "user-provided" build and copy your cluster's core-site.xml, hive-site.xml, yarn-site.xml, and hdfs-site.xml into $SPARK_CONF_DIR. That should mostly work.
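As a rough sketch, the setup looks something like the following. The paths (/opt/spark, /etc/hadoop/conf) are assumptions for a typical install, so adjust them to your environment; the SPARK_DIST_CLASSPATH step is what Spark's "Hadoop free" builds document for picking up user-provided Hadoop jars.

```shell
# Assumed install locations -- adjust to your environment
export SPARK_HOME=/opt/spark
export SPARK_CONF_DIR="$SPARK_HOME/conf"
HADOOP_CONF=/etc/hadoop/conf   # wherever your cluster's client configs live

# Copy the cluster's client configs so Spark can find HDFS, YARN, and the Hive metastore
cp "$HADOOP_CONF/core-site.xml" \
   "$HADOOP_CONF/hdfs-site.xml" \
   "$HADOOP_CONF/yarn-site.xml" \
   "$HADOOP_CONF/hive-site.xml" \
   "$SPARK_CONF_DIR/"

# Point the "user-provided Hadoop" build at your Hadoop jars
# (typically placed in $SPARK_CONF_DIR/spark-env.sh)
export SPARK_DIST_CLASSPATH="$(hadoop classpath)"
```

With that in place, spark-shell and spark-submit should resolve hdfs:// paths and submit to YARN using your cluster's existing configuration.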
Note: DataFrames don't work on Hadoop 3 until Spark 3.x; see SPARK-18673.