apache-spark tutorials and guides

Changing Nulls Ordering in Spark SQL

Nov 16, 2022

apache-spark apache-spark-sql

Use more than one collect_list in one query in Spark SQL

Apr 22, 2022

scala apache-spark hive apache-spark-sql

How to convert an RDD of Maps to a DataFrame

Nov 16, 2022

scala apache-spark apache-spark-sql

How to write into PostgreSQL hstore using Spark Dataset

Nov 07, 2022

postgresql jdbc apache-spark spark-dataframe hstore

How to access Spark Web UI?

Aug 24, 2022

apache-spark

Reading CSV file in Spark in a distributed manner

Mar 21, 2020

csv apache-spark distributed

Reading Avro File in Spark

Sep 15, 2022

scala apache-spark apache-spark-sql apache-zeppelin

Running Spark driver program in Docker container - no connection back from executor to the driver?

Oct 14, 2022

docker apache-spark mesos apache-spark-standalone

Drop if all entries in a Spark DataFrame's specific column are null

May 06, 2022

python apache-spark pyspark

How to add a column to the beginning of the schema?

Sep 15, 2022

scala apache-spark apache-spark-sql

spark [dataframe].write.option("mode","overwrite").saveAsTable("foo") fails with 'already exists' if foo exists

Nov 16, 2022

sql scala apache-spark overwrite

How to use JNI in Spark?

Nov 01, 2020

java-native-interface apache-spark java.library.path

saveToCassandra could not find implicit value for parameter rwf

Dec 15, 2020

scala cassandra apache-spark

How to print out snippets of an RDD in the spark-shell / pyspark?

Oct 16, 2022

apache-spark pyspark

Permission denied when starting the Spark command line on an AWS EMR cluster

Dec 02, 2019

amazon-web-services apache-spark emr

Spark 1.6.1 S3 MultiObjectDeleteException

Feb 07, 2022

apache-spark amazon-s3 spark-streaming

Spark - Datediff for months?

Oct 28, 2022

java apache-spark

Is querying against a Spark DataFrame based on CSV faster than one based on Parquet?

Apr 02, 2019

apache-spark apache-spark-sql spark-dataframe parquet

Spark SQL: drop Hive table

Jun 02, 2022

apache-spark apache-spark-sql pyspark-sql

Connect sparklyr to a remote Spark connection

Nov 15, 2022

r apache-spark sparklyr
