apache-spark tutorials and guides

Partitioning by multiple columns in Spark SQL

Sep 22, 2022

apache-spark apache-spark-sql window-functions

AttributeError: 'SparkContext' object has no attribute 'createDataFrame' using Spark 1.6

Oct 21, 2020

python hadoop apache-spark

Spark Dataframe Nested Case When Statement

Nov 12, 2022

sql apache-spark dataframe apache-spark-sql

Spark: Programmatically creating dataframe schema in scala

Mar 13, 2022

scala apache-spark schema dataframe

How to get the correlation matrix of a pyspark data frame?

Jul 16, 2022

apache-spark pyspark

Spark - Scala: shuffle RDD / split RDD into two random parts

Feb 22, 2022

scala apache-spark rdd

Spark streaming custom metrics

May 25, 2022

java apache-spark jmx spark-streaming codahale-metrics

Reading csv files in zeppelin using spark-csv

Oct 26, 2022

apache-spark apache-zeppelin

Check Type: How to check if something is a RDD or a DataFrame?

Nov 07, 2019

python apache-spark dataframe apache-spark-sql rdd

How to fix spark-shell on Windows (fails with "was unexpected at this time")? [closed]

Sep 26, 2022

apache-spark

No module named 'resource' installing Apache Spark on Windows

Oct 16, 2022

python windows apache-spark

How to check if a string column in a PySpark dataframe is all numeric

Mar 06, 2022

python apache-spark pyspark apache-spark-sql numeric

Spark: How to save a dataframe with headers?

Aug 31, 2022

java apache-spark

How to convert a table into a Spark Dataframe

Apr 09, 2022

apache-spark pyspark apache-spark-sql spark-dataframe

java.lang.NoClassDefFoundError: org/apache/spark/Logging

Apr 01, 2022

java maven apache-spark cassandra spark-cassandra-connector

TaskSchedulerImpl: Initial job has not accepted any resources;

Dec 29, 2018

java apache-spark cassandra datastax

ERROR yarn.ApplicationMaster: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after 100000 milliseconds [duplicate]

Sep 29, 2021

apache-spark akka apache-spark-sql

Count number of words in a spark dataframe

Oct 23, 2022

python apache-spark pyspark apache-spark-sql

Spark 2: how does it work when SparkSession enableHiveSupport() is invoked

Nov 18, 2022

apache-spark hive apache-spark-sql hiveql

Mock a Spark RDD in the unit tests

Mar 18, 2022

scala unit-testing mocking apache-spark scalatest
