apache-spark tutorials and guides

Is there a way to limit string length in a Spark DataFrame type?

Sep 08, 2025

dataframe apache-spark

How to extract the column name and data type from a nested struct type in Spark

Sep 08, 2025

scala apache-spark

Addressing categorical features with one hot encoding and vector assembler vs vector indexer

Sep 08, 2025

scala apache-spark machine-learning categorical-data apache-spark-ml

How to read a text file line by line using Scala (Spark), split on a delimiter, and store the values in their respective columns? [duplicate]

Sep 08, 2025

scala apache-spark

Spark shuffle read takes significant time for small data

Sep 08, 2025

scala apache-spark shuffle

AnalysisException: u'Cannot resolve column name

Sep 08, 2025

apache-spark pyspark apache-spark-sql

pyspark - Error while loading .csv file from url to Spark

Sep 08, 2025

python apache-spark pyspark py4j

What's the difference between Apache Ignite and Tachyon

Sep 07, 2025

apache-spark ignite alluxio

Spark Structured Streaming: not writing correctly

Sep 05, 2025

python apache-spark spark-structured-streaming

VSCode Extension Databricks-Connect - Use SparkSession

Sep 08, 2025

apache-spark visual-studio-code databricks databricks-connect databricks-vscode-extension

How to access global temp view in another pyspark application?

Sep 08, 2025

apache-spark pyspark apache-spark-sql

Sum vector columns in Spark

Sep 07, 2025

scala apache-spark vector

How to calculate a Directory size in ADLS using PySpark?

Sep 08, 2025

python apache-spark pyspark databricks azure-databricks

Create array containing first element of each struct in an array in a Spark dataframe field

Sep 06, 2025

apache-spark pyspark apache-spark-sql

Spark - How to add a StructField at the beginning of a StructType in scala

Sep 07, 2025

scala apache-spark

Error while saving data to elasticsearch from spark - saveToEs

Sep 07, 2025

elasticsearch apache-spark

Usage of spark._jsparkSession.catalog().tableExists() in pyspark

Sep 07, 2025

apache-spark pyspark delta-lake hive-metastore

PySpark: remove a field in a struct column

Sep 07, 2025

dataframe apache-spark pyspark apache-spark-sql databricks

PySpark equivalent of adding a constant array to a dataframe as column

Sep 07, 2025

arrays dataframe apache-spark pyspark runtimeexception
