New posts in apache-spark
get multiple columns within a map: rdd
Feb 25, 2026
scala
apache-spark
rdd
Python Spark How to find cumulative sum by group using RDD API
Feb 25, 2026
python
apache-spark
pyspark
rdd
Creating a new scala class that relies on GraphFrames without serialization issues
Feb 24, 2026
scala
apache-spark
apache-spark-sql
Spark OutOfMemoryError
Feb 24, 2026
apache-spark
Spark partition by key [duplicate]
Feb 24, 2026
apache-spark
rdd
partitioning
How to find position of substring column in another column using PySpark?
Feb 24, 2026
apache-spark
pyspark
apache-spark-sql
Spark Scala scala.util.control.Exception catching and dropping None in map
Feb 24, 2026
scala
exception
apache-spark
rdd
Can Spark in Foundry use Partition Pruning
Feb 23, 2026
apache-spark
palantir-foundry
Is this a suitable way to implement a lazy `take` on RDD?
Feb 23, 2026
scala
apache-spark
How to List Iceberg Tables in a Catalog
Feb 23, 2026
apache-spark
aws-glue
apache-iceberg
Google Cloud Dataproc Serverless (batch): PySpark reads Parquet file from Google Cloud Storage (GCS) very slowly
Feb 22, 2026
apache-spark
google-cloud-platform
google-cloud-storage
google-cloud-dataproc
google-cloud-dataproc-serverless
Avoid shuffling when inserting into sorted iceberg table
Feb 23, 2026
scala
apache-spark
apache-iceberg
Spark 2.0 Scala - Read csv files with escaped delimiters
Feb 23, 2026
csv
apache-spark