New posts in apache-spark
Read parquet with binary (proto-buffer) column
Apr 03, 2026
apache-spark
protocol-buffers
parquet
How do you get batches of rows from Spark using pyspark
Apr 01, 2026
python
apache-spark
pyspark
rdd
spark: case sensitive partitionBy column
Apr 02, 2026
apache-spark
hive
apache-spark-sql
SparkSQL - got duplicate rows after join & groupBy
Apr 02, 2026
apache-spark
apache-spark-sql
Splitting an RDD row to different column in Pyspark
Apr 02, 2026
python
apache-spark
pyspark
row
rdd
Can a cpu core run multiple applications concurrently on spark cluster?
Apr 02, 2026
apache-spark
Apache Spark: Streaming without HDFS checkpoint
Apr 02, 2026
apache-spark
hdfs
spark-streaming
Airflow - Unable to import Spark provider - package: name 'client' is not defined
Apr 02, 2026
python
apache-spark
pip
airflow
How to pass spark parameter to a dataproc workflow template?
Apr 01, 2026
apache-spark
google-cloud-platform
pyspark
google-cloud-dataproc
Submit a spark job from Airflow to external spark container
Mar 31, 2026
docker
apache-spark
airflow
Turn multiple rows of events with timestamps in a dataframe to single row with start and end datetime
Apr 02, 2026
python
apache-spark
pyspark
Spark Datasets available in Python?
Mar 31, 2026
apache-spark
pyspark
spark scala long converts to timestamp with milliseconds in parquet dataframe
Mar 30, 2026
scala
date
apache-spark
unix-timestamp
how to add a jar to python notebook on bluemix spark?
Apr 01, 2026
python
apache-spark
ipython
ibm-cloud
jupyter-notebook
Splitting row in multiple row in spark-shell
Apr 01, 2026
scala
apache-spark
dataframe
apache-spark-sql