New posts in apache-spark
Removing empty strings from maps in scala
Mar 07, 2022
scala
apache-spark
idea sbt java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
Apr 02, 2022
scala
apache-spark
sbt
How to construct Dataframe from an Excel (xls,xlsx) file in Scala Spark?
Oct 20, 2022
excel
scala
apache-spark
apache-spark-sql
spark-excel
"Bad substitution" when submitting spark job to yarn-cluster
Sep 19, 2022
apache-spark
hadoop-yarn
PySpark: when function with multiple outputs [duplicate]
Sep 11, 2022
python
apache-spark
pyspark
pyspark-sql
Convert pyspark.sql.dataframe.DataFrame type Dataframe to Dictionary
Aug 25, 2022
python
dictionary
apache-spark
pyspark
Spark LDA consumes too much memory
Apr 02, 2021
apache-spark
apache-spark-mllib
lda
apache spark "Py4JError: Answer from Java side is empty"
Nov 15, 2021
apache-spark
SparkUI for pyspark - corresponding line of code for each stage?
Sep 19, 2022
apache-spark
pyspark
emr
How to read/write protocol buffer messages with Apache Spark?
Sep 07, 2022
apache-spark
hdfs
protocol-buffers
sequencefile
In Apache Spark, how to convert a slow RDD/dataset into a stream?
Sep 19, 2022
scala
apache-spark
apache-spark-sql
spark-streaming
What is happening when Spark is calling ShuffleBlockFetcherIterator?
Sep 14, 2022
apache-spark
apache-spark-sql
spark parquet write gets slow as partitions grow
Sep 14, 2022
apache-spark
partitioning
parquet
Unable to understand error "SparkListenerBus has already stopped! Dropping event ..."
May 26, 2021
apache-spark
How are number of iterations and number of partitions related in Apache Spark Word2Vec?
Aug 19, 2021
apache-spark
apache-spark-mllib
word2vec
Spark: Difference between collect(), take() and show() outputs after conversion toDF
Sep 19, 2022
scala
apache-spark
dataframe
collect
take
Spark: Most efficient way to sort and partition data to be written as parquet
Nov 17, 2022
apache-spark
pyspark
apache-spark-sql
pyspark-sql
Why increase spark.yarn.executor.memoryOverhead?
Aug 17, 2022
apache-spark
hadoop-yarn