New posts in apache-spark
How does the Apache Spark scheduler split files into tasks?
May 25, 2022
apache-spark
bigdata
How to let Spark serialize an object using Kryo?
Sep 14, 2022
serialization
apache-spark
kryo
Spark job failing when calling first() in PySpark
Oct 16, 2022
java
python
apache-spark
pyspark
Apache Spark ALS recommendations approach
Apr 03, 2020
apache-spark
machine-learning
bigdata
recommendation-engine
apache-spark-mllib
In Apache Spark SQL, How to close metastore connection from HiveContext
Oct 17, 2022
apache-spark
thrift
apache-spark-sql
apache-spark-1.4
must build Spark with Hive (spark 1.5.0)
Jun 05, 2022
python
maven
apache-spark
hive
spark-dataframe
Spark partitionBy much slower than without it
Sep 15, 2022
scala
apache-spark
apache-spark-sql
parquet
Combining PyCharm, Spark and Jupyter
Sep 05, 2022
apache-spark
pycharm
pyspark
jupyter
How to enable streaming from Cassandra to Spark?
Oct 31, 2022
apache-spark
cassandra
pyspark
spark-streaming
datastax
pySpark: Save ML Model
Nov 24, 2017
apache-spark
machine-learning
pyspark
Spark Job submitted - Waiting (TaskSchedulerImpl : Initial job not accepted)
Feb 05, 2022
api
apache-spark
amazon-ec2
Spark performance tuning - number of executors vs number of cores
Aug 30, 2022
apache-spark
spark-streaming
Spark Dataframe Maximum Column Count
Apr 02, 2022
apache-spark
pyspark
apache-spark-sql
Run Spark-shell with error: SparkContext: Error initializing SparkContext
Nov 10, 2019
hadoop
apache-spark
hdfs
Spark num-executors
Nov 20, 2022
apache-spark
hadoop-yarn
hortonworks-data-platform
Spark SQL: INSERT INTO statement syntax
Sep 19, 2022
apache-spark
apache-spark-sql
Cannot create temp dir with proper permission: /mnt1/s3
Jan 07, 2021
amazon-web-services
apache-spark
amazon-s3
amazon-emr
Pyspark 1.6 - Aliasing columns after pivoting with multiple aggregates
Nov 14, 2022
python-2.7
apache-spark
pivot
pyspark
pyspark-sql
Apache Spark read file as a stream from HDFS
Nov 14, 2022
java
apache-spark
hdfs
"GC overhead limit exceeded" on cache of large dataset into spark memory (via sparklyr & RStudio)
Jan 21, 2021
r
apache-spark
cassandra
sparklyr