New posts in apache-spark
Apache Spark: ERROR local class incompatible when initiating a SparkContext class
Apr 13, 2020
java
scala
apache-spark
version
Saving / exporting transformed DataFrame back to JDBC / MySQL
Apr 11, 2022
apache-spark
apache-spark-sql
apache-spark-1.5
Basic linear algebra on spark matrices
Jun 18, 2022
python
matrix
apache-spark
Connecting/Integrating Cassandra with Spark (pyspark)
Oct 14, 2021
cassandra
apache-spark
pyspark
How to know when to repartition/coalesce an RDD with unbalanced partitions (possibly without shuffling)?
May 19, 2022
apache-spark
Error from python worker: /bin/python: No module named pyspark
Mar 11, 2022
python
apache-spark
ipython
ipython-notebook
pyspark
Spark - Difference between sortBy and sortByKey
Jun 09, 2022
apache-spark
Connecting IPython notebook to spark master running in different machines
Mar 02, 2021
apache-spark
ipython
kubernetes
google-kubernetes-engine
google-cloud-dataproc
Spark - How can I get the Logical / Physical Query execution using Thrift - Hive Interactor
Jan 30, 2022
apache-spark
apache-spark-sql
spark-dataframe
Spark DataFrame not respecting schema and considering everything as String
Jul 12, 2020
scala
apache-spark
apache-spark-sql
apache-spark-mllib
scala-collections
Spark: Is there any rule of thumb about the optimal number of partitions of an RDD and its number of elements?
Oct 01, 2022
apache-spark
apache-spark-sql
partitioning
Spark SQL top n per group
Apr 22, 2022
apache-spark
group-by
apache-spark-sql
top-n
org.apache.thrift.transport.TTransportException error while reading a large JSON file in Zeppelin (Scala)
Aug 18, 2021
json
scala
apache-spark
apache-zeppelin
How to split column of vectors into two columns?
Mar 25, 2022
apache-spark
pyspark
apache-spark-ml
Running steps of EMR in parallel
Oct 15, 2022
web-services
amazon-web-services
apache-spark
amazon-emr
How Spark handles data larger than cluster memory
Mar 08, 2022
apache-spark
Dropping nested column of Dataframe with PySpark
Jul 13, 2022
apache-spark
dataframe
pyspark
struct
schema
Best practice to create SparkSession object in Scala to use both in unittest and spark-submit
Aug 31, 2022
scala
apache-spark
spark-submit
Add months to date column in Spark dataframe
Nov 06, 2022
python
apache-spark
pyspark
apache-spark-sql
What does "pre-built for Apache Hadoop 2.7 and later" mean?
Oct 29, 2022
apache-spark