New posts in apache-spark
SPARK : failure: ``union'' expected but `(' found
Jun 24, 2021
sql
scala
apache-spark
dataframe
apache-spark-sql
How to convert a JSON file to parquet using Apache Spark?
Oct 21, 2022
json
apache-spark
apache-spark-sql
parquet
Spark CrossValidatorModel access other models than the bestModel?
Apr 03, 2022
apache-spark
apache-spark-mllib
cross-validation
apache-spark-1.6
Emit multiple pairs in map operation
Dec 21, 2019
apache-spark
pyspark
Which is more efficient: DataFrame, RDD, or HiveQL?
Aug 24, 2022
apache-spark
apache-spark-sql
spark-dataframe
Error ExecutorLostFailure when running a task in Spark
Aug 28, 2022
apache-spark
pyspark
apache-spark-mllib
collect
Spark Scala Understanding reduceByKey(_ + _)
Oct 14, 2022
scala
apache-spark
word-count
bigdata
Spark Standalone Number Executors/Cores Control
Nov 10, 2022
apache-spark
apache-spark-standalone
Missing SPARK_HOME when using SparkLauncher on AWS EMR cluster
Aug 12, 2017
amazon-web-services
apache-spark
pyspark
emr
amazon-emr
Scalatest and Spark giving "java.io.NotSerializableException: org.scalatest.Assertions$AssertionsHelper"
Mar 09, 2021
scala
apache-spark
serialization
rdd
scalatest
How to skip lines while reading a CSV file as a DataFrame using PySpark?
Apr 23, 2022
apache-spark
pyspark
spark-dataframe
pyspark-sql
How to process a range of HBase rows using Spark?
Apr 01, 2022
java
hadoop
bigdata
apache-spark
How to process multi-line input records in Spark
Nov 08, 2022
scala
apache-spark
Hive doesn't read partitioned parquet files generated by Spark
Aug 21, 2022
apache-spark
hive
partitioning
partition
parquet
Kafka Producer - org.apache.kafka.common.serialization.StringSerializer could not be found
Sep 15, 2022
apache-spark
apache-kafka
apache-karaf
spark-streaming-kafka
GraphX Visualization
Apr 19, 2022
apache-spark
visualization
spark-graphx
Reading a JSON file in PySpark
Oct 21, 2022
apache-spark
pyspark
spark-streaming
How can I add a timestamp as an extra column to my DataFrame?
Nov 10, 2022
apache-spark
spark-dataframe
immutability
rdd
Saving contents of df.show() as a string in spark-scala app
Jan 07, 2020
scala
apache-spark
log4j
If DataFrames in Spark are immutable, why are we able to modify them with operations such as withColumn()?
Nov 04, 2022
apache-spark
pyspark