New posts in apache-spark-2.0

Livy Server: return a dataframe as JSON?

SparkSession initialization error - Unable to use spark.read

Reading Avro messages from Kafka with Spark 2.0.2 (structured streaming)

Spark 2.0.0 Error: PartitioningCollection requires all of its partitionings have the same numPartitions

Avoid starting HiveThriftServer2 with created context programmatically

Pass system property to spark-submit and read file from classpath or custom path

How to convert RDD of dense vector into DataFrame in pyspark?

Apache Spark vs Apache Spark 2 [closed]

Why does using cache on streaming Datasets fail with "AnalysisException: Queries with streaming sources must be executed with writeStream.start()"?

Spark fails to start in local mode when disconnected [Possible bug in handling IPv6 in Spark??]

Timeout exception in Apache Spark during program execution

Dynamically bind variable/parameter in Spark SQL?

Spark off-heap memory config and Tungsten

Spark 2.0 Dataset vs DataFrame

How to create SparkSession from existing SparkContext

Reading csv files with quoted fields containing embedded commas

Spark Parquet partitioning: large number of files

What are the various join types in Spark?