New posts in apache-spark

Custom log4j appender in spark executor

apache-spark log4j

Uncaught Exception Handling in Spark

Why can I not read from the AWS S3 in Spark application anymore?

java amazon-s3 apache-spark

Spark Worker node stops automatically

java apache-spark

Resolving "Kryo serialization failed: Buffer overflow" Spark exception

apache-spark kryo

How to compute the distance matrix in spark?

Spark-submit master URL and SparkSession master URL in the main class: what is the difference?

apache-spark

null value and countDistinct with spark dataframe

How does Apache Spark send functions to other machines under the hood

spark on yarn, Connecting to ResourceManager at /0.0.0.0:8032

How to set up Spark with a multi-node Cassandra cluster?

How to stop spark structured streaming from listing all files in an S3 bucket every time

apache-spark amazon-s3

Spark job reading from S3 on Spark cluster gives IllegalAccessError: tried to access method MutableCounterLong [duplicate]

Is there a way to dynamically stop Spark Structured Streaming?

How to write TIMESTAMP logical type (INT96) to parquet, using ParquetWriter?

Spark Truncated Spark Plan

Spark createDataFrame(df.rdd, df.schema) vs checkPoint for breaking lineage

What is the difference between Driver and Application manager in spark

spark <console>:12: error: not found: value sc

Why are aggregate and fold two different APIs in Spark?