New posts in apache-spark

Project_Bank.csv is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [110, 111, 13, 10]

Is there any way to get the output of Spark's Dataset.show() method as a string?

How to pivot streaming dataset?

UDF causes warning: CachedKafkaConsumer is not running in UninterruptibleThread (KAFKA-1894)

How can I force Spark/Hadoop to ignore the .gz extension on a file and read it as uncompressed plain text?

scala hadoop apache-spark gzip

PySpark equivalent of `df.loc`?

Calling a REST service from Spark

scala apache-spark rest

Does Spark support BigInteger type?

Failed to execute user defined function($anonfun$9: (string) => double) when using StringIndexer for multiple columns

Spark: Prevent shuffle/exchange when joining two identically partitioned dataframes

How to set hive.metastore.warehouse.dir in HiveContext?

Spark SQL grouping: Add to group by or wrap in first() if you don't care which value you get.;

sql group-by apache-spark udf

How to extract rules from a decision tree in Spark MLlib

Custom log4j appender in Spark executor

apache-spark log4j

Uncaught Exception Handling in Spark

Why can I no longer read from AWS S3 in my Spark application?

java amazon-s3 apache-spark

Spark worker node stops automatically

java apache-spark

Resolving "Kryo serialization failed: Buffer overflow" Spark exception

apache-spark kryo

How to compute the distance matrix in Spark?

spark-submit master URL vs. SparkSession master URL in the main class: what is the difference?

apache-spark