New posts in apache-spark-dataset

How to unpack multiple keys in a Spark DataSet

How to use approxQuantile by group?

Scala Spark: how to use a Dataset for a case class when the schema uses snake_case?

Spark StringIndexer.fit is very slow on large records

Can Spark read data directly into a nested case class?

Should cache and checkpoint be used together on DataSets? If so, how does this work under the hood?

How to drop malformed rows while reading csv with schema Spark?

Convert scala list to DataFrame or DataSet

When to use Spark DataFrame/Dataset API and when to use plain RDD?

Spark 2.0 implicit encoder, deal with missing column when type is Option[Seq[String]] (scala)

What is the difference between Spark DataSet and RDD

Spark 2 Dataset Null value exception

Create Spark Dataset from a CSV file

How to lower the case of column names of a data frame but not its values?

How to convert a Spark Dataset of Row into strings?

Why do columns change to nullable in Apache Spark SQL?

How to read a ".gz" compressed file using a Spark DataFrame or Dataset?