
New posts in apache-spark

EMR 5.x | Spark on Yarn | Exit code 137 and Java heap space Error

Spark dataframe select rows with at least one null or blank in any column of that row

scala apache-spark

Generic T as Spark Dataset[T] constructor

Spark UDAF with ArrayType as bufferSchema performance issues

How to use AWS Glue / Spark to convert CSVs partitioned and split in S3 to partitioned and split Parquet

How to extract all elements from array of structs?

How to check if key exists in spark sql map type

Spark Dataframe: Select distinct rows

Why does "databricks-connect test" not work after configuring Databricks Connect?

Which Scala version does Spark 2.4.3 use?

apache-spark

having Spark process partitions concurrently, using a single dev/test machine

scala apache-spark

Provider org.apache.spark.sql.avro.AvroFileFormat could not be instantiated

spark delta overwrite a specific partition

apache-spark delta-lake

How to create date from year, month and day in PySpark?

Scala with spark - "javax.servlet.ServletRegistration"'s signer information does not match signer information of other classes in the same package

sql scala apache-spark

How to work with Apache Spark using Intellij Idea?

Hive tables not found when running in YARN-Cluster mode

PySpark RDD collect first 163 Rows

install spark packages in toree

spark collect as Array[T] and not as Array[Row] from data frame