
New posts in apache-spark

How to set up the SPARK_HOME variable?

How to load CSV file with records on multiple lines?

Creating a simple 1-row Spark DataFrame with Java API

How to use the LEFT and RIGHT keywords in Spark SQL

Filtering rows with empty arrays in PySpark

Spark read s3 using sc.textFile("s3a://bucket/filePath"). java.lang.NoSuchMethodError: com.amazonaws.services.s3.transfer.TransferManager

apache-spark amazon-s3

DataFrame column names conflict with . (dot)

How to make it easier to deploy my Jar to Spark Cluster in standalone mode?

jar apache-spark

Spark: How to use mapPartitions and create/close a connection per partition

scala apache-spark rdd

Why does conf.set("spark.app.name", appName) not set the name in the UI?

apache-spark

spark - scala: not a member of org.apache.spark.sql.Row

calculating percentages on a pyspark dataframe

SparkSQL and explode on DataFrame in Java

Pyspark dataframe how to drop rows with nulls in all columns?

Spark Select with a List of Columns Scala

scala apache-spark

How to overwrite Spark ML model in PySpark?

Pyspark AWS credentials

How to get nth row of Spark RDD?

hadoop apache-spark rdd

Removing punctuation marks from text in Scala - Spark

Add a new column to a DataFrame; I want the new column to be a UUID generator