New posts in apache-spark

Spark structured streaming kafka convert JSON without schema (infer schema)

Class com.hadoop.compression.lzo.LzoCodec not found for Spark on CDH 5?

Specifying an external configuration file for Apache Spark

PySpark 1.5: How to truncate a timestamp to the nearest minute from seconds

Spark 1.6: Failed to locate the winutils binary in the hadoop binary path

Spark - Random Number Generation

"Could not bind on a random free port" error while trying to connect to Spark master

EntityTooLarge error when uploading a 5G file to Amazon S3

How to get ID of a map task in Spark?

pyspark matrix with dummy variables

Spark column string replace when present in other column (row)

Converting a Spark DataFrame to a Scala Map collection

How to change the column type from String to Date in DataFrames?

Remove rows from dataframe based on condition in pyspark

Matrix Transpose on RowMatrix in Spark

PySpark computing correlation

How to update column based on a condition (a value in a group)?

AuthorizationException: User not allowed to impersonate User

How to CROSS JOIN 2 DataFrames?

Installing Apache Spark on Ubuntu 14.04