
New posts in apache-spark

org.apache.spark.sql.AnalysisException: cannot resolve given input columns

How do I increase decimal precision in Spark?

Spark Mongodb Connector Scala - Missing database name

VectorAssembler in PySpark is creating a tuple of multiple vectors instead of a single vector; how to solve the issue? [duplicate]

UDF with multiple rows as response pySpark

apache-spark pyspark

Custom Evaluator in PySpark

Check if table exists in hive metastore using Pyspark

How does Apache Spark handle system failure when deployed in YARN?

Apache Spark or Cascading framework? [closed]

java apache-spark cascading

How to get past "requires authentication" while connecting to remote Cassandra cluster using SparkConf?

Functions from Python packages for udf() of Spark dataframe

python apache-spark pyspark

Spark JSON text field to RDD

Spark: scala.MatchError (of class org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema)

Getting NullPointerException using spark-csv with DataFrames

Does a flatMap in spark cause a shuffle?

scala apache-spark bigdata

How to use Spark's repartitionAndSortWithinPartitions?

scala apache-spark

Select array element from Spark Dataframes split method in same call?

Running yarn with spark not working with Java 8

How to read in-memory JSON string into Spark DataFrame

Why is the number of partitions after groupBy 200? Why is this 200 not some other number?

apache-spark