
Spark dataframe add a row for every existing row

I have a dataframe with the following columns:

groupid,unit,height
----------------------
1,in,55
2,in,54

I want to create another dataframe with additional rows where unit=cm and height=height*2.54.

Resulting dataframe:

groupid,unit,height
----------------------
1,in,55
2,in,54
1,cm,139.7
2,cm,137.16

I'm not sure how to use a Spark UDF and explode here. Any help is appreciated. Thanks in advance.

asked Jul 10 '17 03:07 by dreddy




1 Answer

You can create a second dataframe with the required changes using withColumn, and then union the two dataframes:

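// Assumes a sqlContext is already in scope (for example in spark-shell)
import sqlContext.implicits._
import org.apache.spark.sql.functions._

val df = Seq(
  (1, "in", 55),
  (2, "in", 54)
).toDF("groupid", "unit", "height")

// Copy of df with the unit replaced by "cm" and the height converted from inches
val df2 = df.withColumn("unit", lit("cm")).withColumn("height", col("height")*2.54)

// Append the converted rows to the original rows (columns are matched by position)
df.union(df2).show(false)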

You should get:

+-------+----+------+
|groupid|unit|height|
+-------+----+------+
|1      |in  |55.0  |
|2      |in  |54.0  |
|1      |cm  |139.7 |
|2      |cm  |137.16|
+-------+----+------+
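
Since the question mentions explode, the same result can also be sketched without a union or a UDF. This is only an illustration, not part of the original answer: it assumes Spark 2.x or later, and the local SparkSession, the app name, and the intermediate column name "m" are made up for the example. Each row gets an array holding two (unit, height) structs, one for inches and one for centimetres, and explode turns that array into two rows.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("explode-sketch")
  .getOrCreate()
import spark.implicits._

val df = Seq(
  (1, "in", 55),
  (2, "in", 54)
).toDF("groupid", "unit", "height")

// Build an array of two structs per row: the original measurement
// (height cast to double so both structs have the same type) and
// the converted one. explode then emits one output row per struct.
val exploded = df
  .withColumn("m", explode(array(
    struct(col("unit").as("unit"), col("height").cast("double").as("height")),
    struct(lit("cm").as("unit"), (col("height") * 2.54).as("height"))
  )))
  .select(col("groupid"), col("m.unit").as("unit"), col("m.height").as("height"))

exploded.show(false)
```

Unlike the union version, the "in" and "cm" rows for each groupid are produced from the same input row, so they usually appear next to each other instead of all "in" rows coming first. Add an orderBy if a specific row order matters.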

answered Sep 28 '22 08:09 by Ramesh Maharjan