
How to add a unique id column to a DataFrame, Apache Spark, Scala

I have a DataFrame that I want to join with another DataFrame and then group by the original rows, but the original rows do not have a unique id. How can I add a unique id, or otherwise accomplish that goal?

qonf asked Nov 16 '25 01:11



1 Answer

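You can use monotonically_increasing_id from org.apache.spark.sql.functions. The generated ids are unique and monotonically increasing, but they are not consecutive, so treat them as row labels rather than as a row count or index.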

import org.apache.spark.sql.functions._
// Append a 64-bit id column; values are unique per row but not consecutive.
val unique_df = original_df.withColumn("UniqueID", monotonically_increasing_id())
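
To connect this to the join-then-group goal in the question, here is a minimal sketch rather than a definitive implementation. The column names (key, value, score), the sample rows, the left join, and the sum aggregation are all assumptions made for illustration; none of them come from the question. The idea is to tag each original row first, join, and then group on the id so that even identical original rows stay separate.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{monotonically_increasing_id, sum}

object UniqueIdJoinSketch {
  // Tags every original row with an id, joins on the (assumed) "key" column,
  // then aggregates back down to one output row per original row.
  def totalsPerOriginalRow(originalDf: DataFrame, otherDf: DataFrame): DataFrame = {
    val uniqueDf = originalDf.withColumn("UniqueID", monotonically_increasing_id())
    uniqueDf
      .join(otherDf, Seq("key"), "left")       // a left join keeps unmatched original rows
      .groupBy("UniqueID", "key", "value")     // the id keeps duplicate rows apart
      .agg(sum("score").as("total_score"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("unique-id-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical data: the first two original rows are identical on purpose.
    val originalDf = Seq(("a", 1), ("a", 1), ("b", 2)).toDF("key", "value")
    val otherDf = Seq(("a", 10), ("a", 20), ("b", 30)).toDF("key", "score")

    totalsPerOriginalRow(originalDf, otherDf).show()
    spark.stop()
  }
}
```

Grouping on key and value alone would merge the two identical "a" rows into one group; including UniqueID in the grouping is what preserves one result per original row. Because the ids depend on how the data is partitioned, it may be worth caching or persisting the tagged DataFrame if it is reused across several actions.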
Tawkir answered Nov 17 '25 20:11



