Is there a way to get the dataframe that union dataframe in loop?
This is a sample code:
var fruits = List(
"apple"
,"orange"
,"melon"
)
for (x <- fruits){
var df = Seq(("aaa","bbb",x)).toDF("aCol","bCol","name")
}
I would want to obtain some like this:
aCol | bCol | fruitsName
aaa,bbb,apple
aaa,bbb,orange
aaa,bbb,melon
Thanks again
If you add "ID" into your window w as another partitionBy argument, you do not need to do the for loop and union at all. Just subset the dataframe into the ids you want test_df = test_df. where(col("ID"). isin(series_list)) and you are good to go.
UNION and UNION ALL return the rows that are found in either relation. UNION (alternatively, UNION DISTINCT ) takes only distinct rows while UNION ALL does not remove duplicates from the result rows.
You could created a sequence of DataFrame
s and then use reduce
:
val results = fruits.
map(fruit => Seq(("aaa", "bbb", fruit)).toDF("aCol","bCol","name")).
reduce(_.union(_))
results.show()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With