I did an algorithm and I got a lot of columns with the name logic and number suffix, I need to do coalesce
but I don't know how to apply coalesce
with different amount of columns.
Example:
|id|logic_01|logic_02|logic_03|
|1 | null |a |null |
|2 | null |b |null |
|3 | c | null |null |
|4 | null | null |d |
Response:
|id|logic|
|1 | a |
|2 | b |
|3 | c |
|4 | d |
Another example:
|id|logic_01|logic_02|logic_03|logic_04|
|1 | null |a |null |null |
|2 | null | null |null |b |
|3 | c | null |null |null |
|4 | null | null |d |null |
Response:
|id|logic|
|1 | a |
|2 | b |
|3 | c |
|4 | d |
Thanks for your help.
First find all columns that you want to use in the coalesce
:
val cols = df.columns.filter(_.startsWith("logic")).map(col(_))
Then perform the actual coalesce
:
df.select($"id", coalesce(cols: _*).as("logic"))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With