I havce two data frames : df1 and df2
df1
|--- id---|---value---|
|    1    |    23     |
|    2    |    23     |
|    3    |    23     |
|    2    |    25     |
|    5    |    25     |
df2
|-idValue-|---count---|
|    1    |    33     |
|    2    |    23     |
|    3    |    34     |
|    13   |    34     |
|    23   |    34     |
How do I get this ?
|--- id--------|---value---|---count---|
|    1         |    23     |    33     |
|    2         |    23     |    23     |
|    3         |    23     |    34     |
|    2         |    25     |    23     |
|    5         |    25     |    null   |
I am doing :
 val groupedData =  df1.join(df2, $"id" === $"idValue", "outer") 
But I don't see the last column in the groupedData. Is this correct way of doing ? Or Am I doing any thing wrong ?
From your expected output, you need LEFT OUTER JOIN.
val groupedData =  df1.join(df2, $"id" === $"idValue", "left_outer").
       select(df1("id"), df1("count"), df2("count")).
       take(10).foreach(println)
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With