How merge 3 DataFrame in Spark-Scala? I completly don't have any Idea how I can make this. On stackOverFlow I can't found similar example.
I have 3 similar DataFrames. The same name of Column, and the same number of them. Difference is only a value on rows.
+----+------+----+---+
|type| Model|Name|ID |
+----+------+----+---+
| 1 |wdasd |xyzd|111|
| 1 |wd |zdfd|112|
| 1 |bdp |2gfs|113|
+----+------+----+---+
+----+------+----+---+
|type| Model|Name|ID |
+----+------+----+---+
| 2 |wdasd |xyzd|221|
| 2 |wd |zdfd|222|
| 2 |bdp |2gfs|223|
+----+------+----+---+
+----+------+----+---+
|type| Model|Name|ID |
+----+------+----+---+
| 3 |AAAA |N_AM|331|
| 3 |BBBB |NA_M|332|
| 3 |CCCC |MA_N|333|
+----+------+----+---+
And I want to this type of DataFrame
+----+------+----+---+
|type| Model|Name|ID |
+----+------+----+---+
| 1 |wdasd |xyzd|111|
| 1 |wd |zdfd|112|
| 1 |bdp |2gfs|113|
| 2 |wdasd |xyzd|221|
| 2 |wd |zdfd|222|
| 2 |bdp |2gfs|223|
| 3 |AAAA |N_AM|331|
| 3 |BBBB |NA_M|332|
| 3 |CCCC |MA_N|333|
+----+------+----+---+
Spark provides a union and unionAll. Looks like they are deprecating the unionAll function so I would use the union function as below:
dataFrame1.union(dataFrame2).union(dataFrame3)
Note that in order to union data frames the data frames must have the exact same column names in the exact same order.
See the spark docs here
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With