I have two data frames like the ones below:
+--------------------+--------+-----------+-------------+
|UniqueFundamentalSet|Taxonomy|FFAction|!||DataPartition|
+--------------------+--------+-----------+-------------+
|192730241374        |1       |I|!|       |Japan        |
|192730241374        |2       |I|!|       |Japan        |
|192730241373        |1       |I|!|       |Japan        |
|192730241373        |2       |I|!|       |Japan        |
+--------------------+--------+-----------+-------------+
+--------------------+--------+-----------+-------------+
|UniqueFundamentalSet|Taxonomy|FFAction|!||DataPartition|
+--------------------+--------+-----------+-------------+
|192730241374        |1       |I|!|       |Japan        |
|192730241374        |2       |I|!|       |Japan        |
|192730391384        |1       |I|!|       |Japan        |
|192730391384        |2       |I|!|       |Japan        |
|192730241373        |1       |I|!|       |Japan        |
|192730241373        |2       |I|!|       |Japan        |
+--------------------+--------+-----------+-------------+
When I perform a union of the two data frames above, I get duplicate rows. Here is my output:
+--------------------+--------+-----------+-------------+
|UniqueFundamentalSet|Taxonomy|FFAction|!||DataPartition|
+--------------------+--------+-----------+-------------+
|192730241374        |1       |I|!|       |Japan        |
|192730241374        |2       |I|!|       |Japan        |
|192730241373        |1       |I|!|       |Japan        |
|192730241373        |2       |I|!|       |Japan        |
|192730241374        |1       |I|!|       |Japan        |
|192730241374        |2       |I|!|       |Japan        |
|192730391384        |1       |I|!|       |Japan        |
|192730391384        |2       |I|!|       |Japan        |
|192730241373        |1       |I|!|       |Japan        |
|192730241373        |2       |I|!|       |Japan        |
+--------------------+--------+-----------+-------------+
val dfToSave = dfMainOutput.union(insertdf)
I was under the impression that union removes duplicate rows and unionAll keeps them. Instead, I have to call distinct after union. Can someone please explain this?
Your impression was wrong. As stated in the official documentation:
Returns a new Dataset containing union of rows in this Dataset and another Dataset.
This is equivalent to UNION ALL in SQL. To do a SQL-style set union (that does deduplication of elements), use this function followed by a distinct.
Also as standard in SQL, this function resolves columns by position (not by name).
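As a rough illustration of the documentation quoted above, here is a minimal sketch that rebuilds the two data frames from the question and compares a plain union with a union followed by distinct. The object name, the local SparkSession settings, and the inline sample rows are assumptions made for the example; only dfMainOutput, insertdf, and dfToSave come from the question.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object; the question does not show its surrounding code.
object UnionDistinctSketch {
  def main(args: Array[String]): Unit = {
    // Assumed local session, for demonstration only.
    val spark = SparkSession.builder()
      .appName("union-distinct-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val columns = Seq("UniqueFundamentalSet", "Taxonomy", "FFAction|!|", "DataPartition")

    // Sample rows copied from the first table in the question.
    val dfMainOutput = Seq(
      ("192730241374", 1, "I|!|", "Japan"),
      ("192730241374", 2, "I|!|", "Japan"),
      ("192730241373", 1, "I|!|", "Japan"),
      ("192730241373", 2, "I|!|", "Japan")
    ).toDF(columns: _*)

    // Sample rows copied from the second table in the question.
    val insertdf = Seq(
      ("192730241374", 1, "I|!|", "Japan"),
      ("192730241374", 2, "I|!|", "Japan"),
      ("192730391384", 1, "I|!|", "Japan"),
      ("192730391384", 2, "I|!|", "Japan"),
      ("192730241373", 1, "I|!|", "Japan"),
      ("192730241373", 2, "I|!|", "Japan")
    ).toDF(columns: _*)

    // union behaves like SQL UNION ALL: every row from both inputs is kept.
    val withDuplicates = dfMainOutput.union(insertdf)
    println(withDuplicates.count()) // 10

    // Adding distinct gives SQL UNION semantics: identical rows collapse to one.
    val dfToSave = dfMainOutput.union(insertdf).distinct()
    println(dfToSave.count()) // 6

    spark.stop()
  }
}
```

Because union matches columns by position, both data frames should list their columns in the same order. If the orders might differ, Dataset.unionByName (available since Spark 2.3) matches columns by name instead.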