
Compare 2 files and keep entries that are not common to both

Tags:

talend

I have a CSV file and a database table as inputs. I need to keep all the records that are in the database table but exclude those that are in the CSV file. It all gets written to a CSV file in the end.

All my attempts have produced hundreds of duplicate rows, so I'm stuck.

Ree asked Sep 07 '25 10:09


1 Answer

This is a quick-and-dirty trick, but it should fit your needs.

Connect your database and your CSV input to a tMap. Use your DB as the main connection and the CSV as the lookup. Something like:

tMySqlInput---->Main--->tMap---out---->tFileDelimitedOutput
                         ^
                         |
                         |
                       Lookup
                         |
                         |
                  tFileDelimitedInput

Inside the tMap, map the join columns as usual. Click the wrench icon just above the lookup mapping area and set the join type ("Join Model") to Inner Join. Make sure "Match Model" is not set to All matches, to avoid cross joins. Then create the "out" output connection. Finally, click the wrench icon on that output and set "Catch lookup inner join reject" to true.

On "out" (that is, the output file) you will get only the DB rows that have no match in the CSV. Depending on your data, you may need to add a tAggregateRow just before tFileDelimitedOutput to remove any remaining duplicates.
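
If it helps to see what the job is meant to compute, here is a minimal Python sketch of the same logic outside Talend. It is not Talend code and not what tMap generates; the function name `inner_join_rejects`, the `id` join column, and the sample rows are all made up for illustration. It keeps the main (DB) rows whose key is missing from the lookup (CSV) side, then drops exact duplicate rows, which is roughly the role the optional tAggregateRow plays.

```python
def inner_join_rejects(db_rows, csv_rows, key):
    """Return DB rows whose key has no match in the CSV rows, without exact duplicates."""
    # Lookup side: collect the join keys present in the CSV.
    csv_keys = {row[key] for row in csv_rows}

    seen = set()      # fingerprints of rows already emitted
    rejects = []
    for row in db_rows:                # main side: iterate the DB rows
        if row[key] in csv_keys:
            continue                   # matched by the inner join, so not a reject
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            continue                   # exact duplicate of a row already kept
        seen.add(fingerprint)
        rejects.append(row)
    return rejects


# Hypothetical sample data; "id" is an assumed join column.
db_rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 3, "name": "carol"},
    {"id": 3, "name": "carol"},   # duplicate row in the DB
]
csv_rows = [{"id": 2, "name": "bob"}]

result = inner_join_rejects(db_rows, csv_rows, "id")
print(result)
```

Comparing the job's output file against a small reference like this on a sample of the data can help tell whether duplicates come from the join settings or were already in the DB table.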

Gabriele B answered Sep 10 '25 07:09