suppose I have following sql table
objid firstname lastname active 1 test test 0 2 test test 1 3 test1 test1 1 4 test2 test2 0 5 test2 test2 0 6 test3 test3 1
Now, the result I am interested in is as follows:
objid firstname lastname active 1 test test 0 2 test test 1 4 test2 test2 0 5 test2 test2 0
How can I achieve this? I have tried the following query,
select firstname,lastname from table group by firstname,lastname having count(*) > 1
But this query gives results like
firstname lastname test test test2 test2
To select duplicate values, you need to create groups of rows with the same values and then select the groups with counts greater than one. You can achieve that by using GROUP BY and a HAVING clause.
The pandas. DataFrame. duplicated() method is used to find duplicate rows in a DataFrame. It returns a boolean series which identifies whether a row is duplicate or unique.
You can find duplicates by grouping rows, using the COUNT aggregate function, and specifying a HAVING clause with which to filter rows.
Duplicate rows means, having multiple rows on all columns. Using this method you can get duplicate rows on selected multiple columns or all columns. In this article, I will explain these with several examples.
How to find and highlight duplicate rows in a range in Excel? 1 Find duplicate rows in a range with formula. ... 2 Highlight duplicate rows in a range with Conditional Formatting. ... 3 Select or highlight duplicate rows in a range with Kutools for Excel. ... 4 Demo: Find and highlight duplicate or unique rows in a range with Kutools for Excel. ...
Our DataFrame contains column names Courses, Fee, Duration, and Discount. Yields below output. 2. Select Duplicate Rows Based on All Columns You can use df [df.duplicated ()] without any arguments to get rows with the same values on all columns. It takes defaults values subset=None and keep=‘first’.
You've found your duplicated records but you're interested in getting all the information attached to them. You need to join
your duplicates to your main table to get that information.
select * from my_table a join ( select firstname, lastname from my_table group by firstname, lastname having count(*) > 1 ) b on a.firstname = b.firstname and a.lastname = b.lastname
This is the same as an inner join
and means that for every record in your sub-query, that found the duplicate records you find everything from your main table that has the same firstseen and lastseen combination.
You can also do this with in, though you should test the difference:
select * from my_table a where ( firstname, lastname ) in ( select firstname, lastname from my_table group by firstname, lastname having count(*) > 1 )
SELECT DISTINCT t1.* FROM myTable AS t1 INNER JOIN myTable AS t2 ON t1.firstname = t2.firstname AND t1.lastname = t2.lastname AND t1.objid <> t2.objid
This will output every row which has a duplicate, basing on firstname
and lastname
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With