Finding Duplicate Orders (by time proximity)

Tags:

sql-server

I have a table of orders that I know contains duplicates:

    customer   order_number   order_date
    --------   ------------   -------------------
           1              1   2012-03-01 01:58:00
           1              2   2012-03-01 02:01:00
           1              3   2012-03-01 02:03:00
           2              4   2012-03-01 02:15:00
           3              5   2012-03-01 02:18:00
           3              6   2012-03-01 04:30:00
           4              7   2012-03-01 04:35:00
           5              8   2012-03-01 04:38:00
           6              9   2012-03-01 04:58:00
           6             10   2012-03-01 04:59:00

I want to find all duplicates (orders by the same customer within 60 minutes of each other). I'd accept either a result set consisting of the 'duplicate' rows or a set of all customers with a count of how many duplicates each has.

Here is what I have tried:

SELECT
   customer,
   count(*)
FROM
   orders
GROUP BY
   customer,
   DATEPART(HOUR, order_date)
HAVING (count(*) > 1)

This doesn't work when duplicates are within 60 minutes of each other but fall in different clock hours, e.g. 1:58 and 2:02.

I've also tried this:

SELECT
  o1.customer,
  o1.order_number,
  o2.order_number,
  DATEDIFF(MINUTE,o1.order_date, o2.order_date) AS [diff]
FROM
  orders o1 LEFT OUTER JOIN
  orders o2 ON o1.customer = o2.customer AND o1.order_number <> o2.order_number
WHERE
  ABS(DATEDIFF(MINUTE,o1.order_date, o2.order_date)) < 60

Now this gives me all of the duplicates, but it also gives me multiple rows per duplicate order, i.e. (o1, o2) and (o2, o1), which wouldn't be so bad if there weren't some orders with multiple duplicates. In those cases I get (o1, o2), (o1, o3), (o2, o1), (o2, o3), (o3, o1), (o3, o2), etc. I get all of the permutations.

Anyone have some insight? I'm not necessarily looking for the best performing answer here, just one that works.

asked Mar 02 '12 15:03

Ben English

1 Answer

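SELECT
  *,
  CASE WHEN EXISTS (SELECT *
                      FROM orders AS lookup
                     WHERE lookup.customer   =  orders.customer
                       AND lookup.order_date <  orders.order_date
                       -- compare against the outer row's date, not lookup's own
                       AND lookup.order_date >= DATEADD(hour, -1, orders.order_date)
                   )
       THEN 'Duplicate Order'  -- an earlier order by this customer exists in the prior hour
       ELSE 'Principal Order'
  END AS Order_Status
FROM
  orders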

Using EXISTS and a correlated sub-query, you can check whether the same customer placed any preceding orders within the hour before each order.
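
The question also asked for a count of duplicates per customer. Here is a minimal sketch of how the same EXISTS test could be moved into a WHERE clause and grouped. It assumes SQL Server 2008 or later (for the multi-row VALUES syntax). The table variable `@orders`, the alias `prior` and the column name `duplicate_count` are illustrative names of mine, not part of the original schema.

```sql
-- Sample data from the question, loaded into a table variable (hypothetical name).
DECLARE @orders TABLE (
    customer     INT      NOT NULL,
    order_number INT      NOT NULL PRIMARY KEY,
    order_date   DATETIME NOT NULL
);

INSERT INTO @orders (customer, order_number, order_date) VALUES
    (1,  1, '2012-03-01T01:58:00'),
    (1,  2, '2012-03-01T02:01:00'),
    (1,  3, '2012-03-01T02:03:00'),
    (2,  4, '2012-03-01T02:15:00'),
    (3,  5, '2012-03-01T02:18:00'),
    (3,  6, '2012-03-01T04:30:00'),
    (4,  7, '2012-03-01T04:35:00'),
    (5,  8, '2012-03-01T04:38:00'),
    (6,  9, '2012-03-01T04:58:00'),
    (6, 10, '2012-03-01T04:59:00');

-- Keep only orders that have an earlier order by the same customer
-- within the preceding hour, then count them per customer.
SELECT
    o.customer,
    COUNT(*) AS duplicate_count
FROM @orders AS o
WHERE EXISTS (SELECT *
                FROM @orders AS prior
               WHERE prior.customer   =  o.customer
                 AND prior.order_date <  o.order_date
                 AND prior.order_date >= DATEADD(hour, -1, o.order_date))
GROUP BY o.customer
ORDER BY o.customer;
```

With the sample data from the question this should return customer 1 with 2 duplicates (orders 2 and 3) and customer 6 with 1 duplicate (order 10). Because the comparison is a strict `<`, two orders with exactly the same timestamp would not flag each other; if that can happen in your data, you would need a tie-breaker such as `order_number`.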

answered Sep 28 '22 17:09

MatBailie
