Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Deleting repeating values in database

I have a MySql table which is populating daily with price values. Each day it records an entry even if the price hasn't changed. I want to delete some rows that repeat too much. I want to keep the first price and the last price before there is a price change.

Example 1)

   id name     price date
    1 Product1 $6 13/07/2017
    2 Product1 $6 14/07/2017
    3 Product1 $6 15/07/2017
    4 Product1 $7 16/07/2017
    5 Product1 $6 17/07/2017
    6 Product1 $6 18/07/2017
    7 Product1 $6 19/07/2017

From that list records with id 2 and 6 should be deleted with the following result:

   id name     price date
    1 Product1 $6 13/07/2017
    3 Product1 $6 15/07/2017
    4 Product1 $7 16/07/2017
    5 Product1 $6 17/07/2017
    7 Product1 $6 19/07/2017

Example 2)

   id name     price date
    1 Product1 $6 13/07/2017
    2 Product1 $6 14/07/2017
    3 Product1 $6 15/07/2017
    4 Product1 $6 16/07/2017
    5 Product1 $6 17/07/2017
    6 Product1 $6 18/07/2017
    7 Product1 $6 19/07/2017

Here there is no price change so I can delete all records from 2 to 6:

   id name     price date
    1 Product1 $6 13/07/2017
    7 Product1 $6 19/07/2017

Id is not supposed to be one incremental and also date is not updated on daily basis.

like image 945
Ahsan Mukhtar Avatar asked Aug 16 '17 02:08

Ahsan Mukhtar


People also ask

How do you delete repeated values?

In Excel, there are several ways to filter for unique values—or remove duplicate values: To filter for unique values, click Data > Sort & Filter > Advanced. To remove duplicate values, click Data > Data Tools > Remove Duplicates.

How do you delete duplicate records by group in SQL?

SQL Delete Duplicate Rows using Group By and Having Clause According to Delete Duplicate Rows in SQL, for finding duplicate rows, you need to use the SQL GROUP BY clause. The COUNT function can be used to verify the occurrence of a row using the Group by clause, which groups data according to the given columns.

How can I delete duplicate records in SQL using distinct?

Introduction to SQL DISTINCT operator Note that the DISTINCT only removes the duplicate rows from the result set. It doesn't delete duplicate rows in the table. If you want to select two columns and remove duplicates in one column, you should use the GROUP BY clause instead.


Video Answer


2 Answers

You can do this with some creative self-join logic.

Think of three hypothetical rows in the table.

  • Row a you want to keep.
  • Row b has the same product name and price, and a date 1 day after a. You want to delete this.
  • Row c has the same product name and price, and a date 1 day after b. You want to keep this.

So if you can do a self-join to match these three rows, then delete row b.

DELETE b FROM MyTable AS a 
JOIN MyTable AS b ON a.name=b.name AND a.price=b.price AND a.date=b.date + INTERVAL 1 DAY 
JOIN MyTable AS c ON b.name=c.name AND b.price=c.price AND b.date=c.date + INTERVAL 1 DAY;

This works even if there are multiple rows that fit the conditions for row b. It'll delete the first one, and then continue to delete subsequent rows that also fit the conditions.

This works if you use the DATE data type and store your dates as 'YYYY-MM-DD', not 'DD-MM-YYYY'. You should do this anyway.

like image 149
Bill Karwin Avatar answered Oct 17 '22 20:10

Bill Karwin


You want to delete the rows where the product name and price are the same as the rows with the date plus/minus one day.

DELETE row_mid
FROM 
  record_table AS row_mid
  JOIN record_table AS row_prev
  JOIN record_table AS row_next
WHERE
  row_mid.name = row_prev.name 
  AND row_mid.price = row_prev.price
  AND row_mid.date = DATE_SUB(row_prev.date, INTERVAL 1 DAY)
  AND row_mid.name = row_next.name
  AND row_mid.price = row_next.price
  AND row_mid.date = DATE_ADD(row_next.date, INTERVAL 1 DAY);
like image 39
Scott Newson Avatar answered Oct 17 '22 20:10

Scott Newson