Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Update table in rows which are partial duplicates

Tags:

sql

postgresql

My dataset consists of daily (actually business days) timeseries for different companies from different industries and I work with PostgreSQL. I have an indicator variable in my dataset taking values 1, -1 and most of the times 0. For better readability of the question I refer to the case where the indicator variable is unequal to zero for a given company as indicator event.

If there is more than one indicator event for a given industry on a given day, the indicator variables of the respecting companies shall be updated to 0.

We can think of the following example dataset:

day              company     indicator     industry
2012-01-12       A           1             financial
2012-01-12       B           1             consumer
2012-01-12       C           0             consumer
2012-01-13       A           0             financial
2012-01-13       B           1             consumer
2012-01-13       C           0             consumer
2012-01-16       A           1             financial
2012-01-16       B           -1            consumer
2012-01-16       C           1             consumer

So the indicator values that shall be updated to zero are on 2012-01-16 the entries for companies B and C because they both come from the same industry and experienced the indicator event on the same day.

My idea was to work with the exists operator:

    update mytable t1 set indicator = 0
    where exists (
              select 1
              from mytable t2
              where t2.day = t1.day
              and t2.industry = t1.industry
              and t2.indicator <> 0
              and t1.indicator <> 0)

But somehow that updated all indicator values to 0, and I cannot figure out why.

Do you have any ideas how to fix this, or how to solve my problem with another approach?

like image 510
user3319629 Avatar asked Mar 01 '26 19:03

user3319629


1 Answers

You might want to add a condition to not join a row to itself (which will always be true), e.g.

update mytable t1 set indicator = 0
where exists (
          select 1
          from mytable t2
          where t2.day = t1.day
          and t1.company <> t2.company
          and t2.industry = t1.industry
          and t2.indicator <> 0
          and t1.indicator <> 0)
like image 171
erikxiv Avatar answered Mar 04 '26 10:03

erikxiv



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!