Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Am I crazy: PostgreSQL IN operator with nested query returning unexpected results

The following query returns 2036 rows:

SELECT "FooUID" from "Foo" f
LEFT JOIN "Bar" b ON f."BarUID" = b."BarUID"
WHERE f."BarUID" IS NOT NULL AND b."BarUID" IS NULL

But the following statement only updated 1870 rows:

UPDATE "Foo" f1 set "BarUID" = 'aNewUID'
WHERE f1."FooUID" IN (
   SELECT f2."FooUID" from "Foo" f2
   LEFT JOIN "Bar" b ON f2."BarUID" = b."BarUID"
   WHERE f2."BarUID" IS NOT NULL AND b."BarUID" IS NULL
)

How is this possible?

EDIT 1: The first query continues to return 166 rows, and the second continues to update 0 rows.

EDIT 2:

In the following, the nested query returns a row containing a UID, but the outer query returns 0 rows.

SELECT * from "Foo" f1
WHERE f1."FooUID" = (
   SELECT f2."FooUID" FROM "Foo" f2
   LEFT JOIN "Bar" b ON f2."BarUID" = b."BarUID"
   WHERE f2."BarUID" IS NOT NULL AND b."BarUID" IS NULL
   LIMIT 1
)

Am I crazy?

EDIT 3:

The following statement, provided by @wildplasser succeeded in updating the remaining 166 rows:

UPDATE "Foo" ff
SET "BarUID" = 'aNewUID'
WHERE ff."BarUID" IS NOT NULL
AND NOT EXISTS (
   SELECT * FROM "Bar" bb
   WHERE bb."BarUID"= ff."BarUID"
)

However, I still don't understand why the original didn't pick them up. If the nested query selected 166 "FooUID"s, why would they not be matched to rows in the "Foo" table using IN?

EDIT 4: The more I think about it, this background might be important:

This all took place on a database server that was recently cloned from another one. I spoke to the IT guy who did the cloning, and it turns out he didn't shut down an application running on top of the original DB before bringing it down to clone it. This means the DB was mostly likely brought down mid-transaction (I don't know how ungracefully). Is it possible something in the database was left in a corrupted state, leading me to see these phantom rows?

Unfortunately I can no longer repro it, since running wildplasser's fix. The original DB (up and serving the application again) has none of the invalid data I was trying to fix on the copy, much less any trace of the shenanigans I witnessed.

I should mention that before running the fix, I reduced the issue to the most basic absurdity: I first selected the FooUID from the nested query in Edit 2, copied it to the clipboard, then ran a query selecting from Foo where FooUID equaled the pasted value - this still returned 0 rows.

like image 835
Paul Bellora Avatar asked Sep 11 '11 20:09

Paul Bellora


1 Answers

What happens if you rewrite this with NOT EXIST, like

UPDATE Foo ff
SET baruid = 'aNewUID'
WHERE ff.baruid IS NOT NULL
AND NOT EXISTS (SELECT * FROM bar bb
    WHERE bb.baruid = ff.baruid
    );

Looks much cleaner to me than selecting the limb leg of an outer join.

like image 160
wildplasser Avatar answered Nov 11 '22 11:11

wildplasser