Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PL/SQL SELECT with multiple COUNT(DISTINCT xxx) - unexpected results

I'm trying to put together a query for an Oracle 11g application and I've run into a problem.

I'll simplify the real scenario to make it easier to understand (and also to protect the client's data):

  • Table A is the base table. It has a known identifier in it that I pass in to the query.
  • For each entry in Table A there may be multiple entries in Table B. Table B contains a value that I am interested in.
  • For each entry in Table B there may also be multiple entries in Table C. Table C contains another value I'm interested in.
  • I also have an XML snippet containing a list of values that may or may not match up to the values of interest in table C.
  • The query does an outer join to the XML so that if there is a matching value it will return the value again, otherwise it is null.

What I want to do is get back the identifier I passed in, a count of the unique values in B and C, as well as a count of the unique (and non-null) values from the XML part of the join.

My current query is:

SELECT
    a.ID
  , COUNT(DISTINCT b.VAL) AS B_VAL
  , COUNT(DISTINCT c.VAL) AS C_VAL
  , COUNT(DISTINCT xml.VAL) AS XML_VAL
FROM a, b, c,
  XMLTABLE('/field1/collection/value' passing my_xml_type
    COLUMNS VAL VARCHAR2(50) PATH '.') xml
WHERE
      a.ID = b.SOME_ID
  AND b.OTHER_ID = c.OTHER_ID
  AND c.VAL = xml.VAL (+)

Now if you forget about the counting and just return rows, an example result set might look something like this:

ID     B_VAL     C_VAL     XML_VAL
---------------------------------------
X      abc       123       123
X      abc       456       null
X      abc       789       789
X      abc       789       789

DESIRED: Now when I want to do the distinct counts, I'd like it to return:

ID     B_VAL     C_VAL     XML_VAL
---------------------------------------
X      1         3         2

ACTUAL: However, this is what I'm getting when I have them all as COUNT(DISTINCT ...):

ID     B_VAL     C_VAL     XML_VAL
---------------------------------------
X      1         1         1

ALTERNATIVE: ...and if I take the DISTINCT out of the counts then I get:

ID     B_VAL     C_VAL     XML_VAL
---------------------------------------
X      1         4         3

How come the DISTINCT seems to be operating only within a particular B_VAL, but taking it out causes it to operate across all the rows but not taking uniqueness into account?

Is there another way of doing this that doesn't involve having to replicate all the joins as a sub-query? Have I missed the point entirely?

(Please note, I'm not a DB developer at all, I've just been pulled in to help out, so sorry if this is an easy problem... I HAVE searched Google and browsed this site for answers before posting, though!)

Thanks.


I've found that if I take the XML table join out then the count distinct works OK across the B_VAL and C_VAL... So perhaps it's something weird with how Oracle handles XML table joins?

like image 459
ChrisC Avatar asked Feb 23 '23 23:02

ChrisC


2 Answers

As Vincent's test case works in 10.2.0.3 and 11.2.0.2, and if you're at an earlier version of 11g, this could be bug 8816675: XMLexists query returns wrong results with a select DISTINCT. The example in the bug is referring to a problem with count(distinct). You aren't explicitly using XMLexists, but the bug may have a wider impact then the title suggests, or it may be used under the hood.

If this is the problem, and you can't patch up, you might be able to work around it by wrapping the non-count version, which still isn't pretty:

SELECT
    A_ID
    , COUNT(DISTINCT B_VAL) AS B_VAL
    , COUNT(DISTINCT C_VAL) AS C_VAL
    , COUNT(DISTINCT XML_VAL) AS XML_VAL
FROM (
SELECT a.ID as A_ID, b.VAL as B_VAL, c.VAL as C_VAL, xml.VAL as XML_VAL
FROM a, b, c
    , XMLTABLE('/field1/collection/value' passing my_xml_type
        COLUMNS VAL VARCHAR2(50) PATH '.') xml
WHERE a.ID = b.SOME_ID
AND b.OTHER_ID = c.OTHER_ID
AND c.VAL = xml.VAL (+)
)
GROUP BY A_ID;
like image 142
Alex Poole Avatar answered Apr 06 '23 16:04

Alex Poole


I can't reproduce your finding with Oracle 10.2.0.3.

Here's my setup:

SQL> CREATE TABLE a AS SELECT 'X' ID FROM dual;

Table created

SQL> CREATE TABLE b AS SELECT 'abc' val, 'X' some_id, 1 other_id FROM dual;

Table created

SQL> CREATE TABLE c AS
  2     SELECT 1 other_id, '123' val,
  3            XMLTYPE('<field1>
  4                        <collection><value>123</value></collection>
  5                     </field1>') my_xml_type
  6       FROM dual UNION ALL
  7     SELECT 1 other_id, '456' val, NULL FROM dual UNION ALL
  8     SELECT 1 other_id, '789' val,
  9            XMLTYPE('<field1>
 10                        <collection><value>789</value></collection>
 11                        <collection><value>789</value></collection>
 12                     </field1>') my_xml_type
 13       FROM dual;

Table created

the query returns the right result:

SQL> SELECT
  2      a.ID
  3    , COUNT(DISTINCT b.VAL) AS B_VAL
  4    , COUNT(DISTINCT c.VAL) AS C_VAL
  5    , COUNT(DISTINCT xml.VAL) AS XML_VAL
  6  FROM a, b, c
  7     , XMLTABLE('/field1/collection/value' passing my_xml_type
  8                 COLUMNS VAL VARCHAR2(50) PATH '.') xml
  9  WHERE a.ID = b.SOME_ID
 10    AND b.OTHER_ID = c.OTHER_ID
 11    AND c.VAL = xml.VAL (+)
 12  GROUP BY a.id;

ID      B_VAL      C_VAL    XML_VAL
-- ---------- ---------- ----------
X           1          3          2

Can you run this test case?

like image 28
Vincent Malgrat Avatar answered Apr 06 '23 16:04

Vincent Malgrat