Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL sub-query WHERE filtering out too much

Shorter version:

SQLfiddle

Check the output-column lives_together. I want it to contain concatinated (comma-separated) IDs of the members who live at the same address as the current member, but isn't a parent or child to him/her.

So in the first row, John, I want the ID of Mary (only).
Not Josef since he is John's child, and not Victoria since she lives at another address (and she, too, is his child).

Right now I get nothing, somehow this is because of these rows in the query (in the third sub-query):

mem.id <> m3.parent1 AND
mem.id <> m3.parent2 AND
mem.parent1 <> m3.id AND
mem.parent2 <> m3.id AND

They are supposed to filter out all parents and children (which they do), but for some reason they also filter out other members as well (Mary in the example above).
Just one of these rows is enough to filter out Mary.
Another example is Mats and Gabriella, they should get eachother's IDs, but they don't.

(One column with all people living_together (including children) isn't enough, since I want to print the living_together persons before the children. This question is a development of this other question I asked a while ago. One big difference in the result I want is that it should never group people with different last names.)

Why does this happen?

Longer version

I have a table, members. A bit simplified it looks like this (you can see it in it's entirety in the SQLfiddle below):

+-------+------------+------------+------------+---------+----------+----------+-----------+-----------+---------------+
| id    | first_name | last_name  | birthdate  | parent1 | parent2  | address  | phones    | mobiles   | emails        |
+-------+------------+------------+------------+---------+----------+----------+-----------+-----------+---------------+
| 2490  | Mary       | Johansen   | 1964-01-24 | NULL    | NULL     | Street 1 | 1111      | 9999,8888 | [email protected] |
| 2491  | John       | Johansen   | 1968-01-21 | NULL    | NULL     | Street 1 | 1111,3333 | 7777      | [email protected] |
| 2422  | Brad       | Johansen   | 1983-01-07 | 2491    | 2490     | Street 1 | 2222,3333 | 6666      | [email protected] |
| 2493  | Victoria   | Andersen   | 1982-01-14 | 2490    | 2491     | Av. 2    | 4444      | 5555      | [email protected]  |
+-------+------------+------------+------------+---------+----------+----------+-----------+-----------+---------------+

Mary and John are married, with two kids; Josef and Victoria. Victoria has moved out.

I want to print an address list, which groups those who live at the same address. So Mary, John and Josef should get grouped, but Victoria should get retrieved separately. Children should be printed after spouses (and any other members living at that address), this is the reason for retrieving them separately.

Here's how my database really looks, and my entire query:

http://sqlfiddle.com/#!2/1f87c4/2 this is similar to actual data, the table contains about 400 rows.

So what I want is the lives_together output column to contain the group_concatinated IDs of the matched members. So I should get John's id in Mary's lives_together column, and the other way around. And not the children's ids.

I expect something similar to the following.
Note that Brad is shown after John despite having both a lower id and a name starting on a letter earlier in the alphabet, this is because he is a child in the family. The children should be sorted by birthdate (you can see that column in the
Also note that their phone numbers are added together with their last name, but their mobile numbers (and emails) are put with their names

+----------------------------------------------------
| Andersen         Av 2       4444
|     Victoria                5555              [email protected]
+----------------------------------------------------
| Johansen         Street 1   1111,2222,3333
|     John                    7777              [email protected]
|     Mary                    9999,8888         [email protected]
|     Brad                    6666              [email protected]
+----------------------------------------------------

This is, though, not what I expect from the query below, but my goal after some processing by PHP.

This is what I want from the query:

 +------+----------+-----------+------------+----------+-----------+-----------+---------------+-----------+-----------+----------------+
 | id   | lname    | fname     | birthdate  | address  | phones    | mobiles   | emails        | parents   | children  | lives_together |
 +------+----------+-----------+------------+----------+-----------+-----------+---------------+-----------+-----------+----------------+
 | 2490 | Johansen | Mary      | 1964-01-24 | Street 1 | 1111      | 9999,8888 | [email protected] | NULL      | 2424      | 2491           |
 +------+----------+-----------+------------+----------+-----------+-----------+---------------+-----------+-----------+----------------+
 | 2491 | Johansen | John      | 1968-01-21 | Street 1 | 1111,3333 | 7777      | [email protected] | NULL      | 2424      | 2490           |
 +------+----------+-----------+------------+----------+-----------+-----------+---------------+-----------+-----------+----------------+
 | 2422 | Johansen | Brad      | 1983-01-07 | Street 1 | 2222,3333 | 6666      | [email protected] | 2490,2491 | NULL      | NULL           |
 +------+----------+-----------+------------+----------+-----------+-----------+---------------+-----------+-----------+----------------+
 | 2493 | Andersen | Victoria  | 1982-01-14 | Av. 2    | 4444      | 5555      | [email protected]  | NULL      | NULL      | NULL           |
 +------+----------+-----------+------------+----------+-----------+-----------+---------------+-----------+-----------+----------------+

Here's my query (also somewhat simplified):

SELECT 
    mem.id as id,
    mem.last_name as lname,
    mem.first_name as fname,
    mem.birthdate as birthdate,
    mem.address as address,
    mem.phones as phones,
    mem.mobiles as mobiles

    (SELECT
        GROUP_CONCAT(m1.id)
        FROM 
            members m1
        WHERE 
            (mem.parent1 = m1.id OR mem.parent2 = m1.id) AND
            LOWER(REPLACE(mem.last_name, ' ', '')) = LOWER(REPLACE(m1.last_name, ' ', '')) AND
            LOWER(REPLACE(mem.address, ' ', '')) = LOWER(REPLACE(m1.address, ' ', '')) AND
            mem.id <> m1.id
    ) as parents,

    (SELECT 
        GROUP_CONCAT(m2.id)
        FROM 
            members m2
        WHERE 
            (mem.id = m2.parent1 OR mem.id = m2.parent2) AND
            LOWER(REPLACE(mem.last_name, ' ', '')) = LOWER(REPLACE(m2.last_name, ' ', '')) AND
            LOWER(REPLACE(mem.address, ' ', '')) = LOWER(REPLACE(m2.address, ' ', '')) AND
            mem.id <> m2.id 
    ) as children,

    (SELECT 
        GROUP_CONCAT(m3.id)
        FROM 
            members m3
        WHERE 
            mem.id <> m3.parent1 AND
            mem.id <> m3.parent2 AND
            mem.parent1 <> m3.id AND
            mem.parent2 <> m3.id AND
            LOWER(REPLACE(mem.last_name, ' ', '')) = LOWER(REPLACE(m3.last_name, ' ', '')) AND
            LOWER(REPLACE(mem.address, ' ', '')) = LOWER(REPLACE(m3.address, ' ', '')) AND
            mem.id <> m3.id 
    ) as lives_together 

FROM 
    members mem 

ORDER BY
    mem.last_name ASC,
    mem.first_name ASC 

When the query is at Mary, and goes through the third sub-query (lives_together) it should get John, but not Josef or Victoria. Not Josef because of mem.parent2 <> m3.id and not Victoria because of LOWER(REPLACE(mem.address, ' ', '')) = LOWER(REPLACE(m3.address, ' ', '')). That works, but for some reason John isn't got either, I just get NULL.

I get both John, Josef and Victoria if I remove this part:

mem.id <> m3.parent1 AND
mem.id <> m3.parent2 AND
mem.parent1 <> m3.id AND
mem.parent2 <> m3.id AND

But when it's there I get none of them. I don't understand why that part filters out John as well. (Note that these lines aren't simplified at all.) Just one of these for rows filters out John, but work as expected for Josef.

Is there something wrong with my code?

like image 510
Punchlinern Avatar asked Oct 17 '15 20:10

Punchlinern


People also ask

Can you use limit in subquery?

In Denodo,at this moment Limit is only supported at the end of a SELECT statement to reduce the size of the result set but it cannot be used in subqueries.

Why subquery is slower than join?

The advantage of a join includes that it executes faster. The retrieval time of the query using joins almost always will be faster than that of a subquery. By using joins, you can maximize the calculation burden on the database i.e., instead of multiple queries using one join query.

Can a subquery return multiple columns in MySQL?

If a subquery can return multiple columns and exactly one row, it is known as a row subquery. A row subquery is a derivation of a scalar subquery and can thus be used anywhere that a scalar subquery can be used. Finally, if a subquery can return multiple columns and multiple rows, it is known as a table subquery.


1 Answers

I want to print an address list, which groups those who live at the same address. So Mary, John and Josef should get grouped, but Victoria should get retrieved separately.

Perhaps you could look at this differently, by concentrating on the address? For example, this result:

|                                          Members | street_address |    postal_address | country |
|--------------------------------------------------|----------------|-------------------|---------|
|  John Andersson, Mary Andersson, Josef Andersson |       Street 1 |   61523 Stockholm |         |
|          Mats Anothername, Gabriella AnotherName |       Betmsv.4 | 641 93  Stockholm |         |
|                                   Marie Coolname |     Idrgatan 3 |  641 33 Stockholm |         |
|                                   Sofie Coolname |     Torvgen 12 | 641 53  Stockholm |         |
|                                 Victoria Johnson |       Avenue 3 |   61222 Stockholm |         |

Is produced by this query:

SELECT
      group_concat(DISTINCT concat( ' ',`first_name`, ' ',`last_name`)) as `Members`
    , mem.`street_address`
    , mem.`postal_address`
    , mem.`country`
FROM `members` as mem
WHERE `member_type` = 1
GROUP BY
      mem.`street_address`
    , mem.`postal_address`
    , mem.`country`
ORDER BY
      max(`last_name`)
    , `Members`
;

& with an example of using ORDER BY in the GROUP_CONCAT

SELECT
      group_concat(DISTINCT concat( ' ',`first_name`, ' ',`last_name`)
                   ORDER BY
                          case when `parent1` IS NULL and `parent2` IS NULL then 1 else 2 end
                        , `birth_date` ASC
                  ) as `Members`
    , mem.`street_address`
    , mem.`postal_address`
    , mem.`country`
FROM `members` as mem
WHERE `member_type` = 1
GROUP BY
      mem.`street_address`
    , mem.`postal_address`
    , mem.`country`
ORDER BY
      max(`last_name`)
    , `Members`
;
like image 95
Paul Maxwell Avatar answered Nov 08 '22 10:11

Paul Maxwell