Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL query multiple tables, with multiple joins and column field with comma separated list

I have a query where I join three separate tables (node, control, service).

Below is their column headings and sample data.

NODE TABLE  (contains over 7000 rows)
nodeID | host    | serviceID        | controlID
     1 | server1 | 1,2,3,4,9,50,200 |         1
     2 | server2 | 2,3,4,9,200      |         2
     3 | server3 | 1,2,3,4,9,50,200 |         2
     4 | server4 | 1,2,50,200       |         3
     5 | server5 | 1,4              |         3

CONTROL TABLE  (contains roughly 50 rows)
controlID | name
        1 | Control Name One
        2 | Control Name Two
        3 | Control Name Three
        4 | Control Name Four
        5 | Control Name Five

SERVICE TABLE (contains roughly 3000 rows)
serviceID | name
        1 | Service Name One
        2 | Service Name Two
        3 | Service Name Three
        4 | Service Name Four
        5 | Service Name Five
        6 | Service Name Six
       50 | Service Name 50
      200 | Service Name 200

As you can see, the database tables have a bit of normalization with the exception of the node.serviceID column. I whole heartily agree that node.serviceID should be normalized and a pivot table of one-to-many created. No argument there. However, I do not control the scripts that insert the information into the database. I can only read from the tables and format the data how I can.

So, below is the SQL query I wrote that does work but the, as expected, node.serviceID does not join well with service.serviceID. Please note that I am not using a SELECT * in my final query, I select about 20 fields from the node table and do not want to make the query more confusing. Below is just an example.

SELECT *
FROM node AS a
LEFT JOIN control AS b ON a.controlID = b.controlid
LEFT JOIN service AS c ON a.serviceID = c.serviceId
ORDER BY a.host

The query above spits out something similar:

Host      Control              Services
server1   Control Name One     1,2,3,4,9,50
server2   Control Name Three   1,2,9,50
server3   Control Name Two     4
server4   Control Name Four    1,2,3,4,9
server5   Control Name Two     1,2,3,50
server6   Control Name Five    1,3,4,9,50

What I am looking for is this:

Host      Control              Services
server1   Control Name One     Service Name One,
                               Service Name Two,
                               Service Name Three,
                               Service Name Four,
                               Service Name Nine,
                               Service Name Fifty
server2   Control Name Three   Service Name One,
                               Service Name Two,
                               Service Name Nine,
                               Service Name Fifty
server3   Control Name Two     Service Name Four
server4   Control Name Four    Service Name One,
                               Service Name Two,
                               Service Name Three,
                               Service Name Four,
                               Service Name Nine

I have scoured stackoverflow.com for someone with an issue like this but I can only find either joining multiple tables on ID and name OR someone expanding a list of IDs but not both together.

This one came close: Using id that are comma separated sql but not quite.

I have tried various methods of CFML with ListToArray() and tried looping over them with an index but nothing would work for me.

The server I snag the data from is MySQL 5.1 and I am using a combination of jQuery and ColdFusion (Railo 4.2) to format the data.

This is my first time posting on stackoverflow, so my apologies if there really is an answer to this, I did not search long enough, and would make this question a duplicate.

----------------- UPDATE --------------------

I tried the query and CFML suggested by Leigh.

So, I get the following:

server1 Service Name One , Service Name One , Service Name One , Service Name One , Service Name One , Service Name One , Service Name One , Service Name Two , Service Name Two , Service Name Two , Service Name Two , Service Name Two , Service Name Two , Service Name Three , Service Name Four , Service Name Four , Service Name Four , Service Name Four , Service Name Four , Service Name Four , Service Name Four

I am not sure, at this point, if that is just a little bit of change with the CFML or something in the SQL query. But, it does look promising.

like image 648
Grimdari Avatar asked Aug 21 '14 23:08

Grimdari


1 Answers

If you really cannot modify the table structure, probably the best you can do is one of the old list hacks:

  • Use a JOIN with FIND_IN_SET(value, commaSeparatedString)

    SELECT n.Host, c.Name AS ControlName, s.Name AS ServiceName FROM node n LEFT JOIN control c ON c.controlID = n.controlID LEFT JOIN service s ON FIND_IN_SET(s.serviceID, n.serviceId) ORDER BY n.host, s.Name ;

  • Use LIKE to detect the presence of a specific serviceID value within the node list

    SELECT n.Host, c.Name AS ControlName, s.Name AS ServiceName FROM node n LEFT JOIN control c ON c.controlID = n.controlID LEFT JOIN service s ON CONCAT(',', n.serviceID,',') LIKE CONCAT('%,', s.serviceID,',%') ORDER BY n.host, s.Name ;

SQLFiddle

However, as you already noted that column really should be normalized. While the methods above should work for small data sets, they suffer from the usual problems of working with "lists". Neither method is very index friendly, and as a result, will not scale well. Also, both perform string comparisons. So the slightest difference may cause the matching to fail. For example, 1,4 would match two serviceID's, whereas 1,(space)4 or 1,4.0 would match only one.

Update based on comments:

On second read, I am not sure the above answers the precise question you are asking, but it should provide a good basis to work with ...

If you no longer want a CSV list, just use one of the queries above and output the individual query columns as usual. The result will be one service name per row, ie:

   server1 | Control Name One | Service Name 200
   server1 | Control Name One | Service Name 50
   ..

Otherwise, if you need to preserve the comma separated values, one possibility is to use a <cfoutput group=".."> on the query results. Since the results are ordered by "Host" first, something like the code below. NB: For "group" to work properly, the results must be ordered by Host and you must use multiple cfoutput tags as shown below.

 <cfoutput query="..." group="Host"> 
    #Host# |
    #ControlName# |
    <cfoutput>
      #ServiceName#,
    </cfoutput>
    <br>
 </cfoutput>

The result should look like this:

server1 | Control Name One | Service Name 200, Service Name 50, Service Name Four, Service Name One, Service Name Three, Service Name Two, 
server2 | Control Name Two | Service Name 200, Service Name Four, Service Name Three, Service Name Two, 
server3 | Control Name Two | Service Name 200, Service Name 50, Service Name Four, Service Name One, Service Name Three, Service Name Two, 
server4 | Control Name Three | Service Name 200, Service Name 50, Service Name One, Service Name Two, 
server5 | Control Name Three | Service Name Four, Service Name One, 


Update 2:

I forgot there is a simpler alternative to cfoutput group in MySQL: GROUP_CONCAT

<cfquery name="qry" datasource="MySQL5">
   SELECT n.Host, c.Name AS ControlName, GROUP_CONCAT(s.Name) AS ServiceNameList 
   FROM node n 
        LEFT JOIN control c ON c.controlID = n.controlID 
        LEFT JOIN service s ON FIND_IN_SET(s.serviceID, n.serviceId) 
   GROUP BY n.Host, c.Name
   ORDER BY n.host
</cfquery>
like image 122
Leigh Avatar answered Oct 21 '22 13:10

Leigh