MySQL get all data from specific related record

Tags: join, mysql
I'm looking for a way to output a selected related record for each record in a table in MySQL. I'll explain further...

I have 2 tables currencies and exchange_rates. The tables are joined by a currency_code field and each currency record has multiple related exchange rate records, each exchange rate record represents a different day. So there is a 1:many relationship between currencies and exchange_rates.

I want to retrieve a full record from the exchange_rates table for each currency but with the ability to define specific criteria as to which related record to select. Not just the most recent exchange_rate for each currency but maybe the most recent exchange_rates record for each currency that has the field criteria_x=NULL.

It's a shame that a LIMIT inside a derived table isn't applied per currency, otherwise something like this would be a neat and readable solution...

SELECT `currencies`.`currency_code`, `currencies`.`country`, `exchange_rates`.`id`,
       FROM_UNIXTIME(`exchange_rates`.`datestamp`), `rate` 
FROM `currencies` 
INNER JOIN (
SELECT `id`, `currency_code`, `invoice_id`, `datestamp`, `rate` 
FROM `exchange_rates` 
WHERE `criteria_x`=NULL AND `criteria_y` LIKE 'A' 
ORDER BY `datestamp` DESC
LIMIT 0, 1
) AS `exchange_rates` ON `currencies`.`currency_code`=`exchange_rates`.`currency_code`
ORDER BY `currencies`.`country`

The LIMIT clause cuts the derived table down to a single row overall, not one row per currency.

This is the only way I've found to do this...

SELECT `currencies`.`currency_code`, `currencies`.`country`, 
FROM_UNIXTIME( SUBSTRING_INDEX( SUBSTRING_INDEX(`exchange_rates`.`concat`, '-', 1), '-', -1)) AS `datestamp`,
SUBSTRING_INDEX( SUBSTRING_INDEX(`exchange_rates`.`concat`, '-', 2), '-', -1) AS `id`, 
SUBSTRING_INDEX( SUBSTRING_INDEX(`exchange_rates`.`concat`, '-', 3), '-', -1) AS `rate` 
FROM `currencies`
INNER JOIN (
SELECT `currency_code`, MAX(CONCAT_WS('-', `datestamp`, `id`, `rate`)) AS `concat`
FROM `exchange_rates` 
WHERE `criteria_x`=NULL AND `criteria_y` LIKE 'A' 
GROUP BY `exchange_rates`.`currency_code`
) AS `exchange_rates` ON `currencies`.`currency_code`=`exchange_rates`.`currency_code`
ORDER BY `currencies`.`country`

So I'm concatenating a bunch of fields together and running MAX() on the result to get my sort order within the group, then parsing those fields back out in the parent query with SUBSTRING_INDEX(). The problem is that this method only works when I can use MIN() or MAX() on the concatenated field. It wouldn't be ideal if I wanted to sort by a string, or sort by multiple criteria but limit to a single record.

Also it causes me physical pain to have to resort to horrible string manipulation to get the data I want from a relational database — there has to be a better way!

Anyone got any suggestions of a better method?

batfastad asked May 03 '12 15:05


3 Answers

There are a few general issues to discuss (briefly) before trying to provide an answer.

Your first query is:

SELECT `currencies`.`currency_code`, `currencies`.`country`, `exchange_rates`.`id`,
       FROM_UNIXTIME(`exchange_rates`.`datestamp`), `rate` 
FROM `currencies` 
INNER JOIN (
SELECT `id`, `currency_code`, `invoice_id`, `datestamp`, `rate` 
FROM `exchange_rates` 
WHERE `criteria_x`=NULL AND `criteria_y` LIKE 'A' 
ORDER BY `datestamp` DESC
LIMIT 0, 1
) AS `exchange_rates` ON `currencies`.`currency_code`=`exchange_rates`.`currency_code`
ORDER BY `currencies`.`country`

  1. I don't think you need to use as many back-quotes as you are using. They aren't exactly wrong, but I'm not going to type them in my answer.
  2. The comparison criteria_x = NULL does not do what you want; write criteria_x IS NULL instead. MySQL accepts the = NULL form syntactically, but any ordinary comparison with NULL evaluates to NULL (unknown) rather than true, so that condition filters out every row.
  3. The criterion LIKE 'A' is not sensible if it contains no metacharacters (% or _ in standard SQL). You'd be better off with simple equality: = 'A'.
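
The behaviour of the two notations in item 2 can be checked directly. This is a minimal sketch that needs no tables; the column aliases are made up for illustration:

```sql
-- Any ordinary comparison with NULL yields NULL (unknown), never true.
SELECT NULL = NULL   AS equals_null,      -- NULL
       NULL IS NULL  AS is_null,          -- 1
       NULL <=> NULL AS null_safe_equal;  -- 1 (MySQL-specific NULL-safe operator)
```

A WHERE clause keeps only the rows for which its condition is true, so a filter written with = NULL matches no rows at all.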

Your question says:

I want to retrieve a full record from the exchange_rates table for each currency but with the ability to define specific criteria as to which related record to select. Not just the most recent exchange rate for each currency, but maybe the most recent exchange rate for each currency that has the field criteria_x IS NULL.

So, you want to select the most recent exchange rate record for each currency that meets the required other criteria. We can assume that there is a unique constraint on the combination of currency_code and datestamp in the exchange rate table; this means that there will always be at most one matching row. You've not specified what should be shown if there is no matching row; an inner join will simply not list that currency, of course.
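
The uniqueness assumption can be stated in the schema itself. The following is only a sketch: the column names come from the question, but the column types, the nullability and the constraint name uq_currency_datestamp are guesses.

```sql
-- Hypothetical table definitions; only the column names come from the question.
CREATE TABLE currencies (
    currency_code CHAR(3)     NOT NULL PRIMARY KEY,
    country       VARCHAR(64) NOT NULL
);

CREATE TABLE exchange_rates (
    id            INT UNSIGNED   NOT NULL AUTO_INCREMENT PRIMARY KEY,
    currency_code CHAR(3)        NOT NULL,
    invoice_id    INT UNSIGNED   NULL,
    datestamp     INT UNSIGNED   NOT NULL,  -- Unix timestamp, as the use of FROM_UNIXTIME() implies
    rate          DECIMAL(18, 8) NOT NULL,
    criteria_x    VARCHAR(16)    NULL,
    criteria_y    VARCHAR(16)    NULL,
    -- At most one exchange rate row per currency per timestamp.
    CONSTRAINT uq_currency_datestamp UNIQUE (currency_code, datestamp),
    FOREIGN KEY (currency_code) REFERENCES currencies (currency_code)
);
```

With a constraint like this in place, a join on currency_code and the maximum datestamp can return at most one exchange rate row per currency.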

With SQL queries, I usually build and test the overall query in steps, adding extra material to the previously developed queries that are known to work and produce the right output. If it is simple and/or I've collected too much hubris, I'll try a complex query first, but when (nemesis) it doesn't work, then I go back to the build and test process. Think of it as Test Driven (Query) Development.

Stage 1: Exchange rate records that match specified criteria

SELECT id, currency_code, invoice_id, datestamp, rate 
  FROM exchange_rates 
 WHERE criteria_x IS NULL AND criteria_y = 'A' 
 ORDER BY currency_code, datestamp DESC

Stage 2: Most recent exchange rate time for each currency that matches specified criteria

SELECT currency_code, MAX(datestamp) 
  FROM exchange_rates 
 WHERE criteria_x IS NULL AND criteria_y = 'A' 
 GROUP BY currency_code

Stage 3: Exchange rate record for most recent exchange rate time for each currency that matches specified criteria

SELECT x.id, x.currency_code, x.invoice_id, x.datestamp, x.rate 
  FROM exchange_rates AS x
  JOIN (SELECT currency_code, MAX(datestamp) AS datestamp
          FROM exchange_rates 
         WHERE criteria_x IS NULL AND criteria_y = 'A' 
         GROUP BY currency_code
       ) AS m
    ON x.currency_code = m.currency_code AND x.datestamp = m.datestamp

Stage 4: Currency information and exchange rate record for most recent exchange rate time for each currency that matches specified criteria

This requires joining the currencies table with the output of the previous query:

SELECT c.currency_code, c.country, r.id,
       FROM_UNIXTIME(r.datestamp), r.rate
  FROM currencies AS c 
  JOIN (SELECT x.id, x.currency_code, x.invoice_id, x.datestamp, x.rate 
          FROM exchange_rates AS x
          JOIN (SELECT currency_code, MAX(datestamp) AS datestamp
                  FROM exchange_rates 
                 WHERE criteria_x IS NULL AND criteria_y = 'A' 
                 GROUP BY currency_code
               ) AS m
            ON x.currency_code = m.currency_code AND x.datestamp = m.datestamp
       ) AS r
    ON c.currency_code = r.currency_code
 ORDER BY c.country

Apart from two details, I believe that should work correctly with the current version of almost any SQL DBMS you care to mention: Oracle only allows ') r' instead of ') AS r' for table aliases, and FROM_UNIXTIME() is specific to MySQL.

Since the invoice ID is not returned in the final query, we can remove that from the select-list of the middle query. A good optimizer might do that automatically.

If you want to see the currency information even if there is no exchange rate that matches the criteria, then you need to change the JOIN in the outermost query to a LEFT JOIN (aka LEFT OUTER JOIN). If you only want to see a subset of the currencies, you can apply that filter at either the last (outermost) query stage, or (if the filter is based on information available in the exchange rate table, such as the currency code) at either the innermost sub-query (most efficient) or the middle sub-query (not so efficient unless the optimizer realizes it can push the filter down to the innermost sub-query).
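
As a sketch of the outer-join variant described above, here is the Stage 4 query with only the outermost JOIN changed to LEFT JOIN (and invoice_id dropped from the middle query, since it is not returned). Nothing else is new; the table and column names are those used in the question.

```sql
-- Currencies with no exchange rate matching the criteria are still listed,
-- with NULL in the id, datestamp and rate columns.
SELECT c.currency_code, c.country, r.id,
       FROM_UNIXTIME(r.datestamp), r.rate
  FROM currencies AS c
  LEFT JOIN (SELECT x.id, x.currency_code, x.datestamp, x.rate
               FROM exchange_rates AS x
               JOIN (SELECT currency_code, MAX(datestamp) AS datestamp
                       FROM exchange_rates
                      WHERE criteria_x IS NULL AND criteria_y = 'A'
                      GROUP BY currency_code
                    ) AS m
                 ON x.currency_code = m.currency_code AND x.datestamp = m.datestamp
            ) AS r
    ON c.currency_code = r.currency_code
 ORDER BY c.country
```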

Correctness is usually the primary criterion; performance is a secondary criterion. However, performance was mentioned in the question. The first rule is to measure the 'simple' query shown here. Only if that proves too slow do you need to worry further. When you do need to worry, you examine the query plan to see if there is, for example, a crucial index missing. Only if the query still isn't fast enough do you start trying to resort to other tricks. Those tricks tend to be very specific to a particular DBMS. For example, there might be optimizer hints that you can use to make the DBMS process the query differently.
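
In MySQL, that workflow might look roughly like the sketch below: EXPLAIN shows the plan for the innermost sub-query, and a composite index is one candidate to try if the plan reports a full table scan. The index name and its column order are assumptions, not something measured against the asker's data, so whether it helps has to be checked with EXPLAIN and timing.

```sql
-- Inspect the plan of the grouping sub-query first.
EXPLAIN
SELECT currency_code, MAX(datestamp) AS datestamp
  FROM exchange_rates
 WHERE criteria_x IS NULL AND criteria_y = 'A'
 GROUP BY currency_code;

-- Hypothetical index covering the filter columns, the grouping column and the MAX() column.
CREATE INDEX idx_rates_criteria
    ON exchange_rates (criteria_y, criteria_x, currency_code, datestamp);
```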

Jonathan Leffler answered Oct 17 '22 01:10


If I've understood your problem correctly, all you need to do is self-join exchange_rates to select the rate of interest:

SELECT   currencies.currency_code,
         currencies.country,
         exchange_rates.id,
         FROM_UNIXTIME(exchange_rates.datestamp),
         exchange_rates.rate
FROM     currencies
  JOIN   (
    SELECT   currency_code, MAX(datestamp) AS datestamp
    FROM     exchange_rates
    WHERE    criteria_x IS NULL AND criteria_y LIKE 'A'
    GROUP BY currency_code
  )   AS exchange_wantd USING (currency_code)
  JOIN   exchange_rates USING (currency_code, datestamp)
ORDER BY currencies.country

eggyal answered Oct 17 '22 00:10


Try this query. I expect it to work, but if you provide some sample data I will be able to test it properly.

SELECT  `c`.`currency_code` AS `CurrencyCode`,
    `c`.`country`,
    FROM_UNIXTIME( SUBSTRING_INDEX( SUBSTRING_INDEX(`c`.`concat`, '-', 1), '-', -1)) AS `datestamp`,
    SUBSTRING_INDEX( SUBSTRING_INDEX(`c`.`concat`, '-', 2), '-', -1) AS `id`,
    SUBSTRING_INDEX( SUBSTRING_INDEX(`c`.`concat`, '-', 3), '-', -1) AS `rate`
FROM (
    -- One row per currency, carrying the packed 'datestamp-id-rate' string of its
    -- latest matching exchange rate (NULL when no rate matches the criteria).
    SELECT `currencies`.`currency_code`,
        `currencies`.`country`,
        (SELECT MAX(CONCAT_WS('-', `datestamp`, `id`, `rate`))
            FROM `exchange_rates`
            WHERE `criteria_x` IS NULL
            AND `criteria_y` = 'A'
            AND `exchange_rates`.`currency_code` = `currencies`.`currency_code`
        ) AS `concat`
    FROM `currencies`
) AS `c`
ORDER BY `c`.`country`

Muhammad Raheel answered Oct 17 '22 00:10