I'm looking for a way to output a selected related record for each record in a table in MySQL. I'll explain further...
I have 2 tables currencies and exchange_rates. The tables are joined by a currency_code field and each currency record has multiple related exchange rate records, each exchange rate record represents a different day. So there is a 1:many relationship between currencies and exchange_rates.
I want to retrieve a full record from the exchange_rates
table for each currency but with the ability to define specific criteria as to which related record to select. Not just the most recent exchange_rate for each currency but maybe the most recent exchange_rates
record for each currency that has the field criteria_x=NULL
.
It's a shame that you can't use LIMIT
within a derived table otherwise something like this would be a neat and readable solution...
SELECT `currencies`.`currency_code`, `currencies`.`country`, `exchange_rates`.`id`,
FROM_UNIXTIME(`exchange_rates`.`datestamp`), `rate`
FROM `currencies`
INNER JOIN (
SELECT `id`, `currency_code`, `invoice_id`, `datestamp`, `rate`
FROM `exchange_rates`
WHERE `criteria_x`=NULL AND `criteria_y` LIKE 'A'
ORDER BY `datestamp` DESC
LIMIT 0, 1
) AS `exchange_rates` ON `currencies`.`currency_code`=`exchange_rates`.`currency_code`
ORDER BY `currencies`.`country`
The LIMIT
clause is applied to the parent query not the derived table.
This is the only way I've found to do this...
SELECT `currencies`.`currency_code`, `currencies`.`country`,
FROM_UNIXTIME( SUBSTRING_INDEX( SUBSTRING_INDEX(`exchange_rates`.`concat`, '-', 1), '-', -1)) AS `datestamp`,
SUBSTRING_INDEX( SUBSTRING_INDEX(`exchange_rates`.`concat`, '-', 2), '-', -1) AS `id`,
SUBSTRING_INDEX( SUBSTRING_INDEX(`exchange_rates`.`concat`, '-', 3), '-', -1) AS `rate`
FROM `currencies`
INNER JOIN (
SELECT `currency_code`, MAX(CONCAT_WS('-', `datestamp`, `id`, `rate`)) AS `concat`
FROM `exchange_rates`
WHERE `criteria_x`=NULL AND `criteria_y` LIKE 'A'
GROUP BY `exchange_rates`.`currency_code`
) AS `exchange_rates` ON `currencies`.`currency_code`=`exchange_rates`.`currency_code`
ORDER BY `currencies`.`country`
So concatenating a bunch of fields together and running a MAX()
on it to get my sort order within the group, then parsing those fields out in the parent query with SUBSTRING_INDEX()
. The problem is that this method only works when I can use a MIN()
or MAX()
on the concatenated field. It wouldn't be ideal if I wanted to sort a string or sort by multiple criteria but limit to a single record.
Also it causes me physical pain to have to resort to horrible string manipulation to get the data I want from a relational database — there has to be a better way!
Anyone got any suggestions of a better method?
There are a few general issues to discuss (briefly) before trying to provide an answer.
Your first query is:
SELECT `currencies`.`currency_code`, `currencies`.`country`, `exchange_rates`.`id`,
FROM_UNIXTIME(`exchange_rates`.`datestamp`), `rate`
FROM `currencies`
INNER JOIN (
SELECT `id`, `currency_code`, `invoice_id`, `datestamp`, `rate`
FROM `exchange_rates`
WHERE `criteria_x`=NULL AND `criteria_y` LIKE 'A'
ORDER BY `datestamp` DESC
LIMIT 0, 1
) AS `exchange_rates` ON `currencies`.`currency_code`=`exchange_rates`.`currency_code`
ORDER BY `currencies`.`country`
criteria_x = NULL
notation; that should be written as criteria_x IS NULL
. MySQL may allow it; as long as you are aware that it is non-standard, it is OK for you to use.LIKE 'A'
is not sensible if it contains no metacharacters (%
or _
in standard SQL). You'd be better off with simple equality: = 'A'
.Your question says:
I want to retrieve a full record from the
exchange_rates
table for each currency but with the ability to define specific criteria as to which related record to select. Not just the most recent exchange rate for each currency, but maybe the most recent exchange rate for each currency that has the fieldcriteria_x IS NULL
.
So, you want to select the most recent exchange rate record for each currency that meets the required other criteria. We can assume that there is a unique constraint on the combination of currency_code
and datestamp
in the exchange rate table; this means that there will always be at most one matching row. You've not specified what should be shown if there is no matching row; an inner join will simply not list that currency, of course.
With SQL queries, I usually build and test the overall query in steps, adding extra material to the previously developed queries that are known to work and produce the right output. If it is simple and/or I've collected too much hubris, I'll try a complex query first, but when (nemesis) it doesn't work, then I go back to the build and test process. Think of it as Test Driven (Query) Development.
SELECT id, currency_code, invoice_id, datestamp, rate
FROM exchange_rates
WHERE criteria_x IS NULL AND criteria_y = 'A'
ORDER BY currency_code, datestamp DESC
SELECT currency_code, MAX(datestamp)
FROM exchange_rates
WHERE criteria_x IS NULL AND criteria_y = 'A'
GROUP BY currency_code
SELECT x.id, x.currency_code, x.invoice_id, x.datestamp, x.rate
FROM exchange_rates AS x
JOIN (SELECT currency_code, MAX(datestamp) AS datestamp
FROM exchange_rates
WHERE criteria_x IS NULL AND criteria_y = 'A'
GROUP BY currency_code
) AS m
ON x.currency_code = m.currency_code AND x.datestamp = m.datestamp
This requires the joining the currencies table with the output of the previous query:
SELECT c.currency_code, c.country, r.id,
FROM_UNIXTIME(r.datestamp), r.rate
FROM currencies AS c
JOIN (SELECT x.id, x.currency_code, x.invoice_id, x.datestamp, x.rate
FROM exchange_rates AS x
JOIN (SELECT currency_code, MAX(datestamp) AS datestamp
FROM exchange_rates
WHERE criteria_x IS NULL AND criteria_y = 'A'
GROUP BY currency_code
) AS m
ON x.currency_code = m.currency_code AND x.datestamp = m.datestamp
) AS r
ON c.currency_code = r.currency_code
ORDER BY c.country
Except that Oracle only allows ') r
' instead of ') AS r
' for table aliases and the use of FROM_UNIXTIME()
, I believe that should work correctly with the current version of almost any SQL DBMS you care to mention.
Since the invoice ID is not returned in the final query, we can remove that from the select-list of the middle query. A good optimizer might do that automatically.
If you want to see the currency information even if there is no exchange rate that matches the criteria, then you need to change the JOIN in the outermost query to a LEFT JOIN (aka LEFT OUTER JOIN). If you only want to see a subset of the currencies, you can apply that filter at either the last (outermost) query stage, or (if the filter is based on information available in the exchange rate table, such as the currency code) at either the innermost sub-query (most efficient) or the middle sub-query (not so efficient unless the optimizer realizes it can push the filter down to the innermost sub-query).
Correctness is usually the primary criterion; performance is a secondary criterion. However, performance was mentioned in the question. The first rule is to measure the 'simple' query shown here. Only if that proves too slow do you need to worry further. When you do need to worry, you examine the query plan to see if there is, for example, a crucial index missing. Only if the query still isn't fast enough do you start trying to resort to other tricks. Those tricks tend to be very specific to a particular DBMS. For example, there might be optimizer hints that you can use to make the DBMS process the query differently.
If I've understood your problem correctly, all you need to do is self-join exchange_rates
to select the rate of interest:
SELECT currencies.currency_code,
currencies.country,
exchange_rates.id,
FROM_UNIXTIME(exchange_rates.datestamp),
exchange_rates.rate
FROM currencies
JOIN (
SELECT currency_code, MAX(datestamp) AS datestamp
FROM exchange_rates
WHERE criteria_x IS NULL AND criteria_y LIKE 'A'
GROUP BY currency_code
) AS exchange_wantd USING (currency_code)
JOIN exchange_rates USING (currency_code, datestamp)
ORDER BY currencies.country
Try this query. It is expected to work fine but if you provide some data i will be able to do it properly
SELECT `currencies`.`currency_code` as `CurrencyCode`,
`currencies`.`country`,
FROM_UNIXTIME( SUBSTRING_INDEX( SUBSTRING_INDEX(`exchange_rates`.`concat`, '-', 1), '-', -1)) AS `datestamp`,
SUBSTRING_INDEX( SUBSTRING_INDEX(`exchange_rates`.`concat`, '-', 2), '-', -1) AS `id`,
SUBSTRING_INDEX( SUBSTRING_INDEX(`exchange_rates`.`concat`, '-', 3), '-', -1) AS `rate`,
(SELECT
MAX(CONCAT_WS('-', `datestamp`, `id`, `rate`)) AS `concat`
FROM `exchange_rates`
WHERE `criteria_x`= NULL
AND `criteria_y` LIKE 'A'
GROUP BY `exchange_rates`.`currency_code`
HAVING `exchange_rates`.`currency_code` =`CurrencyCode`
) as `Concat`
FROM `currencies`
ORDER BY `currencies`.`country`
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With