With geographic data records like this:
          START          |          END
CITY1     | STATE1 | CITY2     | STATE2
----------------------------------------------
New York  | NY     | Boston    | MA
Newark    | NJ     | Albany    | NY
Cleveland | OH     | Cambridge | MA
I would like to output something like this, which counts the START/END state pairings and displays them as a matrix:
   | MA | NJ | NY | OH
------------------------------
MA | 0  | 0  | 1  | 0
NJ | 0  | 0  | 1  | 0
NY | 1  | 0  | 0  | 0
OH | 1  | 0  | 0  | 0
I can see how GROUP BY and COUNT will find the data, but I'm lost on how to display it as a matrix. Does anyone have any ideas?
This seems to do the trick, tested on PostgreSQL 9.1. It will almost certainly need to be adapted for SQL Server (anyone feel free to update my answer to that effect).
SELECT start AS state,
       -- each comparison is a boolean; ::INT turns it into 1 or 0
       SUM((dest = 'MA')::INT) AS MA,
       SUM((dest = 'NJ')::INT) AS NJ,
       SUM((dest = 'NY')::INT) AS NY,
       SUM((dest = 'OH')::INT) AS OH
FROM (
    -- list every route in both directions
    SELECT state1 AS start, state2 AS dest
    FROM routes
    UNION ALL
    SELECT state2 AS start, state1 AS dest
    FROM routes
) AS s
GROUP BY start
ORDER BY start;
However, note that my output is slightly different from yours. I'm not sure whether that's because your sample output is wrong or because I misunderstood your requirements:
 state | ma | nj | ny | oh
-------+----+----+----+----
 MA    |  0 |  0 |  1 |  1
 NJ    |  0 |  0 |  1 |  0
 NY    |  1 |  1 |  0 |  0
 OH    |  1 |  0 |  0 |  0
(4 rows)
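This query works by reading the table twice, once for the state1 -> state2 routes and a second time for the state2 -> state1 routes, then concatenating the two result sets with UNION ALL, so every route is counted in both directions.
Then, after grouping by origin state, it runs one SUM() per destination column. Each comparison such as dest = 'MA' is cast to 1 or 0, so the sum is the number of routes between that row's state and that column's state.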
This strategy should be easy to adapt for any RDBMS.
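For engines that lack the PostgreSQL-style ::INT cast (SQL Server among them), one common way to write the same pivot is SUM(CASE WHEN ... THEN 1 ELSE 0 END). Here is a minimal sketch of that variant, run against an in-memory SQLite database through Python's sqlite3 module so it can be tried without a server. I have not tested it on SQL Server. The table name routes and the state1/state2 columns follow the query above; the city1/city2 column names, the alias origin (used in place of start) and the helper pairing_matrix are my own assumptions for illustration.

```python
import sqlite3

# Sample rows copied from the question.
ROUTES = [
    ("New York", "NY", "Boston", "MA"),
    ("Newark", "NJ", "Albany", "NY"),
    ("Cleveland", "OH", "Cambridge", "MA"),
]

# Same pivot as the PostgreSQL query, but with CASE instead of the ::INT cast.
# "origin" is an assumed alias standing in for "start".
PIVOT_SQL = """
SELECT origin AS state,
       SUM(CASE WHEN dest = 'MA' THEN 1 ELSE 0 END) AS MA,
       SUM(CASE WHEN dest = 'NJ' THEN 1 ELSE 0 END) AS NJ,
       SUM(CASE WHEN dest = 'NY' THEN 1 ELSE 0 END) AS NY,
       SUM(CASE WHEN dest = 'OH' THEN 1 ELSE 0 END) AS OH
FROM (
    SELECT state1 AS origin, state2 AS dest FROM routes
    UNION ALL
    SELECT state2 AS origin, state1 AS dest FROM routes
) AS s
GROUP BY origin
ORDER BY origin
"""


def pairing_matrix(rows):
    """Load rows into an in-memory table and return the pivoted counts."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE routes (city1 TEXT, state1 TEXT, city2 TEXT, state2 TEXT)"
        )
        conn.executemany("INSERT INTO routes VALUES (?, ?, ?, ?)", rows)
        return conn.execute(PIVOT_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in pairing_matrix(ROUTES):
        print(row)
```

As in the original query, the destination states are hard-coded as columns, so a state that is not listed simply never shows up as a column; the list has to be extended by hand or generated before the query runs.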