This is a fairly new exercise for me, but I need to find a way to identify pattern sequences within a table. For example, let's say I have a simple table that resembles the following:
Now what I would like to do is identify and group all the records that match the sequenced pattern of values 5, 9 and 6, and present them in a query. How would you accomplish this task using T-SQL?
The results should look like this:
I've looked for some potential examples of how this might be accomplished but couldn't find anything that really helps.
You can use the following query, wrapped in a CTE, to assign sequence numbers to the values contained in your sequence:
;WITH Seq AS (
    SELECT v, ROW_NUMBER() OVER (ORDER BY k) AS rn
    FROM (VALUES (1, 5), (2, 9), (3, 6)) x(k, v)
)
SELECT v, rn
FROM Seq;
Output:
v rn
-------
5 1
9 2
6 3
Using the above CTE you can identify islands, i.e. slices of sequential rows containing the whole of the sequence:
;WITH Seq AS (
    SELECT v, ROW_NUMBER() OVER (ORDER BY k) AS rn
    FROM (VALUES (1, 5), (2, 9), (3, 6)) x(k, v)
), Grp AS (
    SELECT [Key], [Value],
           ROW_NUMBER() OVER (ORDER BY [Key]) - rn AS grp
    FROM mytable AS m
    LEFT JOIN Seq AS s ON m.Value = s.v
)
SELECT *
FROM Grp
Output:
Key Value grp
-----------------
1 5 0
2 9 0
3 6 0
6 5 3
7 9 3
8 6 3
The grp field helps you identify exactly these islands.
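To make the arithmetic behind grp concrete, here is a minimal Python sketch of the same idea. It is not part of the original answer: the sample rows are hypothetical (the question's table is not reproduced here), and the names `position` and `island_ids` are made up for illustration.

```python
# Hypothetical sample rows (key, value); an assumption, not the question's data.
rows = [(1, 5), (2, 9), (3, 6), (4, 2), (5, 7), (6, 5), (7, 9), (8, 6)]
pattern = [5, 9, 6]

# Counterpart of the Seq CTE: value -> 1-based position in the pattern.
position = {v: rn for rn, v in enumerate(pattern, start=1)}


def island_ids(rows, position):
    """Mimic the Grp CTE: row number over all rows minus pattern position."""
    result = []
    for row_number, (key, value) in enumerate(sorted(rows), start=1):
        rn = position.get(value)  # None plays the role of SQL NULL
        grp = None if rn is None else row_number - rn
        result.append((key, value, grp))
    return result


grouped = island_ids(rows, position)
for row in grouped:
    print(row)
```

Rows that sit in consecutive positions and follow the pattern order end up with the same grp, because the row number and the pattern position both increase by one from row to row. The grp numbers themselves are only labels; what matters is that they are equal within a run.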
All you need to do now is to just filter out partial groups:
;WITH Seq AS (
    -- Number the pattern values in the order they must appear
    SELECT v, ROW_NUMBER() OVER (ORDER BY k) AS rn
    FROM (VALUES (1, 5), (2, 9), (3, 6)) x(k, v)
), Grp AS (
    -- Rows that continue the pattern in order share the same grp
    SELECT [Key], [Value],
           ROW_NUMBER() OVER (ORDER BY [Key]) - rn AS grp
    FROM mytable AS m
    LEFT JOIN Seq AS s ON m.Value = s.v
)
SELECT g1.[Key], g1.[Value]
FROM Grp AS g1
INNER JOIN (
    -- Keep only groups that contain all three pattern values
    SELECT grp
    FROM Grp
    GROUP BY grp
    HAVING COUNT(*) = 3
) AS g2 ON g1.grp = g2.grp
Note: The initial version of this answer used an INNER JOIN to Seq. This won't work if the table contains values like 5, 42, 9, 6, as 42 will be filtered out by the INNER JOIN and this sequence will be falsely identified as a valid one. Credit goes to @HABO for this edit.
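The difference between the two joins can be traced by hand with a small Python sketch. This is only an illustration under assumptions: the list `values` stands for the hypothetical rows 5, 42, 9, 6 ordered by key, and `full_groups` is a made-up helper, not something from the answer.

```python
from collections import Counter

pattern = [5, 9, 6]
position = {v: rn for rn, v in enumerate(pattern, start=1)}


def full_groups(values, number_all_rows):
    """Return the grp labels whose member count equals the pattern length."""
    if number_all_rows:
        # LEFT JOIN behaviour: every row keeps its own row number.
        numbered = list(enumerate(values, start=1))
    else:
        # INNER JOIN behaviour: non-matching rows vanish before numbering.
        matching = [v for v in values if v in position]
        numbered = list(enumerate(matching, start=1))
    counts = Counter(n - position[v] for n, v in numbered if v in position)
    return [grp for grp, count in counts.items() if count == len(pattern)]


values = [5, 42, 9, 6]  # hypothetical rows, ordered by key
inner_result = full_groups(values, number_all_rows=False)
left_result = full_groups(values, number_all_rows=True)
print(inner_result)  # [0]  -> false positive
print(left_result)   # []   -> 42 breaks the run, as it should
```

With the inner join, 42 is dropped before the rows are numbered, so 5, 9 and 6 receive row numbers 1, 2 and 3 and look like a complete run. With the left join, 42 still occupies row number 2, so 9 and 6 land in a different group from 5.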
Not very optimized, but I think this is a proper answer:
CREATE TABLE pattern (
    rowID INT IDENTITY(1,1) PRIMARY KEY,
    rowValue INT NOT NULL
);
INSERT INTO pattern (rowValue) VALUES (5);
INSERT INTO pattern (rowValue) VALUES (9);
INSERT INTO pattern (rowValue) VALUES (6);
SELECT * FROM pattern;
SELECT Trg.* FROM Keys Trg
INNER JOIN pattern Pt ON (Trg.fValue = Pt.rowValue)
INNER JOIN (
    SELECT K.fKey - P.rowID AS X, COUNT(*) AS Xc FROM Keys K
    LEFT JOIN pattern P ON (K.fValue = P.rowValue)
    WHERE
        (P.rowID IS NOT NULL)
    GROUP BY K.fKey - P.rowID
    HAVING COUNT(*) = (SELECT COUNT(*) FROM pattern)
) Z ON (Trg.fKey - Pt.rowID = Z.X);
I use a table for the pattern and join it to the main table. I calculate the difference between the main table's Key and the pattern's key, and I show only those rows whose difference matches and for which the number of rows sharing that difference equals the row count of the pattern table.
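The key-difference idea described above can be sketched in Python as follows. This is a rough illustration, not the answer's code: the two sample tables are hypothetical, and `match_by_key_offset` is an invented name. Because this variant subtracts from the key itself rather than from a row number, the sketch assumes the keys are consecutive integers.

```python
from collections import defaultdict

# (rowID, rowValue) pairs, mirroring the contents of the pattern table.
pattern_rows = [(1, 5), (2, 9), (3, 6)]
row_id = {value: rid for rid, value in pattern_rows}


def match_by_key_offset(keys_table):
    """Group matching rows by fKey - rowID; keep groups as large as the pattern."""
    groups = defaultdict(list)
    for f_key, f_value in keys_table:
        if f_value in row_id:
            groups[f_key - row_id[f_value]].append((f_key, f_value))
    matched = [row
               for rows in groups.values()
               if len(rows) == len(pattern_rows)
               for row in rows]
    return sorted(matched)


contiguous = [(1, 5), (2, 9), (3, 6), (4, 7)]  # hypothetical, gap-free keys
gapped = [(1, 5), (2, 9), (4, 6)]              # hypothetical, key 3 is missing
print(match_by_key_offset(contiguous))  # [(1, 5), (2, 9), (3, 6)]
print(match_by_key_offset(gapped))      # []
```

The second call shows the consequence of that assumption: a missing key shifts the difference for the rows after it, so a run that is adjacent in row order but not in key value is not reported.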