I have a text field which looks like:
option[A]sum[A]g3et[B]
I want to get the text which is inside the [ ]
without duplicates.
Meaning to get:
A
B
There can't be a case of double like [ [ ] ]
.
I know this is a horrible way to save data in databases. I can not change how the data is saved. I just need to get a very specific (one time) information from this column.
I tried to do:
SELECT substring_index(substring_index(sentence, '[', -1),']', 1)
FROM (SELECT 'THIS[A] IS A TEST' AS sentence) temp;
This gives me A
, but it will not work for many []
.
I thought of using regex however I don't know how many [ ]
I have.
How do I do that?
It is not job for DB
but it is possible:
CREATE TABLE tab(id INT, col VARCHAR(100));
INSERT INTO tab(id, col)
VALUES (1, 'option[A]sum[A]g3et[B]'), (2, '[Cosi]sum[A]g3et[ZZZZ]');
SELECT DISTINCT *
FROM (
SELECT id, RIGHT(val, LENGTH(val) - LOCATE('[', val)) AS val
FROM
(
SELECT id, SUBSTRING_INDEX(SUBSTRING_INDEX(t.col, ']', n.n), ']', -1) AS val
FROM tab t
CROSS JOIN
(
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
) n
WHERE n.n <= 1 + (LENGTH(t.col) - LENGTH(REPLACE(t.col, ']', '')))
) sub
) s
WHERE val <> ''
ORDER BY ID;
SqlFiddleDemo
Note:
Depending on col
maximum length you may need to generate more numbers in CROSS JOIN
section. For now it is up to 100.
Output:
How it works:
CROSS JOIN
]
as delimeterRIGHT(val, LENGTH(val) - LOCATE('[', val))
remove the part up to [
DISTINCT
valuesInner most query:
╔════╦══════════╗
║ id ║ val ║
╠════╬══════════╣
║ 1 ║ option[A ║
║ 1 ║ sum[A ║
║ 1 ║ g3et[B ║
║ 1 ║ ║
╚════╩══════════╝
Second subquery:
╔════╦═════╗
║ id ║ val ║
╠════╬═════╣
║ 1 ║ A ║
║ 1 ║ A ║
║ 1 ║ B ║
║ 1 ║ ║
╚════╩═════╝
And outermost query:
╔════╦═════╗
║ id ║ val ║
╠════╬═════╣
║ 1 ║ A ║
║ 1 ║ B ║
╚════╩═════╝
I need the result of query per row.. not combined
So add simple:
WHERE n.n <= 1 + (LENGTH(t.col) - LENGTH(REPLACE(t.col, ']', '')))
AND t.id = ?
EDIT 2:
see http://sqlfiddle.com/#!9/8ee95/1 your query works partially for my data. I also changed the type to longtext.
You want to parse JSON in MySQL. As I said before parse and get value in application layer. This answer is only for demo/toys purpose and will have very low performamce.
If you still insist on SQL solution:
SELECT id, val,s.n
FROM (
SELECT id, RIGHT(val, LENGTH(val) - LOCATE('[', val)) AS val,n
FROM
(
SELECT id, SUBSTRING_INDEX(SUBSTRING_INDEX(t.col, ']', n.n), ']', -1) AS val, n.n
FROM (SELECT id, REPLACE(col, '[]','') as col FROM tab) t
CROSS JOIN
(
SELECT e.N * 10000 + d.N * 1000 + c.N * 100 + a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) d
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) e
) n
WHERE n.n <= 1 + (LENGTH(t.col) - LENGTH(REPLACE(t.col, ']', '')))
) sub
) s
WHERE val <> ''
GROUP BY id, val
HAVING n <> MAX(n)
ORDER BY id,n;
SqlFiddleDemo
Output:
╔═════╦═════════════╦════╗
║ id ║ val ║ n ║
╠═════╬═════════════╬════╣
║ 1 ║ CE31285LV4 ║ 1 ║
║ 1 ║ D32E ║ 3 ║
║ 1 ║ GTX750 ║ 5 ║
║ 1 ║ M256S ║ 7 ║
║ 1 ║ H2X1T ║ 9 ║
║ 1 ║ FMLANE4U4 ║ 11 ║
╚═════╩═════════════╩════╝
EDIT 3:
What exactly is done there? Why do you need n
CROSS JOIN
and entire subquery is only tally table. That'is all. If MySQL
has function to generate number sequence (like generate_series
or prepopulated number table there is no need for CROSS JOIN
.
Number table is needed for SUBSTRING_INDEX
:
SUBSTRING_INDEX(str,delim,count)
Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. SUBSTRING_INDEX() performs a case-sensitive match when searching for delim.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With