Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to replace every other instance of a particular character in a MySQL string?

How to replace value in mysql column by query like, Column is options and its of type varchar(255)

From

id   options
1    A|10|B|20|C|30
2    A|Positive|B|Negative

To

id   options
1    A|10,B|20,C|30
2    A|Positive,B|Negative

I am doing it by php like this.

<?php
    $str =  "A|10|B|20|C|30";
    $arr = explode("|",$str);
    $newArr = array();
    for($i=0;$i<count($arr);$i+=2){
      if($arr[$i] && $arr[$i+1]){
        $newArr[] = $arr[$i]."|".$arr[$i+1];
      }
    }
    echo "Before:".$str."\n";
    echo "After :".implode(",",$newArr);
?>

https://eval.in/841007

So instead of PHP, I want to do this in MySQL.

like image 342
GRESPL Nagpur Avatar asked Aug 03 '17 08:08

GRESPL Nagpur


2 Answers

You should consider to store your data in a normalized schema. In your case the table should look like:

| id | k |        v |
|----|---|----------|
|  1 | A |       10 |
|  1 | B |       20 |
|  1 | C |       30 |
|  2 | A | Positive |
|  2 | B | Negative |

This schema is more flexible and you will see why.

So how to convert the given data into the new schema? You will need a helper table containing sequence numbers. Since your column is varchar(255) you can only store 128 values (+ 127 delimiters) in it. But let's just create 1000 numbers. You can use any table with enough rows. But since any MySQL server has the information_schema.columns table, I will use it.

drop table if exists helper_sequence;
create table helper_sequence (i int auto_increment primary key)
    select null as i
    from information_schema.columns c1
    join information_schema.columns c2
    limit 1000;

We will use this numbers as position of the values in your string by joining the two tables.

To extract a value from a delimited string you can use the substring_index() function. The value at position i will be

substring_index(substring_index(t.options, '|', i  ), '|', -1)

In your string you have a sequence of keys followed by its values. The position of a key is an odd number. So if the position of the key is i, the position of the corresponding value will be i+1

To get the number of the delimiters in the string and limit our join we can use

char_length(t.options) - char_length(replace(t.options, '|', ''))

The query to store the data in a normalized form would be:

create table normalized_table
    select t.id
        , substring_index(substring_index(t.options, '|', i  ), '|', -1) as k
        , substring_index(substring_index(t.options, '|', i+1), '|', -1) as v
    from old_table t
    join helper_sequence s
      on s.i <= char_length(t.options) - char_length(replace(t.options, '|', ''))
    where s.i % 2 = 1

Now run select * from normalized_table and you will get this:

| id | k |        v |
|----|---|----------|
|  1 | A |       10 |
|  1 | B |       20 |
|  1 | C |       30 |
|  2 | A | Positive |
|  2 | B | Negative |

So why is this format a better choice? Besides many other reasons, one is that you can easily convert it to your old schema with

select id, group_concat(concat(k, '|', v) order by k separator '|') as options
from normalized_table
group by id;

| id |               options |
|----|-----------------------|
|  1 |        A|10|B|20|C|30 |
|  2 | A|Positive|B|Negative |

or to your desired format

select id, group_concat(concat(k, '|', v) order by k separator ',') as options
from normalized_table
group by id;

| id |               options |
|----|-----------------------|
|  1 |        A|10,B|20,C|30 |
|  2 | A|Positive,B|Negative |

If you don't care about normalization and just want this task to be done, you can update your table with

update old_table o
join (
    select id, group_concat(concat(k, '|', v) order by k separator ',') as options
    from normalized_table
    group by id
) n using (id)
set o.options = n.options;

And drop the normalized_table.

But then you won't be able to use simple queries like

select *
from normalized_table
where k = 'A'

See demo at rextester.com

like image 189
Paul Spiegel Avatar answered Oct 01 '22 12:10

Paul Spiegel


Not using stored procedures, I would do it in 2 steps:

  1. Insert the comma at the second occurrence of the pipe character:

    update options set options = insert(options, locate('|', options, locate('|', options) + 1), 1, ',');
    
  2. Insert the remaining commas - execute the query N times:

    update options set options = insert(options, locate('|', options, locate('|', options, length(options) - locate(',', reverse(options)) + 1) + 1), 1, ',');
    

    where N =

    select max(round(((length(options) - length(replace(options, '|', ''))) - 1 ) / 2) - 1) from options;
    

    (or don't bother with counting and continue to execute the query as long as it doesn't tell you "0 rows affected")

Checked with this set of data:

id   options
1    A|10|B|20|C|30
2    A|Positive|B|Negative
3    A|10|B|20|C|30|D|40|E|50|F|60
4    A|Positive|B|Negative|C|Neutral|D|Dunno

results in:

id   options
1    A|10,B|20,C|30
2    A|Positive,B|Negative
3    A|10,B|20,C|30,D|40,E|50,F|60
4    A|Positive,B|Negative,C|Neutral,D|Dunno

(I'll provide an explanation later)

like image 23
Taz Avatar answered Oct 01 '22 10:10

Taz