I'm trying to split a binary string into an array of runs of repeated characters.
For example, the string 10001101 split with this function would be:
$arr[0] = '1';
$arr[1] = '000';
$arr[2] = '11';
$arr[3] = '0';
$arr[4] = '1';
(I tried to make myself clear, but if you still don't understand, my question is the same as this one but for PHP, not Python)
You can use preg_split like so:
$in = "10001101";
$out = preg_split('/(.)(?!\1|$)\K/', $in);
print_r($out);
Array
(
[0] => 1
[1] => 000
[2] => 11
[3] => 0
[4] => 1
)
The regex:
(.) - match a single character and capture it.
(?!\1|$) - look at the next position and match only if it is neither the same character as the one we just captured nor the end of the string.
\K - keeps the text matched so far out of the overall regex match, making this match zero-width.
Note: this does not work in PHP versions prior to 5.6.13, as there was a bug involving bump-along behavior with \K.
An alternative regex that works in earlier versions as well is:
$out = preg_split('/(?<=(.))(?!\1|$)/', $in);
This uses a lookbehind rather than \K in order to make the match zero-width.
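As a minimal sketch of how the lookbehind variant could be wrapped for reuse: the function name split_runs is made up for illustration, and the PREG_SPLIT_NO_EMPTY flag is an addition not in the answer above, used here so that an empty input yields an empty array.

```php
<?php
// Hypothetical helper (name is an assumption, not part of the answer above).
// Splits a string into runs of identical characters using the lookbehind regex.
function split_runs(string $in): array
{
    // (?<=(.))  look behind and capture the previous character
    // (?!\1|$)  split only if the next character differs and we are not at the end
    // PREG_SPLIT_NO_EMPTY drops empty pieces, so '' gives [] instead of [''].
    $out = preg_split('/(?<=(.))(?!\1|$)/', $in, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false on a regex error; treat that as "no runs" here.
    return $out === false ? [] : $out;
}

print_r(split_runs('10001101'));
print_r(split_runs('aabbccddee'));
```

Because . does not match a newline by default, this sketch assumes single-line input.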
<?php
$s = '10001101';
preg_match_all('/((.)\2*)/',$s,$m);
print_r($m[0]);
/*
Array
(
[0] => 1
[1] => 000
[2] => 11
[3] => 0
[4] => 1
)
*/
?>
Matches runs of one or more repeated characters. The regex stores the repeated character in the second capture group ((.), available as $m[2]), while the first capture group contains the entire run (((.)\2*), available as $m[1], which here is identical to the full match in $m[0]). With preg_match_all, it does this globally over the entire string. This works for any string, e.g. 'aabbccddee'. If you want to limit it to just 0 and 1, use [01] instead of . in the second capture group.
Keep in mind that the result may be empty: for an empty input string, $m[0] is an empty array, so check it first, e.g. with !empty($m[0]), before you use it.
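A small sketch of the [01] restriction combined with an emptiness check might look like the following. The function name binary_runs is hypothetical, and the regex is simplified to a single capture group since only the full matches are needed.

```php
<?php
// Hypothetical helper (name is an assumption): returns the runs of identical
// digits found in a binary string, or an empty array if there are none.
function binary_runs(string $s): array
{
    // ([01]) captures one binary digit; \1* extends the match over its repeats.
    $count = preg_match_all('/([01])\1*/', $s, $m);

    // preg_match_all returns the number of full matches, or false on error.
    if ($count === false || $count === 0) {
        return [];
    }

    // $m[0] holds the full matches, i.e. the runs themselves.
    return $m[0];
}

print_r(binary_runs('10001101'));
var_dump(binary_runs(''));
```

Note that with this pattern any character other than 0 or 1 is silently skipped rather than rejected, so input validation, if needed, would have to happen separately.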