Find first character that is different between two strings

Tags:

string

php

Given two equal-length strings, is there an elegant way to get the offset of the first different character?

The obvious solution would be:

for ($offset = 0; $offset < $length; ++$offset) {
    if ($str1[$offset] !== $str2[$offset]) {
        return $offset;
    }
}

But that doesn't look quite right, for such a simple task.

446

asked Sep 19 '11 18:09

NikiC

1 Answer

You can use a nice property of bitwise XOR (^) to achieve this: when you XOR two strings together, every byte that is the same in both strings becomes a null byte ("\0"). So after XORing the two strings, we just need the position of the first non-null byte. strspn gives us exactly that, because it returns the length of the leading run of "\0" bytes:

$position = strspn($string1 ^ $string2, "\0");
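To see why this works, it can help to look at the XOR result itself. The following is only a small sketch; the variable names are illustrative, and bin2hex is used purely to make the unprintable bytes visible:

```php
<?php
$a = 'foobarbaz';
$b = 'foobarbiz';

// XOR on two strings works byte by byte; equal bytes yield "\0".
$xor = $a ^ $b;

// Only the 8th byte differs: 'a' (0x61) ^ 'i' (0x69) = 0x08.
echo bin2hex($xor), "\n";        // 000000000000000800

// strspn counts the leading run of "\0" bytes, i.e. the matching prefix.
echo strspn($xor, "\0"), "\n";   // 7
```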

That's all there is to it. So let's look at an example:

$string1 = 'foobarbaz';
$string2 = 'foobarbiz';
$pos = strspn($string1 ^ $string2, "\0");

printf(
    'First difference at position %d: "%s" vs "%s"',
    $pos, $string1[$pos], $string2[$pos]
);

That will output:

First difference at position 7: "a" vs "i"

So that should do it. It's very efficient, since both the XOR and strspn run in C, and the only extra memory needed is one temporary string holding the XOR result.
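One case the example above does not cover is two identical strings: the run of "\0" bytes then spans the whole string, so the returned position equals the string length and indexing with it would be out of range. A minimal sketch of a wrapper that handles this, assuming equal-length inputs as in the question (the function name and the null return value are my own choices, not part of the original answer):

```php
<?php
// Hypothetical helper: returns the byte offset of the first difference,
// or null when the two equal-length strings are identical.
function firstDifferenceOffset(string $a, string $b): ?int
{
    $offset = strspn($a ^ $b, "\0");

    // For identical strings the "\0" run covers the entire string.
    return $offset === strlen($a) ? null : $offset;
}

var_dump(firstDifferenceOffset('foobarbaz', 'foobarbiz')); // int(7)
var_dump(firstDifferenceOffset('same', 'same'));           // NULL
```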

Edit: A MultiByte Solution Along The Same Lines:

function getCharacterOffsetOfDifference($str1, $str2, $encoding = 'UTF-8') {
    return mb_strlen(
        mb_strcut(
            $str1,
            0, strspn($str1 ^ $str2, "\0"),
            $encoding
        ),
        $encoding
    );
}

First, the byte offset of the difference is found using the method above; then that byte offset is mapped to a character offset. The mapping uses mb_strcut, which is basically substr (it takes byte offsets) but honors multibyte character boundaries, so a cut that would land in the middle of a character is moved back to the start of that character.
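The two steps can be traced separately. This sketch assumes the mbstring extension is loaded and PHP 7 or later (for the "\u{...}" escapes, which produce UTF-8 bytes regardless of the source file's encoding):

```php
<?php
$str1 = "f\u{A9}o"; // "f©o": the © is encoded as the two bytes C2 A9
$str2 = "f\u{AA}a"; // "fªa": the ª is encoded as the two bytes C2 AA

// Byte level: "f" and the lead byte C2 match, so the first differing
// byte is at offset 2, in the middle of the second character.
$byteOffset = strspn($str1 ^ $str2, "\0");
var_dump($byteOffset); // int(2)

// mb_strcut will not split the two-byte character, so only "f" remains.
$prefix = mb_strcut($str1, 0, $byteOffset, 'UTF-8');
var_dump($prefix); // string(1) "f"

// Character level: the length of that prefix is the character offset.
var_dump(mb_strlen($prefix, 'UTF-8')); // int(1)
```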

var_dump(getCharacterOffsetOfDifference('foo', 'foa')); // 2
var_dump(getCharacterOffsetOfDifference('©oo', 'foa')); // 0
var_dump(getCharacterOffsetOfDifference('f©o', 'fªa')); // 1

It's not as elegant as the first solution, but it's still a one-liner (and a little simpler if you rely on the default encoding):

return mb_strlen(mb_strcut($str1, 0, strspn($str1 ^ $str2, "\0")));

132

answered Oct 01 '22 05:10

ircmaxell
