
php regex find substring in substring

Tags: regex, php

I am still experimenting with word matching for a project.

Let's assume I have a given string, say maxmuster. For an arbitrary word such as maxs, I want to mark the part of that word that also occurs in maxmuster, with the letters contiguous and in the same order.

I will give some examples and then explain what I have already tried. Let's keep the string maxmuster. In each example, the part that the regex should match is shown after the arrow (ideally in PHP, but Python, Bash, JavaScript, ... would also be fine):

maxs -> max

Mymaxmuis -> maxmu

Lemu -> mu

muster -> muster

Of course, single letters such as m, u, ... will then be matched as well. I know that and will fix it later. I thought the solution should not be so difficult, so I tried to split the word into groups like this:

/(maxmuster)?|(maxmuste)?|(maxmust)?|(maxmus)?|(maxmu)?|(maxm)?|(max)?|(ma)?|(m)?/gui

But then, of course, I forgot the other combinations, such as (axmuster), (xmus) and so on. Do I really have to list them all, or is there a simple regex trick that solves the problem described above?

Thank you very much

Asked by Allan Karlson, Oct 18 '22 13:10


1 Answer

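Sounds like you need string intersection. If you don't mind a non-regex idea, have a look at the PHP section of Wikibooks' Algorithm Implementation/Strings/Longest common substring. Note that the function called below is named get_longest_common_subsequence, although what it returns is the longest common substring.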

foreach(["maxs", "Mymaxmuis", "Lemu", "muster"] AS $str)
  echo get_longest_common_subsequence($str, "maxmuster") . "\n";

max
maxmu
mu
muster

See this PHP demo at tio.run (caseless comparison).
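
In case the linked code is not at hand, here is a minimal sketch of the same idea using a standard dynamic-programming table. It is not the Wikibooks implementation. The function name longest_common_substring_ci is made up for this example, and the case-insensitive comparison assumes ASCII input.

```php
<?php
// Hypothetical helper (not the Wikibooks code): longest common substring of
// $a and $b, compared case-insensitively (ASCII only), returned with the
// casing it has in $a.
function longest_common_substring_ci(string $a, string $b): string
{
    $la = strtolower($a);
    $lb = strtolower($b);
    $n = strlen($la);
    $m = strlen($lb);

    // $prev[$j] = length of the common suffix ending at $la[$i-2] and $lb[$j-1]
    $prev = array_fill(0, $m + 1, 0);
    $bestLen = 0; // length of the best match found so far
    $bestEnd = 0; // end offset (exclusive) of that match in $a

    for ($i = 1; $i <= $n; $i++) {
        $curr = array_fill(0, $m + 1, 0);
        for ($j = 1; $j <= $m; $j++) {
            if ($la[$i - 1] === $lb[$j - 1]) {
                // Extend the common run that ended one character earlier.
                $curr[$j] = $prev[$j - 1] + 1;
                if ($curr[$j] > $bestLen) {
                    $bestLen = $curr[$j];
                    $bestEnd = $i;
                }
            }
        }
        $prev = $curr;
    }

    return substr($a, $bestEnd - $bestLen, $bestLen);
}

foreach (["maxs", "Mymaxmuis", "Lemu", "muster"] as $str) {
    echo longest_common_substring_ci($str, "maxmuster"), "\n";
}
```

This should print the same four results as above (max, maxmu, mu, muster). When several substrings share the maximum length, the sketch keeps the first one found in the first argument.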


If you need a regex idea, I would join both strings with a space and use a pattern like the one in this demo:

(?=(\w+)(?=\w* \w*?\1))\w

At each position before a word character in the first string, the pattern captures, inside a lookahead, the longest substring starting there that also occurs in the second string. In PHP, the matches of the first group then need to be sorted by length, and the longest one is returned. See the PHP demo at tio.run.

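function get_longest_common_subsequence($w1="", $w2="")
{
  // Join both words with a space. preg_quote() is meant for text that goes
  // into a pattern; applied to the subject it would only add stray backslashes.
  $test_str = $w1." ".$w2;

  if(preg_match_all('/(?=(\w+)(?=\w* \w*?\1))\w/i', $test_str, $out) > 0)
  {
    // Sort the captures of group 1 by length, longest first.
    usort($out[1], function($a, $b) { return strlen($b) - strlen($a); });
    return $out[1][0];
  }
  return ""; // no common substring found
}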
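
To see what the pattern captures before any sorting, here is a small sketch that runs it on one of the example words. The variable names are only illustrative, and both words are assumed to consist of \w characters only, with no spaces inside them.

```php
<?php
// Join the word to test and the reference word with a single space,
// which is the layout the pattern expects.
$subject = "Mymaxmuis" . " " . "maxmuster";

// Group 1 holds, for each start position in the first word, the longest
// run of word characters that also occurs in the second word.
preg_match_all('/(?=(\w+)(?=\w* \w*?\1))\w/i', $subject, $out);
$captures = $out[1];
print_r($captures); // M, maxmu, axmu, xmu, mu, u, s

// Sort a copy by length, longest first, and take the first entry.
$sorted = $captures;
usort($sorted, function ($a, $b) { return strlen($b) - strlen($a); });
echo $sorted[0], "\n"; // maxmu
```

Positions where no substring matches (here the letters y and i) produce no capture at all, and nothing is captured from the second word because no space follows it.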
Answered by bobble bubble, Oct 29 '22 13:10