Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find smallest substring containing a given set of letters in a larger string

Say you have the following string:

FJKAUNOJDCUTCRHBYDLXKEODVBWTYPTSHASQQFCPRMLDXIJMYPVOHBDUGSMBLMVUMMZYHULSUIZIMZTICQORLNTOVKVAMQTKHVRIFMNTSLYGHEHFAHWWATLYAPEXTHEPKJUGDVWUDDPRQLUZMSZOJPSIKAIHLTONYXAULECXXKWFQOIKELWOHRVRUCXIAASKHMWTMAJEWGEESLWRTQKVHRRCDYXNT
LDSUPXMQTQDFAQAPYBGXPOLOCLFQNGNKPKOBHZWHRXAWAWJKMTJSLDLNHMUGVVOPSAMRUJEYUOBPFNEHPZZCLPNZKWMTCXERPZRFKSXVEZTYCXFRHRGEITWHRRYPWSVAYBUHCERJXDCYAVICPTNBGIODLYLMEYLISEYNXNMCDPJJRCTLYNFMJZQNCLAGHUDVLYIGASGXSZYPZKLAWQUDVNTWGFFY
FFSMQWUNUPZRJMTHACFELGHDZEJWFDWVPYOZEVEJKQWHQAHOCIYWGVLPSHFESCGEUCJGYLGDWPIWIDWZZXRUFXERABQJOXZALQOCSAYBRHXQQGUDADYSORTYZQPWGMBLNAQOFODSNXSZFURUNPMZGHTAJUJROIGMRKIZHSFUSKIZJJTLGOEEPBMIXISDHOAIFNFEKKSLEXSJLSGLCYYFEQBKIZZTQQ
XBQZAPXAAIFQEIXELQEZGFEPCKFPGXULLAHXTSRXDEMKFKABUTAABSLNQBNMXNEPODPGAORYJXCHCGKECLJVRBPRLHORREEIZOBSHDSCETTTNFTSMQPQIJBLKNZDMXOTRBNMTKHHCZQQMSLOAXJQKRHDGZVGITHYGVDXRTVBJEAHYBYRYKJAVXPOKHFFMEPHAGFOOPFNKQAUGYLVPWUJUPCUGGIXGR
AMELUTEPYILBIUOCKKUUBJROQFTXMZRLXBAMHSDTEKRRIKZUFNLGTQAEUINMBPYTWXULQNIIRXHHGQDPENXAJNWXULFBNKBRINUMTRBFWBYVNKNKDFR

I'm trying to find the smallest substring containing the letters ABCDA.

I tried a regex approach.

console.log(str.match(/[A].*?[B].*?[C].*?[D].*?[A]/gm).sort((a, b) => a.length - b.length)[0]);

This works, but it only find strings where ABCDA appear (in that order). Meaning it won't find substring where the letters appear in a order like this: BCDAA

I'm trying to change my regex to account for this. How would I do that without using | and type out all the different cases?

like image 728
Nilzone- Avatar asked Dec 20 '15 12:12

Nilzone-


People also ask

How do you check if a smaller string occurs within a bigger string?

This can be done by running a nested loop traversing the given string and in that loop running another loop checking for sub-strings starting from every index.

How do you find a specific substring in a string?

To locate a substring in a string, use the indexOf() method. Let's say the following is our string. String str = "testdemo"; Find a substring 'demo' in a string and get the index.


1 Answers

You can't.

Let's consider a special case: Assume the letters you are looking for are A, A, and B. At some point in your regexp there will certainly be a B. However, the parts to the left and to the right of the B are independent of each other, so you cannot refer from one to the other. How many As are matched in the subexpression to the right of the B depends on the number of As being already matched in the left part. This is not possible with regular expressions, so you will have to unfold all the different orders, which can be many!

Another popular example that illustrates the problem is to match opening brackets with closing brackets. It's not possible to write a regular expression asserting that in a given string a sequence of opening brackets is followed by a sequence of closing brackets of the same length. The reason for this is that to count the brackets you would need a stack machine in contrast to a finite state machine but regular expressions are limited to patterns that can be matched using FSMs.

like image 159
lex82 Avatar answered Sep 21 '22 16:09

lex82