
Split string by known substrings

I've got a list of different strings (this is an example):

strs = ["FOOBAR", "PYTHON", "MAPARTS"]

and I've got another list with substrings that one of the strings in the previous list might contain:

substrs = ["ARTS", "FOO", "PY", "BAR", "MAP"]

I want to build a list containing every string in strs that can be split into exactly two strings from substrs, with each such string replaced by its two parts wrapped in a list or tuple. The finished list would look like:

[("FOO", "BAR"), ("MAP", "ARTS")]

I can't wrap my head around how to manage it, at least in a simple way. Any help?

asked Jan 25 '23 09:01 by maromalo


1 Answer

Here is one approach, if you want to check whether any ordered pair of tokens concatenates to a word listed in words:

from itertools import product

words = ["FOOBAR", "PYTHON", "MAPARTS"]

tokens = ["ARTS", "FOO", "PY", "BAR", "MAP"]

pairs = [pair for pair in product(tokens, repeat=2) if ''.join(pair) in words]

Resulting in:

>>> pairs
[('FOO', 'BAR'), ('MAP', 'ARTS')]
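
If the token list is large, trying every pair of tokens can get expensive, and the result comes back in token order rather than in the order of the original strings. A possible alternative, sketched below, walks each string and tries every cut position instead. The function name split_by_known_parts is hypothetical, and the sketch assumes you only want the first valid split of each string.

```python
def split_by_known_parts(strings, parts):
    """Return a (prefix, suffix) tuple for each string made of exactly two known parts."""
    known = set(parts)  # set membership tests are O(1) on average
    result = []
    for s in strings:
        # Try every cut position; keep the first one where both halves are known.
        for i in range(1, len(s)):
            prefix, suffix = s[:i], s[i:]
            if prefix in known and suffix in known:
                result.append((prefix, suffix))
                break  # assumption: one split per string is enough
    return result


strs = ["FOOBAR", "PYTHON", "MAPARTS"]
substrs = ["ARTS", "FOO", "PY", "BAR", "MAP"]
print(split_by_known_parts(strs, substrs))
# [('FOO', 'BAR'), ('MAP', 'ARTS')]
```

The results here follow the order of strs, and a string that cannot be split (such as "PYTHON") is simply skipped. Remove the break if you want every valid split of a string rather than just the first.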
answered Feb 03 '23 11:02 by accdias