I want to create a list from the characters in a string, but keep specific keywords together.
For example:
keywords: car, bus
INPUT:
"xyzcarbusabccar"
OUTPUT:
["x", "y", "z", "car", "bus", "a", "b", "c", "car"]
Python offers several ways to convert a string to a list. The built-in split() string method splits a string on a separator (whitespace by default) and returns the pieces as a list, so on its own it does not break a word into characters.
Other approaches build the list without split(). The list constructor accepts an iterable as its argument and returns a list whose elements are the elements of that iterable. An iterable is any object that can be iterated over, and a string is an iterable of its characters.
Individual characters in a string can be accessed by specifying the string name followed by a number in square brackets ([]). String indexing in Python is zero-based: the first character in the string has index 0, the next has index 1, and so on.
Use the list() class to split a word into a list of letters, e.g. my_list = list(my_str). Each character of the string becomes one element of the list.
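As a small illustration of the difference between list() and split() described above, here is a minimal sketch. The variable names are placeholders chosen for this example only.

```python
text = "xyzcar"

# list() iterates over the string and yields one element per character.
chars = list(text)

# split() with no arguments splits on whitespace, so a string that
# contains no whitespace comes back as a single-element list.
words = text.split()

# Indexing is zero-based: index 0 is the first character.
first = text[0]

print(chars)
print(words)
print(first)
```

Neither call knows anything about keywords, which is why the question needs a different tool.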
Use re.findall, and list your keywords first in the alternation, before the single-character fallback.
>>> import re
>>> s = "xyzcarbusabccar"
>>> re.findall('car|bus|[a-z]', s)
['x', 'y', 'z', 'car', 'bus', 'a', 'b', 'c', 'car']
In case you have overlapping keywords, note that this solution will find the first one you encounter:
>>> s = 'abcaratab'
>>> re.findall('car|rat|[a-z]', s)
['a', 'b', 'car', 'a', 't', 'a', 'b']
You can make the solution more general by substituting the [a-z] part with whatever you like, \w for example, or a simple . to match any character (any character except a newline, unless the re.DOTALL flag is set).
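For an arbitrary keyword list, the pattern can be built programmatically. The following is a minimal sketch, not part of the original answer: the function name split_keeping_keywords is hypothetical, and sorting the keywords longest-first is an added assumption so that a keyword such as "carpet" is tried before its prefix "car".

```python
import re

def split_keeping_keywords(text, keywords):
    """Split text into single characters, keeping each keyword as one item."""
    # Assumption: try longer keywords first so a keyword is not shadowed
    # by a shorter keyword that is its prefix. Empty keywords are dropped.
    ordered = sorted((k for k in keywords if k), key=len, reverse=True)

    # re.escape keeps characters such as "." or "+" in a keyword literal.
    alternatives = [re.escape(k) for k in ordered]

    # Fallback: any single character. re.DOTALL lets "." match newlines too.
    alternatives.append(".")
    pattern = re.compile("|".join(alternatives), re.DOTALL)

    # With no capturing groups, findall returns the full matches in order.
    return pattern.findall(text)

print(split_keeping_keywords("xyzcarbusabccar", ["car", "bus"]))
print(split_keeping_keywords("abcaratab", ["car", "rat"]))
```

The second call reproduces the overlapping-keyword behaviour shown earlier: "car" is consumed first, so "rat" is never matched.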
A short explanation of why this works and why the regex '[a-z]|car|bus' would not work: the regular expression engine tries the alternatives from left to right and is "eager" to return a match. That means it considers the whole alternation matched as soon as one of the options has been fully matched. At that point, it does not try any of the remaining options but stops and reports the match immediately. With '[a-z]|car|bus', the engine reports a match as soon as it sees any character in the character class [a-z] and never goes on to check whether 'car' or 'bus' could also be matched.
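The effect of the ordering can be checked directly. This short sketch runs both patterns from the explanation above on the example string; the variable names are placeholders for this example.

```python
import re

s = "xyzcarbusabccar"

# Keywords first: the engine tries 'car' and 'bus' before the fallback.
keywords_first = re.findall(r"car|bus|[a-z]", s)

# Character class first: every position matches [a-z] immediately,
# so the keyword alternatives are never reached.
class_first = re.findall(r"[a-z]|car|bus", s)

print(keywords_first)
print(class_first)
```

With the character class first, the result is the same as list(s): every letter is reported on its own.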