I have a Python string in the format:
str = "name: srek age :24 description: blah blah"
Is there any way to convert it to a dictionary that looks like
{'name': 'srek', 'age': '24', 'description': 'blah blah'}
where each entry is a (key, value) pair taken from the string? I tried splitting the string into a list with str.split(), then manually removing the ':', checking each tag name, and adding it to a dictionary. The drawbacks of this method are that it is nasty, I have to manually remove the ':' for each pair, and if there is a multi-word 'value' in the string (for example, blah blah for description), each word becomes a separate entry in the list, which is not desirable. Is there any Pythonic way of getting the dictionary (using Python 2.7)?
Without re:
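r = "name: srek age :24 description: blah blah cat: dog stack:overflow"
lis = r.split(':')
dic = {}
try:
    # Walk the pieces from right to left; each piece is "<value> <next key>".
    for i, x in enumerate(reversed(lis)):
        i += 1
        slast = lis[-(i + 1)]
        slast = slast.split()
        # The last word of the previous piece is the key for the value x.
        dic[slast[-1]] = x
        # Keep only the value part so it is picked up as x on the next pass.
        lis[-(i + 1)] = " ".join(slast[:-1])
except IndexError:
    # Raised once we run off the left end of the list; nothing left to parse.
    pass
print(dic)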
{'age': '24', 'description': 'blah blah', 'stack': 'overflow', 'name': 'srek', 'cat': 'dog'}
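The same split-on-colon idea can also be sketched as a forward pass, which avoids mutating the list while iterating and the try/except. This is only a sketch: the function name parse_pairs_no_re is made up for illustration, and it assumes every key is a single word and no value contains a colon.

```python
def parse_pairs_no_re(text):
    """Parse 'key: value key2: value2' into a dict (hypothetical helper).

    Assumes each key is a single word and values contain no ':'.
    """
    parts = text.split(":")
    if len(parts) < 2:
        # No colon at all, so there are no key/value pairs to extract.
        return {}

    result = {}
    # The first piece holds only the first key.
    key = parts[0].strip()
    # Every middle piece looks like "<value of current key> <next key>".
    for chunk in parts[1:-1]:
        words = chunk.split()
        result[key] = " ".join(words[:-1])
        key = words[-1]
    # The last piece holds only the value of the final key.
    result[key] = parts[-1].strip()
    return result
```

Calling parse_pairs_no_re("name: srek age :24 description: blah blah") should give the dictionary asked for in the question. Note that runs of whitespace inside a value are collapsed to single spaces, and an empty middle piece (as in "a::b") would raise an IndexError.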
>>> r = "name: srek age :24 description: blah blah"
>>> import re
>>> regex = re.compile(r"\b(\w+)\s*:\s*([^:]*)(?=\s+\w+\s*:|$)")
>>> d = dict(regex.findall(r))
>>> d
{'age': '24', 'name': 'srek', 'description': 'blah blah'}
Explanation:
\b           # Start at a word boundary
(\w+)        # Match and capture a single word (1+ alphanumeric or underscore characters)
\s*:\s*      # Match a colon, optionally surrounded by whitespace
([^:]*)      # Match and capture any number of non-colon characters
(?=          # Make sure that we stop when the following can be matched:
 \s+\w+\s*:  #  the next dictionary key
|            #  or
 $           #  the end of the string
)            # End of lookahead
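The annotated pattern above can be kept in the code itself by compiling it with re.VERBOSE, which ignores whitespace and # comments inside the pattern. The following is a minimal sketch of that; the names PAIR_RE and parse_pairs are made up for illustration, and the regex is the same one used in the answer.

```python
import re

# Same regex as above, written in verbose mode so the comments live in the pattern.
PAIR_RE = re.compile(r"""
    \b(\w+)          # key: a single word, captured
    \s*:\s*          # colon, optionally surrounded by whitespace
    ([^:]*)          # value: any run of non-colon characters, captured
    (?=              # stop where one of the following can be matched:
        \s+\w+\s*:   #   the next key and its colon
      | $            #   or the end of the string
    )
""", re.VERBOSE)


def parse_pairs(text):
    """Return a dict built from the (key, value) tuples found in text (hypothetical helper)."""
    return dict(PAIR_RE.findall(text))
```

Because findall returns a list of (key, value) tuples when the pattern has two groups, it can be passed straight to dict(). This works the same way on Python 2.7 and Python 3.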