
Extract substrings in Python

I want to parse a string to extract all the substrings in curly braces:

'The value of x is {x}, and the list is {y} of len {}'

should produce:

(x, y)

Then I want to format the string to print the initial string with the values:

str.format('The value of x is {x}, and the list is {y} of len {}', x, y, len(y))

How can I do that?

Example usage:
def somefunc():
    x = 123
    y = ['a', 'b']
    MyFormat('The value of x is {x}, and the list is {y} of len {}',len(y))

output:
    The value of x is 123, and the list is ['a', 'b'] of len 2
Asked by Alex Gill, Jun 18 '15 at 11:06




1 Answer

You can use string.Formatter.parse:

Loop over the format_string and return an iterable of tuples (literal_text, field_name, format_spec, conversion). This is used by vformat() to break the string into either literal text, or replacement fields.

The values in the tuple conceptually represent a span of literal text followed by a single replacement field. If there is no literal text (which can happen if two replacement fields occur consecutively), then literal_text will be a zero-length string. If there is no replacement field, then the values of field_name, format_spec and conversion will be None.

from string import Formatter

s = 'The value of x is {x}, and the list is {y} of len {}'

print([t[1] for t in Formatter().parse(s) if t[1]])
['x', 'y']
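
To make the tuple structure from the documentation excerpt concrete, here is a small sketch that prints every tuple parse() yields for the same string. The name `fields` is only an illustration. Note that the anonymous {} comes back with an empty field_name, which is why the `if t[1]` filter above drops it.

```python
from string import Formatter

s = 'The value of x is {x}, and the list is {y} of len {}'

# Each item is (literal_text, field_name, format_spec, conversion)
fields = list(Formatter().parse(s))
for literal_text, field_name, format_spec, conversion in fields:
    print(repr(literal_text), repr(field_name), repr(format_spec), repr(conversion))

# 'The value of x is ' 'x' '' None
# ', and the list is ' 'y' '' None
# ' of len ' '' '' None
```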

I'm not sure how that helps with what you are trying to do, though, since you can simply pass x and y to str.format in your function, or unpack **locals():

def somefunc():
    x = 123
    y = ['a', 'b']
    print('The value of x is {x}, and the list is {y} of len {}'.format(len(y), **locals()))

If you wanted to print the named args you could add the Formatter output:

def somefunc():
    x = 123
    y = ['a', 'b']
    # s is the format string defined at module level above
    print("The named args are {}".format([t[1] for t in Formatter().parse(s) if t[1]]))
    print('The value of x is {x}, and the list is {y} of len {}'.format(len(y), **locals()))

Which would output:

The named args are ['x', 'y']
The value of x is 123, and the list is ['a', 'b'] of len 2
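
If you do want something close to the MyFormat call from the question, the two pieces can be combined. The following is only a sketch under some assumptions: `named_fields` and `my_format` are hypothetical helper names, the caller passes its variables explicitly with **locals(), and nested fields inside a format spec (such as {x:{width}}) are not handled.

```python
import re
from string import Formatter


def named_fields(template):
    """Return the distinct named fields of template, in order of appearance."""
    names = []
    for _, field_name, _, _ in Formatter().parse(template):
        if not field_name:
            continue  # None (no field) or '' (anonymous {})
        # Keep only the base name of fields such as 'y[0]' or 'x.attr'
        base = re.split(r'[.\[]', field_name, maxsplit=1)[0]
        # Skip explicit positional indexes such as {0}
        if base and not base.isdigit() and base not in names:
            names.append(base)
    return names


def my_format(template, *args, **namespace):
    """Format template, taking only the names it uses from namespace."""
    wanted = {name: namespace[name] for name in named_fields(template)}
    return template.format(*args, **wanted)


def somefunc():
    x = 123
    y = ['a', 'b']
    return my_format('The value of x is {x}, and the list is {y} of len {}',
                     len(y), **locals())


print(somefunc())
# The value of x is 123, and the list is ['a', 'b'] of len 2
```

Selecting only the names the template uses keeps unrelated local variables out of the format call, and a name missing from the namespace raises a KeyError instead of being silently ignored.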
Answered by Padraic Cunningham, Nov 15 '22 at 12:11