I have a list of thousands of elements of a form like the following:
pixels = ['(112, 37, 137, 255)', '(129, 39, 145, 255)', '(125, 036, 138, 255)' ...]
I am trying to convert these string elements to tuples using ast.literal_eval, but it breaks on values with leading zeros (e.g. in the third tuple string shown) with the error SyntaxError: invalid token.
pixels = [ast.literal_eval(pixel) for pixel in pixels]
What would be a good way to deal with things like this and get this list of strings evaluated as a list of tuples?
Method #1: Using map() + split() + tuple(). This task can be achieved by combining these functions: map() applies the logic to each string, split() breaks the contents of each string into the individual tuple elements, and tuple() builds the tuple from them.
On each iteration, we check if the string a is contained in the current tuple and return the result. The in operator tests for membership. For example, x in t evaluates to True if x is a member of t, otherwise it evaluates to False. x not in t returns the negation of x in t.
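As a small illustration of the membership test described above, here is a sketch that assumes the strings have already been converted to tuples of integers. The names `pixel_tuples`, `has_145` and `lacks_255` are made up for this example.

```python
# Hypothetical data: pixel tuples that have already been converted to integers.
pixel_tuples = [(112, 37, 137, 255), (129, 39, 145, 255), (125, 36, 138, 255)]

# `x in t` is True when x is an element of the tuple t.
has_145 = [145 in t for t in pixel_tuples]

# `x not in t` is the negation of `x in t`.
lacks_255 = [255 not in t for t in pixel_tuples]

print(has_145)    # [False, True, False]
print(lacks_255)  # [False, False, False]
```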
Method #1: Using map() + int + split() + tuple(). This method can be used to solve this particular task. Here we split each string into a list of its elements, convert each element with int(), and then convert that list to the resulting tuple.
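A minimal sketch of the split-and-convert idea applied to the pixel strings from the question. It relies on the fact that int() accepts surrounding whitespace and leading zeros in a decimal string, so '036' becomes 36 without any regex. The helper name `parse_pixel` is an assumption for illustration, and the sketch assumes every string has exactly the '(a, b, c, d)' shape.

```python
pixels = ['(112, 37, 137, 255)', '(129, 39, 145, 255)', '(125, 036, 138, 255)']

def parse_pixel(text):
    """Convert a string such as '(125, 036, 138, 255)' to a tuple of ints."""
    # Remove outer whitespace and the surrounding parentheses, then split on commas.
    parts = text.strip().strip('()').split(',')
    # int() ignores surrounding whitespace and leading zeros.
    return tuple(map(int, parts))

parsed = [parse_pixel(pixel) for pixel in pixels]
print(parsed)  # [(112, 37, 137, 255), (129, 39, 145, 255), (125, 36, 138, 255)]
```

Unlike ast.literal_eval, this only handles integers, which may be all that is needed for pixel data like this.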
In the majority of programming languages when you need to access a nested data type (such as arrays, lists, or tuples), you append the brackets to get to the innermost item. The first bracket gives you the location of the tuple in your list. The second bracket gives you the location of the item in the tuple.
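In Python, the double-bracket access described above might look like this, assuming the list already holds tuples rather than strings. The variable names are illustrative only.

```python
# Hypothetical data: the converted list of pixel tuples.
pixel_tuples = [(112, 37, 137, 255), (129, 39, 145, 255), (125, 36, 138, 255)]

# The first bracket selects a tuple from the list (indexes start at 0).
third_pixel = pixel_tuples[2]

# The second bracket selects an item inside that tuple.
second_value = pixel_tuples[2][1]

print(third_pixel)   # (125, 36, 138, 255)
print(second_value)  # 36
```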
Use the re module.
>>> import re
>>> import ast
>>> pixels = ['(112, 37, 137, 255)', '(129, 39, 145, 255)', '(125, 036, 138, 255)']
>>> [ast.literal_eval(re.sub(r'\b0+', '', pixel)) for pixel in pixels]
[(112, 37, 137, 255), (129, 39, 145, 255), (125, 36, 138, 255)]
re.sub(r'\b0+', '', pixel) removes the leading zeros. \b matches at the boundary between a word character and a non-word character (in either order), so the zeros are removed only where a number starts, that is, directly after a space or the ( symbol.
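To see what the substitution does on its own, here is a small sketch that prints the intermediate strings before they reach ast.literal_eval. The variable names are just for illustration. One caveat worth checking is that a value that is exactly 0 also starts at a word boundary, so this pattern strips it completely.

```python
import re

# Leading zeros are removed because they follow a word boundary.
fixed = re.sub(r'\b0+', '', '(125, 036, 138, 255)')

# Zeros inside or at the end of a number are preceded by a digit, so no \b there.
untouched = re.sub(r'\b0+', '', '(100, 20, 30, 255)')

# A lone 0 also begins at a word boundary, so it disappears entirely.
broken = re.sub(r'\b0+', '', '(0, 0, 0, 255)')

print(fixed)      # (125, 36, 138, 255)
print(untouched)  # (100, 20, 30, 255)
print(broken)     # (, , , 255)
```

The last string is no longer a valid tuple literal, so ast.literal_eval would raise a SyntaxError on it.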
Update:
>>> pixels = ['(0, 0, 0, 255)', '(129, 39, 145, 255)', '(125, 036, 138, 255)']
>>> [ast.literal_eval(re.sub(r'\b0+\B', '', pixel)) for pixel in pixels]
[(0, 0, 0, 255), (129, 39, 145, 255), (125, 36, 138, 255)]
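The trailing \B requires that the removed zeros are followed by another word character, so at least one digit of each number always survives. Here is a sketch that wraps this in a helper. The names `LEADING_ZEROS` and `to_tuple` are assumptions, not part of the answer above.

```python
import ast
import re

# Zeros at the start of a number that are followed by at least one more digit.
LEADING_ZEROS = re.compile(r'\b0+\B')

def to_tuple(text):
    """Strip leading zeros from each number, then evaluate the tuple literal."""
    return ast.literal_eval(LEADING_ZEROS.sub('', text))

samples = ['(0, 0, 0, 255)', '(125, 036, 138, 255)', '(007, 000, 100, 010)']
converted = [to_tuple(sample) for sample in samples]
print(converted)
# [(0, 0, 0, 255), (125, 36, 138, 255), (7, 0, 100, 10)]
```

Note that this pattern is only meant for integers. In a string containing floats such as '10.05', it would also remove the zero after the decimal point.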