I often end up with a piece of code that looks like this:
raw_data = [(s.split(',')[0], s.split(',')[1]) for s in all_lines if s.split(',')[1] != '"NaN"']
Basically, I'd like to know if there is a way to create a temporary variable like splitted_s
in order to avoid having to repeat operations on the looped object (like, in this case, having to split it three times).
You can't assign a variable in a comprehension, but you can use a nested generator expression, which does what I think you want (without a lambda function).
You can't do that. Before Python 3.8, assignment was always a statement in Python, and list comprehensions can only contain expressions.
Using Assignment Expressions in List Comprehensions. We can also use assignment expressions in list comprehensions. List comprehensions allow you to build lists succinctly by iterating over a sequence and potentially adding elements to the list that satisfy some condition.
Using a temporary variable: the temp variable is used to store the value of the first variable (temp = a). This allows you to overwrite the first variable with the second (a = b) and then assign the value of temp to the second variable (b = temp).
Temp Variables are created using a “DECLARE” statement and are assigned values using either a SET or SELECT command. After declaration, all variables are initialized as NULL, unless a value is provided as part of the declaration. This acts like a variable and exists for a specific batch of query execution.
If you have two actions for processing, you may embed another list comprehension:
raw_data = [(lhs, rhs)
            for lhs, rhs
            in [s.split(',')[:2] for s in all_lines]
            if rhs != '"NaN"']
You can use a generator expression inside instead of the inner list (it gives a small performance gain too); only the inner line changes:
            in (s.split(',')[:2] for s in all_lines)
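For reference, here is the generator variant written out in full as a minimal sketch. The sample contents of all_lines are made up for illustration; the comprehension itself is the one described above. Note that the tuple unpacking assumes every line contains at least one comma, otherwise it raises ValueError.

```python
# Hypothetical sample input; the real all_lines comes from the question.
all_lines = ['a,1.5', 'b,"NaN"', 'c,0.25']

# The inner generator splits each line once; the outer comprehension
# unpacks the first two fields and filters on the second.
raw_data = [(lhs, rhs)
            for lhs, rhs
            in (s.split(',')[:2] for s in all_lines)
            if rhs != '"NaN"']

print(raw_data)  # [('a', '1.5'), ('c', '0.25')]
```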
It will even be faster than your implementation:
import timeit
# Build 10,000 sample lines of the form 'x,0.123' or 'x,"NaN"'.
# string.ascii_letters is used because string.letters only exists in Python 2.
setup = '''import random, string
all_lines = [','.join((random.choice(string.ascii_letters),
                       str(random.random() if random.random() > 0.3 else '"NaN"')))
             for i in range(10000)]'''
# Original version: splits each line up to three times.
oneloop = '''[(s.split(',')[0], s.split(',')[1])
 for s in all_lines if s.split(',')[1] != '"NaN"']'''
# Nested version: splits each line exactly once.
twoloops = '''raw_data = [(lhs, rhs)
            for lhs, rhs
            in [s.split(',') for s in all_lines]
            if rhs != '"NaN"']'''
timeit.timeit(oneloop, setup, number=1000)   # 7.77 secs
timeit.timeit(twoloops, setup, number=1000)  # 4.68 secs
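The timings quoted above depend on the machine and the Python version, so it may be worth rerunning the comparison yourself. The following is only a sketch for Python 3: the function names one_loop and two_loops, the fixed seed, and the lower repeat count are my own choices, not part of the original answer. It also checks that both variants return the same list before timing them.

```python
import random
import string
import timeit

random.seed(0)  # fixed seed (an assumption of this sketch) so runs are comparable

# Same shape of data as in the answer: 'x,<float>' or 'x,"NaN"'.
all_lines = [','.join((random.choice(string.ascii_letters),
                       str(random.random()) if random.random() > 0.3 else '"NaN"'))
             for _ in range(10000)]

def one_loop():
    # Splits each line up to three times.
    return [(s.split(',')[0], s.split(',')[1])
            for s in all_lines if s.split(',')[1] != '"NaN"']

def two_loops():
    # Splits each line once, inside a nested generator expression.
    return [(lhs, rhs)
            for lhs, rhs in (s.split(',') for s in all_lines)
            if rhs != '"NaN"']

# Both variants should produce identical results.
assert one_loop() == two_loops()

print('one_loop :', timeit.timeit(one_loop, number=20))
print('two_loops:', timeit.timeit(two_loops, number=20))
```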
Starting with Python 3.8 and the introduction of assignment expressions (PEP 572, the := operator), it's possible to use a local variable within a list comprehension in order to avoid evaluating the same expression twice.
In our case, we can bind the result of line.split(',') to a variable parts while using that result to filter the list (keeping a line only if parts[1] is not equal to NaN), and then re-use parts to produce the mapped value:
# lines = ['1,2,3,4', '5,NaN,7,8']
[(parts[0], parts[1]) for line in lines if (parts := line.split(','))[1] != 'NaN']
# [('1', '2')]
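One detail to keep in mind with this approach: according to PEP 572, a name bound with := inside a comprehension is bound in the enclosing scope, unlike the loop variable. The sketch below (Python 3.8+; the variable name pairs is mine) illustrates this with the same sample data.

```python
lines = ['1,2,3,4', '5,NaN,7,8']

pairs = [(parts[0], parts[1])
         for line in lines
         if (parts := line.split(','))[1] != 'NaN']

print(pairs)  # [('1', '2')]

# parts is still visible here and holds the split of the last line processed.
print(parts)  # ['5', 'NaN', '7', '8']

# The loop variable line, by contrast, stays private to the comprehension.
print('line' in dir())  # False
```

If that leftover name is unwanted, the nested generator expression shown earlier in the thread avoids it.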
You can't.
A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result will be a new list resulting from evaluating the expression in the context of the for and if clauses which follow it.
(From the Python tutorial's section on list comprehensions.)
Plain assignment (=) in Python is a statement, not an expression.
As Padraic Cunningham comments, if you need to split it multiple times, don't do it in a list comprehension.
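Following that advice, a plain loop keeps the split in an ordinary local variable. This is just a sketch: the function name parse_lines and the sample input are made up for illustration.

```python
def parse_lines(all_lines):
    """Return (first, second) field pairs, skipping lines whose second field is '"NaN"'."""
    raw_data = []
    for s in all_lines:
        fields = s.split(',')  # split once and reuse the result
        if fields[1] != '"NaN"':
            raw_data.append((fields[0], fields[1]))
    return raw_data

# Hypothetical sample input.
result = parse_lines(['a,1.5', 'b,"NaN"', 'c,0.25'])
print(result)  # [('a', '1.5'), ('c', '0.25')]
```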