I am attempting to create a graph by querying values in a pandas DataFrame.
In this line:
data1 = [np.array(df.query('type == i')['continuous'])
         for i in ('Type1', 'Type2', 'Type3', 'Type4')]
I get the error:
UndefinedVariableError: name 'i' is not defined
What am I missing?
The i in your query expression
df.query('type == i')
is just the literal name i, not the value of your loop variable. Because there are no quotes around it inside the expression, pandas interprets it as the name of another column in your DataFrame, i.e. it looks for rows where
df['type'] == df['i']
Since there is no i column, you get an UndefinedVariableError.
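To make the name resolution concrete, here is a minimal sketch on a made-up DataFrame (the column named other and the sample values are assumptions for illustration, not taken from the question). A bare name that matches a column is compared column against column; a bare name that matches nothing raises the error from the question, which is a subclass of NameError:

```python
import pandas as pd

# Hypothetical sample data; only 'type' and 'continuous' come from the question.
df = pd.DataFrame({
    "type": ["Type1", "Type2", "Type1"],
    "other": ["Type1", "Type1", "Type2"],
    "continuous": [0.1, 0.2, 0.3],
})

# 'other' is a column, so this compares the two columns row by row.
same = df.query("type == other")

# 'i' is not a column, so pandas cannot resolve the name.
try:
    df.query("type == i")
    raised = False
except NameError as exc:  # UndefinedVariableError derives from NameError
    raised = True
    message = str(exc)
```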
It looks like you intended to query where the values in the type column are equal to the string variable named i, i.e. where
df['type'] == 'Type1'
df['type'] == 'Type2' # etc.
In this case you need to insert the value of i into the query expression:
df.query('type == "%s"' % i)
The extra set of quotes is necessary if 'Type1', 'Type2' etc. are values within the type column, but not if they are the names of other columns in the DataFrame.
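For reference, here is a small runnable sketch of two ways to get the value into the query, using made-up sample data and hypothetical helper names (select_by_format, select_by_at). The first interpolates the value into the expression, as above. The second uses the @ prefix, which pandas query supports for referring to a Python variable in the calling scope. It is wrapped in a plain function here so the variable is an ordinary local.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the question's DataFrame.
df = pd.DataFrame({
    "type": ["Type1", "Type2", "Type1", "Type3"],
    "continuous": [0.1, 0.2, 0.3, 0.4],
})
types = ("Type1", "Type2", "Type3", "Type4")


def select_by_format(frame, type_name):
    # Interpolate the value, wrapped in quotes, into the expression string.
    return np.array(frame.query('type == "%s"' % type_name)["continuous"])


def select_by_at(frame, type_name):
    # '@type_name' tells query to look up the local Python variable.
    return np.array(frame.query("type == @type_name")["continuous"])


data_fmt = [select_by_format(df, t) for t in types]
data_at = [select_by_at(df, t) for t in types]
```

Both lists hold one array per type, with an empty array for a type that has no rows. The @ form also avoids quoting problems if a value itself contains a double quote.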
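I know this is late, but maybe it helps somebody: put double quotes around the interpolated value of i. Quoting the bare letter, as in 'type == "i"', would only match rows whose type is the literal string "i".
data1 = [np.array(df.query('type == "%s"' % i)['continuous'])
         for i in ('Type1', 'Type2', 'Type3', 'Type4')]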