
Pandas Query with Variable as Column Name

Tags: python, pandas

I'm passing on a new application of a technique I learned from another question: "Unable to query a local variable in pandas 0.14.0".

Credit and thanks to user @choldgraf. I'm applying his answer from the above link differently.

Objective: To use a variable as the column name in a query

Failed examples:

import pandas as pd
fooframe = pd.DataFrame({'Size':['Large', 'Medium', 'Small', 'Tiny'], 'Color':[1, 2, 3, 4]})
myvar = 'Size'
subframe = fooframe.query("myvar == 'Large'")

The code above returns a key error for 'myvar': query treats the bare name myvar as a column label, and the frame has no column with that name.

import pandas as pd
fooframe = pd.DataFrame({'Size':['Large', 'Medium', 'Small', 'Tiny'], 'Color':[1, 2, 3, 4]})
myvar = 'Size'
subframe = fooframe.query("@myvar == 'Large'")

The code above adds "@" before myvar in the query to reference myvar as a local variable. However, "@myvar" stands for the variable's value (the string 'Size'), not for the column of that name, so the code still returns an error.
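For contrast, here is a minimal sketch of where "@" does help: on the value side of the comparison. The variable name wanted is hypothetical and not part of the original examples; the last lines only illustrate, in plain Python, what the failed query ends up comparing.

```python
import pandas as pd

fooframe = pd.DataFrame({'Size': ['Large', 'Medium', 'Small', 'Tiny'],
                         'Color': [1, 2, 3, 4]})

# '@' pulls in the *value* of a local variable, so it suits the
# right-hand side of the comparison.
wanted = 'Large'  # hypothetical variable name, for illustration only
by_value = fooframe.query("Size == @wanted")

# With myvar = 'Size', the expression "@myvar == 'Large'" compares the
# string 'Size' with the string 'Large'. In plain Python that is a single
# False, not a per-row mask over the Size column.
myvar = 'Size'
scalar_comparison = (myvar == 'Large')
```

Here by_value holds only the 'Large' row, while scalar_comparison is a single boolean that cannot select rows.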

asked Apr 11 '18 17:04 by TempleGuard527




1 Answer

Credit and thanks to user @choldgraf. I used the technique he mentioned in another post (Unable to query a local variable in pandas 0.14.0) not for the value in the column but for the column name.

A variable can be used as the column name in a pandas query by inserting it into the query string like so:

import pandas as pd
fooframe = pd.DataFrame({'Size':['Large', 'Medium', 'Small', 'Tiny'], 'Color':[1, 2, 3, 4]})
myvar = 'Size'
subframe = fooframe.query("`{0}` == 'Large'".format(myvar))

(The backticks bracket the column name, so that column names containing spaces or special characters still work.)
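The same idea can be sketched in two other ways. This assumes a pandas version recent enough to support backtick quoting in query, and the variable wanted is a hypothetical name introduced only for illustration. The first variant uses an f-string for the column name and "@" for the value; the second skips the query string entirely and uses boolean indexing, which may be simpler when the column name is already held in a variable.

```python
import pandas as pd

fooframe = pd.DataFrame({'Size': ['Large', 'Medium', 'Small', 'Tiny'],
                         'Color': [1, 2, 3, 4]})
myvar = 'Size'
wanted = 'Large'  # hypothetical variable name, for illustration only

# Variant 1: the column name is formatted into the query string,
# and the value is referenced as a local variable with '@'.
via_query = fooframe.query(f"`{myvar}` == @wanted")

# Variant 2: plain boolean indexing; the column is looked up with
# fooframe[myvar], so no query string is needed.
via_mask = fooframe[fooframe[myvar] == wanted]
```

Both variants should select the same single row as the .format() version above. Note that formatting a name into the query string is only appropriate when myvar comes from a trusted source.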

answered Sep 22 '22 00:09 by TempleGuard527