Pandas DataFrame.query method syntax

I would like to gain a better understanding of the Pandas DataFrame.query method and what the following expression represents:

match = dfDays.query('index > @x.name & price >= @x.target')

What does @x.name represent?

I understand what the resulting output is for this code (a new column with pandas.tslib.Timestamp data) but don't have a clear understanding of the expression used to get this end result.


The code comes from the question "Vectorised way to query date and price data":

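import numpy as np
import pandas as pd

# Daily prices from 1 Jan 2000 through 31 Jul 2000
rng = pd.date_range('1/1/2000', '2000-07-31', freq='D')
weeks = np.random.uniform(low=1.03, high=3, size=(len(rng),))
ts2 = pd.Series(weeks, index=rng)
dfDays = pd.DataFrame({'price': ts2})
# First price in each weekly bin (anchored on Monday), plus a target price
dfWeeks = dfDays.resample('1W-Mon').first()
dfWeeks['target'] = (dfWeeks['price'] + .5).round(2)

def find_match(x):
    # x is one row of dfWeeks; x.name is that row's index label
    match = dfDays.query('index > @x.name & price >= @x.target')
    if not match.empty:
        return match.index[0]

dfWeeks.assign(target_hit=dfWeeks.apply(find_match, axis=1))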
1 Answer

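@x.name: the @ prefix tells .query() that x is a variable from the surrounding Python scope rather than a column of the DataFrame on which query() was called. In this case x is one row of dfWeeks, which apply(find_match, 1) passes to find_match as a Series. So @x.name is that row's index label (the week's timestamp) and @x.target is the row's value in the target column. The external object could also be a DataFrame, as with d2 below, or a scalar value.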
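To check what x actually is inside the applied function, a minimal sketch along these lines may help. The toy frame `weeks_demo` and the helper `inspect_row` are made up for illustration and only stand in for dfWeeks and find_match:

```python
import pandas as pd

# Hypothetical stand-in for dfWeeks: two weekly rows with a price and a target
weeks_demo = pd.DataFrame(
    {"price": [1.5, 2.0], "target": [2.0, 2.5]},
    index=pd.to_datetime(["2000-01-03", "2000-01-10"]),
)

seen = []

def inspect_row(x):
    # Record the type of x, its index label, and its 'target' value
    seen.append((type(x).__name__, x.name, x.target))
    return x.name

# axis=1 passes each row to the function as a Series
result = weeks_demo.apply(inspect_row, axis=1)
print(seen)
```

Each entry in `seen` should show that x is a Series, that x.name is the row's timestamp, and that x.target is a plain number. These are exactly the values that @x.name and @x.target hand to the query string.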

I hope this small demonstration will help you to understand it:

In [79]: d1
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9

In [80]: d2
   a   x
0  1  10
1  7  11

In [81]: d1.query("a in @d2.a")
   a  b  c
0  1  2  3
2  7  8  9

In [82]: d1.query("c < @d2.a")
   a  b  c
1  4  5  6

Scalar x:

In [83]: x = 9

In [84]: d1.query("c == @x")
   a  b  c
2  7  8  9
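
It may also help to compare the query string from the question with ordinary boolean indexing. The sketch below uses made-up data (`days_demo` and a hand-built row `x`), so treat the names and values as assumptions rather than the original data:

```python
import pandas as pd

# Hypothetical stand-in for dfDays: four daily prices
days_demo = pd.DataFrame(
    {"price": [1.0, 2.6, 1.2, 3.0]},
    index=pd.date_range("2000-01-01", periods=4, freq="D"),
)

# Hypothetical stand-in for one row of dfWeeks, as apply(..., axis=1) would pass it
x = pd.Series({"price": 2.0, "target": 2.5}, name=pd.Timestamp("2000-01-02"))

# Query form: @ pulls x from the surrounding scope
via_query = days_demo.query("index > @x.name & price >= @x.target")

# Boolean-mask form: parentheses are required because & binds tighter
# than comparisons in plain Python
via_mask = days_demo[(days_demo.index > x.name) & (days_demo["price"] >= x["target"])]

print(via_query)
```

Both forms should select the same rows: those dated after x.name whose price is at least x.target. Inside a query string, comparisons bind tighter than & and |, which is why the original expression works without parentheses.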
