
Vectorized look-up of values in Pandas dataframe

I have two pandas dataframes, one called 'orders' and another called 'daily_prices'. daily_prices is as follows:

              AAPL    GOOG     IBM    XOM
2011-01-10  339.44  614.21  142.78  71.57
2011-01-13  342.64  616.69  143.92  73.08
2011-01-26  340.82  616.50  155.74  75.89
2011-02-02  341.29  612.00  157.93  79.46
2011-02-10  351.42  616.44  159.32  79.68
2011-03-03  356.40  609.56  158.73  82.19
2011-05-03  345.14  533.89  167.84  82.00
2011-06-03  340.42  523.08  160.97  78.19
2011-06-10  323.03  509.51  159.14  76.84
2011-08-01  393.26  606.77  176.28  76.67
2011-12-20  392.46  630.37  184.14  79.97

orders is as follows:

           direction  size ticker  prices
2011-01-10       Buy  1500   AAPL  339.44
2011-01-13      Sell  1500   AAPL  342.64
2011-01-13       Buy  4000    IBM  143.92
2011-01-26       Buy  1000   GOOG  616.50
2011-02-02      Sell  4000    XOM   79.46
2011-02-10       Buy  4000    XOM   79.68
2011-03-03      Sell  1000   GOOG  609.56
2011-03-03      Sell  2200    IBM  158.73
2011-06-03      Sell  3300    IBM  160.97
2011-05-03       Buy  1500    IBM  167.84
2011-06-10       Buy  1200   AAPL  323.03
2011-08-01       Buy    55   GOOG  606.77
2011-08-01      Sell    55   GOOG  606.77
2011-12-20      Sell  1200   AAPL  392.46

The index of both dataframes is datetime.date. The 'prices' column in the 'orders' dataframe was added with a list comprehension that loops through all the orders, looks up the specific ticker for the specific date in the 'daily_prices' dataframe, and then adds the resulting list as a column to 'orders'. I would like to do this with an array operation rather than a loop. Can it be done? I tried to use:

daily_prices.ix[dates,tickers]

But this returns a matrix containing the Cartesian product of the two lists. I want it to return a column vector containing only the price of the specified ticker on the specified date.
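To make the difference concrete, here is a minimal sketch on a two-row, two-ticker subset of the data above. It assumes a recent pandas, where `.ix` is no longer available, so `.loc` stands in for it; the names `looped` and `block` are only illustrative.

```python
import datetime as dt

import pandas as pd

# Small stand-in for daily_prices (values copied from the table above).
daily_prices = pd.DataFrame(
    {"AAPL": [339.44, 342.64], "IBM": [142.78, 143.92]},
    index=[dt.date(2011, 1, 10), dt.date(2011, 1, 13)],
)

# One (date, ticker) pair per order.
dates = [dt.date(2011, 1, 10), dt.date(2011, 1, 13), dt.date(2011, 1, 13)]
tickers = ["AAPL", "AAPL", "IBM"]

# Loop-based lookup: one scalar per (date, ticker) pair.
looped = [daily_prices.at[d, t] for d, t in zip(dates, tickers)]

# Label-based selection with two lists returns every date x ticker
# combination, so 3 dates and 3 tickers give a 3 x 3 block.
block = daily_prices.loc[dates, tickers]

print(looped)
print(block.shape)
```

The loop yields three prices, one per order, while the two-list selection yields a 3 x 3 frame. The desired result is the first of these, computed without the Python-level loop.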

luckyfool asked Dec 15 '12 14:12





1 Answer

Use our friend lookup, designed precisely for this purpose:

In [17]: prices
Out[17]:
              AAPL    GOOG     IBM    XOM
2011-01-10  339.44  614.21  142.78  71.57
2011-01-13  342.64  616.69  143.92  73.08
2011-01-26  340.82  616.50  155.74  75.89
2011-02-02  341.29  612.00  157.93  79.46
2011-02-10  351.42  616.44  159.32  79.68
2011-03-03  356.40  609.56  158.73  82.19
2011-05-03  345.14  533.89  167.84  82.00
2011-06-03  340.42  523.08  160.97  78.19
2011-06-10  323.03  509.51  159.14  76.84
2011-08-01  393.26  606.77  176.28  76.67
2011-12-20  392.46  630.37  184.14  79.97

In [18]: orders
Out[18]:
                  Date direction  size ticker  prices
0  2011-01-10 00:00:00       Buy  1500   AAPL  339.44
1  2011-01-13 00:00:00      Sell  1500   AAPL  342.64
2  2011-01-13 00:00:00       Buy  4000    IBM  143.92
3  2011-01-26 00:00:00       Buy  1000   GOOG  616.50
4  2011-02-02 00:00:00      Sell  4000    XOM   79.46
5  2011-02-10 00:00:00       Buy  4000    XOM   79.68
6  2011-03-03 00:00:00      Sell  1000   GOOG  609.56
7  2011-03-03 00:00:00      Sell  2200    IBM  158.73
8  2011-06-03 00:00:00      Sell  3300    IBM  160.97
9  2011-05-03 00:00:00       Buy  1500    IBM  167.84
10 2011-06-10 00:00:00       Buy  1200   AAPL  323.03
11 2011-08-01 00:00:00       Buy    55   GOOG  606.77
12 2011-08-01 00:00:00      Sell    55   GOOG  606.77
13 2011-12-20 00:00:00      Sell  1200   AAPL  392.46

In [19]: prices.lookup(orders.Date, orders.ticker)
Out[19]:
array([ 339.44,  342.64,  143.92,  616.5 ,   79.46,   79.68,  609.56,
        158.73,  160.97,  167.84,  323.03,  606.77,  606.77,  392.46])
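Note for readers on newer pandas: `DataFrame.lookup` was deprecated in pandas 1.2 and removed in 2.0, so the call above only works on older versions. One way to get the same pairwise result is to translate the labels into integer positions and index the underlying NumPy array. The sketch below is an assumption-laden substitute, not the method from the answer: `pairwise_lookup` is a hypothetical helper name, the data is a three-row subset of the tables above, and it assumes the index and columns of `prices` are unique.

```python
import pandas as pd

# Three-row subset of the prices table, indexed by timestamp.
prices = pd.DataFrame(
    {
        "AAPL": [339.44, 342.64, 340.82],
        "GOOG": [614.21, 616.69, 616.50],
        "IBM": [142.78, 143.92, 155.74],
    },
    index=pd.to_datetime(["2011-01-10", "2011-01-13", "2011-01-26"]),
)

# First four orders, with the date held in a regular column.
orders = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2011-01-10", "2011-01-13", "2011-01-13", "2011-01-26"]
        ),
        "ticker": ["AAPL", "AAPL", "IBM", "GOOG"],
    }
)


def pairwise_lookup(frame, row_labels, col_labels):
    """Hypothetical helper: return frame[row_i, col_i] for each label pair."""
    # get_indexer maps labels to integer positions; -1 marks a missing label.
    rows = frame.index.get_indexer(row_labels)
    cols = frame.columns.get_indexer(col_labels)
    if (rows < 0).any() or (cols < 0).any():
        raise KeyError("one or more labels were not found in the frame")
    # Two equal-length integer arrays select elements pairwise in NumPy,
    # not as a Cartesian product.
    return frame.to_numpy()[rows, cols]


looked_up = pairwise_lookup(prices, orders["Date"], orders["ticker"])
print(looked_up)
```

The result can be assigned directly, for example `orders["prices"] = looked_up`. The explicit check for `-1` matters because NumPy would otherwise treat it as "last row" or "last column" and silently return a wrong price.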
Wes McKinney answered Oct 10 '22 03:10
