
Select columns in a DataFrame conditional on row

I am trying to build a Series (or DataFrame) from another DataFrame by taking, for each row, the value from a column named by a second Series. In the simplified example below, I want the frame1 values from 'a' for the first three rows and from 'b' for the final two, as specified by the picked_values Series.

import numpy as np
import pandas as pd
frame1 = pd.DataFrame(np.random.randn(10).reshape(5, 2), index=range(5), columns=['a', 'b'])
picked_values = pd.Series(['a', 'a', 'a', 'b', 'b'])

frame1:

    a           b
0   0.283519    1.462209
1   -0.352342   1.254098
2   0.731701    0.236017
3   0.022217    -1.469342
4   0.386000    -0.706614

Trying to get to the series:

0   0.283519
1   -0.352342
2   0.731701
3   -1.469342
4   -0.706614

I was hoping frame1[picked_values] would work, but that selects whole columns and ends up with five columns instead of one value per row.
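To make the five-column result concrete, here is a small sketch. It assumes the expression in question was frame1[picked_values] and hard-codes the frame1 values printed above so the output is reproducible; the name wide is only for illustration.

```python
import pandas as pd

# frame1 values copied from the printed frame in the question
frame1 = pd.DataFrame(
    {
        "a": [0.283519, -0.352342, 0.731701, 0.022217, 0.386000],
        "b": [1.462209, 1.254098, 0.236017, -1.469342, -0.706614],
    }
)
picked_values = pd.Series(["a", "a", "a", "b", "b"])

# Indexing with a list-like of labels selects one whole column per label,
# so the result has a column for every entry of picked_values.
wide = frame1[picked_values]
print(wide.shape)
print(list(wide.columns))
```

Each row of wide still holds values from both 'a' and 'b'; nothing ties the i-th label to the i-th row, which is why a row-wise lookup is needed.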

In the real-life example, picked_values is a lot larger and calculated.

Thank you for your time.

TheSuperbard asked Jan 24 '20 14:01



2 Answers

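Use DataFrame.lookup (note: it was deprecated in pandas 1.2.0 and removed in 2.0.0, so this only works on older versions):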

pd.Series(frame1.lookup(picked_values.index,picked_values))

0    0.283519
1   -0.352342
2    0.731701
3   -1.469342
4   -0.706614
dtype: float64
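On pandas versions without DataFrame.lookup, one possible replacement is the factorize-plus-NumPy-indexing pattern. The sketch below is not part of the original answer; it assumes picked_values lines up positionally with the rows of frame1, and it hard-codes the question's printed values so the result can be checked. The names codes, labels, and result are illustrative.

```python
import numpy as np
import pandas as pd

# frame1 values copied from the printed frame in the question
frame1 = pd.DataFrame(
    {
        "a": [0.283519, -0.352342, 0.731701, 0.022217, 0.386000],
        "b": [1.462209, 1.254098, 0.236017, -1.469342, -0.706614],
    }
)
picked_values = pd.Series(["a", "a", "a", "b", "b"])

# factorize maps each label to an integer code and returns the unique labels
codes, labels = pd.factorize(picked_values)

# Reorder the columns to match the unique labels, then pick one cell per row
values = frame1.reindex(columns=labels).to_numpy()[np.arange(len(frame1)), codes]
result = pd.Series(values, index=frame1.index)
print(result)
```

With the question's data this picks rows 0 to 2 from 'a' and rows 3 to 4 from 'b', matching the series the question asks for.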
anky answered Sep 19 '22 01:09


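Here's a NumPy-based approach using integer indexing and Index.searchsorted (this relies on the column labels being sorted, as they are here, and on frame1 having the default 0..n-1 index):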

frame1.values[frame1.index, frame1.columns.searchsorted(picked_values.values)]
# array([0.22095278, 0.86200616, 1.88047197, 0.49816937, 0.10962954])
# (output from a different random frame1 than the one printed in the question)
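If the column labels are not sorted, searchsorted can return the wrong positions. A variant that may be safer is sketched below using Index.get_indexer, which looks labels up directly. This is an alternative to the answer's code, not the answerer's method; it assumes unique column labels, and the deliberately unsorted column order and the names col_pos and picked are only for illustration.

```python
import numpy as np
import pandas as pd

# Same values as the question's frame1, but with columns in unsorted order
frame1 = pd.DataFrame(
    {
        "b": [1.462209, 1.254098, 0.236017, -1.469342, -0.706614],
        "a": [0.283519, -0.352342, 0.731701, 0.022217, 0.386000],
    }
)
picked_values = pd.Series(["a", "a", "a", "b", "b"])

# get_indexer returns the integer position of each label, or -1 if missing
col_pos = frame1.columns.get_indexer(picked_values.to_numpy())
if (col_pos == -1).any():
    raise KeyError("picked_values contains labels that are not columns of frame1")

# Positional row numbers avoid depending on frame1's index labels
picked = frame1.to_numpy()[np.arange(len(frame1)), col_pos]
print(picked)
```

Using np.arange(len(frame1)) for the rows also means the frame does not need a default integer index.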
yatu answered Sep 18 '22 01:09