
How do I label data based on the values of the previous row?

Tags: python, pandas

I want to label the data "1" if the current value is higher than that of the previous row and "0" otherwise.

Let's say I have this DataFrame:

df = pd.DataFrame({'date': [1,2,3,4,5], 'price': [50.125, 45.25, 65.857, 100.956, 77.4152]})

and I want the output as if the DataFrame is constructed like this:

df = pd.DataFrame({'date': [1,2,3,4,5], 'price': [50.125, 45.25, 65.857, 100.956, 77.4152], 'label':[0, 0, 1, 1, 0]})

*I don't know how to post a DataFrame

This code is my attempt:

df['label'] = 0
i = 0
for price in df['price']:
    i = i+1
    if price in i > price: #---> right now I am stuck here. It says: argument of type 'int' is not iterable
        df.append['label', 1]
    elif price in i <= price:
        df.append['label', 0]

I think there are also other logical mistakes in my code. What am I doing wrong?

SmilingSwordman asked Jan 01 '23 01:01


1 Answer

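Create a boolean mask by comparing each price with the previous row, using Series.gt (>) with Series.shift, then map True/False to 1/0 with Series.astype:

df['label'] = df['price'].gt(df['price'].shift()).astype(int)

The first row has no previous value, so Series.shift produces NaN there; the comparison is False and the label is 0. Use Series.ge (>=) instead if equal values should also be labeled 1.

Series.view can do the same conversion, but it is deprecated in recent pandas releases, so prefer Series.astype:

df['label'] = df['price'].gt(df['price'].shift()).view('i1')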
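For reference, here is a self-contained sketch that reproduces the expected labels from the question. It assumes strict "higher than" semantics (Series.gt). The explicit loop is included only to show what the original attempt was aiming for; the names labels and prev are illustrative and not from the question.

```python
import pandas as pd

df = pd.DataFrame({'date': [1, 2, 3, 4, 5],
                   'price': [50.125, 45.25, 65.857, 100.956, 77.4152]})

# Vectorized: compare each price with the previous row's price.
# shift() puts NaN in the first row, and comparisons with NaN are False -> 0.
df['label'] = df['price'].gt(df['price'].shift()).astype(int)

# Equivalent explicit loop (slower, for illustration only).
labels = []
prev = None  # the first row has no previous price
for price in df['price']:
    labels.append(1 if prev is not None and price > prev else 0)
    prev = price

print(df['label'].tolist())  # [0, 0, 1, 1, 0]
print(labels)                # [0, 0, 1, 1, 0]
```

As for the error in the question: Python reads `price in i > price` as a chained comparison, `(price in i) and (i > price)`, and `in` cannot test membership in an int, hence the TypeError. `df.append['label', 1]` would also fail, because it subscripts a method instead of calling it.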
jezrael answered Jan 02 '23 13:01