Python Pandas iterrows() with previous values

I have a pandas DataFrame in the form:

            A           B       K      S
2012-03-31  NaN         NaN     NaN    10
2012-04-30  62.74449    15.2    71.64   0
2012-05-31  2029.487    168.8   71.64   0
2012-06-30  170.7191    30.4    71.64   0

I am trying to create a function that replaces df['S'] using the value of df['S'][index-1].

For example:

for index,row in df.iterrows:
     if index = 1: 
         pass
     else:
         df['S'] = min(df['A'] + df['S'][index-1]?? - df['B'], df['K'])

But I don't know how to get df['S'][index - 1].

Stavros Anastasiadis asked Aug 24 '14 15:08




1 Answer

It looks like your initial attempt is pretty close.

The following should work:

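for index, row in df.iterrows():
    # Skip the first row, which has no previous row to read from.
    if int(index) != 0:
        # Index labels are assumed to be strings such as '0', '1', '2'.
        df.loc[index, 'S'] = df.loc[str(int(index) - 1), 'S']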

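Essentially, for every row except the first (index 0), this sets the value in the 'S' column to the value from the row before it. Note: this assumes a DataFrame whose index is sequential and ordered, with integer-like labels. It will not work as written with the date index shown in the question, because int() cannot convert those labels.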

The row that iterrows() yields is generally a copy, so assigning to it does not change the DataFrame. Hence you need to use df.loc[...] to address the cell in the DataFrame and then change its value.
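To illustrate the point about row versus df.loc, here is a small sketch. The two-row frame and the value 99 are made up for the demonstration and are not taken from the question; the frame deliberately mixes dtypes so that each yielded row is a fresh Series.

```python
import pandas as pd

# Hypothetical two-row frame with mixed dtypes, used only for illustration.
df = pd.DataFrame({"name": ["a", "b"], "S": [10, 0]})

# Assigning to the yielded row only changes a temporary Series.
for index, row in df.iterrows():
    row["S"] = 99

unchanged = df["S"].tolist()  # the DataFrame still holds the original values

# Writing through df.loc addresses the cell in the DataFrame itself.
for index, row in df.iterrows():
    df.loc[index, "S"] = 99

changed = df["S"].tolist()  # every value in 'S' is now 99
```

After the first loop the 'S' column is untouched; only the second loop, which goes through df.loc, modifies the DataFrame.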

Also worth noting that index is not an integer here, hence the use of the int() function to subtract 1. The result is wrapped in str() so that the final index label is a string, as expected.
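If the index holds dates, as in the question, one option is to loop by position instead of by label. The following is only a sketch: the function name fill_s is hypothetical, and it assumes the intended rule is S[i] = min(A[i] + S[i-1] - B[i], K[i]), where S[i-1] is the already-updated value from the previous row.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "A": [np.nan, 62.74449, 2029.487, 170.7191],
        "B": [np.nan, 15.2, 168.8, 30.4],
        "K": [np.nan, 71.64, 71.64, 71.64],
        "S": [10.0, 0.0, 0.0, 0.0],
    },
    index=pd.to_datetime(["2012-03-31", "2012-04-30", "2012-05-31", "2012-06-30"]),
)


def fill_s(frame):
    """Return a copy of frame with S[i] = min(A[i] + S[i-1] - B[i], K[i]) for i >= 1."""
    out = frame.copy()
    # Work on plain arrays so that rows are addressed by position, not by label.
    s = out["S"].to_numpy(dtype=float).copy()
    a = out["A"].to_numpy(dtype=float)
    b = out["B"].to_numpy(dtype=float)
    k = out["K"].to_numpy(dtype=float)
    # Start at 1: the first row keeps its initial S because it has no previous row.
    for i in range(1, len(out)):
        s[i] = min(a[i] + s[i - 1] - b[i], k[i])
    out["S"] = s
    return out


result = fill_s(df)
```

For the sample data, the 'S' column of result comes out as roughly 10, 57.54, 71.64, 71.64. A plain loop is used because each value depends on the one computed just before it, so the rule cannot be expressed as a single shift of the column.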

Fab Dot answered Nov 05 '22 18:11