Select next N rows in pandas dataframe using iterrows

Tags: python, pandas

I need to select N rows at a time from a pandas DataFrame using iterrows. Something like this:

def func():
    selected = []
    for i in range(N):
        selected.append(next(dataframe.iterrows()))

    yield selected

But with this code, selected contains N identical elements, and every time I call func I get the same result (the first row of the dataframe).

If the dataframe is:

   A  B  C
0  5  8  2
1  1  2  3
2  4  5  6
3  7  8  9
4  0  1  2
5  3  4  5
6  7  8  6
7  1  2  3

What I want to obtain is:

N = 3
selected = [ [5,8,2], [1,2,3], [4,5,6] ] 
then, calling the function again,
selected = [ [7,8,9], [0,1,2], [3,4,5] ] 
then,
selected = [ [7,8,6], [1,2,3], [5,8,2] ] 

solopiu asked Mar 07 '26 17:03

1 Answer

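Each call to dataframe.iterrows() creates a new iterator, so next() on it always returns the first row. There is no need for .iterrows() here; use slicing instead: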

import pandas as pd

def flow_from_df(dataframe: pd.DataFrame, chunk_size: int = 10):
    # Yield consecutive slices of up to chunk_size rows; the last one may be shorter.
    for start_row in range(0, dataframe.shape[0], chunk_size):
        end_row = min(start_row + chunk_size, dataframe.shape[0])
        yield dataframe.iloc[start_row:end_row, :]

To use it:

get_chunk = flow_from_df(dataframe)
chunk1 = next(get_chunk)
chunk2 = next(get_chunk)
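
As a worked example, here is a minimal sketch that runs the generator on the dataframe from the question with chunk_size=3. The name df and the conversion to plain lists with .values.tolist() are assumptions made for illustration; they are not part of the original answer.

```python
import pandas as pd

# The example frame from the question.
df = pd.DataFrame(
    [[5, 8, 2], [1, 2, 3], [4, 5, 6], [7, 8, 9],
     [0, 1, 2], [3, 4, 5], [7, 8, 6], [1, 2, 3]],
    columns=["A", "B", "C"],
)

def flow_from_df(dataframe: pd.DataFrame, chunk_size: int = 10):
    # Same generator as above, repeated so this block runs on its own.
    for start_row in range(0, dataframe.shape[0], chunk_size):
        end_row = min(start_row + chunk_size, dataframe.shape[0])
        yield dataframe.iloc[start_row:end_row, :]

# Collect every chunk as a plain list of rows.
chunks = [chunk.values.tolist() for chunk in flow_from_df(df, chunk_size=3)]
for rows in chunks:
    print(rows)
# [[5, 8, 2], [1, 2, 3], [4, 5, 6]]
# [[7, 8, 9], [0, 1, 2], [3, 4, 5]]
# [[7, 8, 6], [1, 2, 3]]
```

Note that the last chunk holds only the two remaining rows: the generator does not wrap around to the start, and calling next() on it after that raises StopIteration.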

Or, without a generator, have the caller pass the starting row:

def get_chunk(dataframe: pd.DataFrame, chunk_size: int, start_row: int = 0) -> pd.DataFrame:
    end_row  = min(start_row + chunk_size, dataframe.shape[0])

    return dataframe.iloc[start_row:end_row, :]
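
The third call in the question wraps around to the first row, which plain slicing does not do. One possible way to get that behaviour is sketched below, assuming a hypothetical helper get_chunk_wrapped (not part of the original answer) that selects row positions modulo the number of rows, with the caller tracking the starting row between calls.

```python
import pandas as pd

# The example frame from the question.
df = pd.DataFrame(
    [[5, 8, 2], [1, 2, 3], [4, 5, 6], [7, 8, 9],
     [0, 1, 2], [3, 4, 5], [7, 8, 6], [1, 2, 3]],
    columns=["A", "B", "C"],
)

def get_chunk_wrapped(dataframe: pd.DataFrame, chunk_size: int,
                      start_row: int = 0) -> pd.DataFrame:
    # Hypothetical helper: like get_chunk, but wraps past the last row.
    n_rows = len(dataframe)
    if n_rows == 0:
        # Nothing to select; avoids a modulo-by-zero below.
        return dataframe.iloc[0:0]
    positions = [(start_row + i) % n_rows for i in range(chunk_size)]
    return dataframe.iloc[positions]

N = 3
start = 0
results = []
for _ in range(3):
    chunk = get_chunk_wrapped(df, N, start)
    results.append(chunk.values.tolist())
    start = (start + N) % len(df)  # the caller remembers where to resume

for rows in results:
    print(rows)
# [[5, 8, 2], [1, 2, 3], [4, 5, 6]]
# [[7, 8, 9], [0, 1, 2], [3, 4, 5]]
# [[7, 8, 6], [1, 2, 3], [5, 8, 2]]
```

The returned chunk keeps the original index labels (6, 7, 0 for the third call), so call reset_index(drop=True) on it if a fresh index is needed.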

Dan answered Mar 09 '26 05:03