How to iterate over rows in a DataFrame in Pandas

I have a DataFrame from Pandas:

    import pandas as pd

    inp = [{'c1': 10, 'c2': 100}, {'c1': 11, 'c2': 110}, {'c1': 12, 'c2': 120}]
    df = pd.DataFrame(inp)
    print(df)

Output:

       c1   c2
    0  10  100
    1  11  110
    2  12  120

Now I want to iterate over the rows of this frame. For every row I want to be able to access its elements (values in cells) by the name of the columns. For example:

    for row in df.rows:
        print(row['c1'], row['c2'])

Is it possible to do that in Pandas?

I found this similar question. But it does not give me the answer I need. For example, it is suggested there to use:

for date, row in df.T.iteritems(): 

or

for row in df.iterrows(): 

But I do not understand what the row object is and how I can work with it.

Roman asked May 10 '13 07:05





1 Answer

DataFrame.iterrows is a generator which yields both the index and row (as a Series):

    import pandas as pd

    df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})
    df = df.reset_index()  # make sure indexes pair with number of rows

    for index, row in df.iterrows():
        print(row['c1'], row['c2'])

Output:

    10 100
    11 110
    12 120
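As for what the row object actually is, the following is a minimal sketch rather than a definitive recipe. It assumes the same three-row DataFrame and skips the `reset_index()` step for brevity. The variable names (`pairs`, `by_name`, `by_attr`) are illustrative only. It also shows `DataFrame.itertuples`, a separate pandas method not mentioned above, which yields namedtuples instead of Series.

```python
import pandas as pd

df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})

# Each item from iterrows() is a 2-tuple: (index label, row as a Series).
pairs = list(df.iterrows())
first_index, first_row = pairs[0]
row_type = type(first_row).__name__  # 'Series'
labels = list(first_row.index)       # the column names: ['c1', 'c2']

# The Series is indexed by column name, so values can be read by name.
by_name = [(int(row['c1']), int(row['c2'])) for _, row in df.iterrows()]

# itertuples() yields namedtuples; fields are read as attributes instead.
by_attr = [(t.c1, t.c2) for t in df.itertuples(index=False)]

print(row_type, labels)
print(by_name)
print(by_attr)
```

Because `iterrows()` builds a new Series for every row, values in a frame with mixed column types may be upcast to a common dtype. If that matters, `itertuples()` is probably the safer choice of the two, at the cost of attribute access instead of `row['c1']`.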
waitingkuo answered Sep 21 '22 06:09
