Pandas Iterrows Row Number & Percentage

Tags: python, pandas

2 Answers

First of all, iterrows yields tuples of (index, row), so the loop has to unpack both:

for index, row in testDF.iterrows():

In general the index is not the row number; it is a label that identifies the row. (This is one of pandas' strengths, but it causes confusion because it does not behave like an ordinary Python list, where the index is the position.) That is why the row number has to be counted separately. We could introduce line_number = 0 and increment it on each iteration with line_number += 1, but Python already provides a tool for this: enumerate, which returns tuples of (line_number, value) instead of just value. That gives the following code:

for line_number, (index, row) in enumerate(testDF.iterrows()):
    print("Currently on row: {}; Currently iterated {}% of rows".format(
          line_number, 100*(line_number + 1)/len(testDF)))
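
The same idea can be wrapped in a small reusable generator. This is only a sketch: the name with_progress and the stand-in rows list are made up for illustration, and it assumes Python 3 division. It uses enumerate with start=1 so the counter is already the 1-based row number.

```python
def with_progress(pairs, total):
    """Yield (position, percent_done, label, row) for each (label, row) pair.

    `pairs` is any iterable of 2-tuples, for example DataFrame.iterrows();
    `total` is the number of pairs it will produce, for example len(df).
    """
    for position, (label, row) in enumerate(pairs, start=1):
        yield position, 100 * position / total, label, row


# Hypothetical stand-in for iterrows() output: labels that are not row numbers.
rows = [("b", {"x": 1}), ("a", {"x": 2}), ("b", {"x": 3}), ("c", {"x": 4})]

for position, percent, label, row in with_progress(rows, len(rows)):
    print("row {} (label {!r}): {:.1f}% done".format(position, label, percent))
```

With a real DataFrame the call would presumably look like with_progress(testDF.iterrows(), len(testDF)).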

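P.S. In Python 2, dividing two integers returns an integer, so 999/1000 == 0, which is probably not what you expect. You can either force a float or move the 100* to the front, as in the code above, to get an integer percentage. In Python 3, / always performs true division, so the result is a float.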
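
A quick way to see the difference is to spell out the floor division explicitly with //, which behaves the same in Python 2 and 3. The variable names below are just for illustration.

```python
done, total = 999, 1000

# Floor division: this is what Python 2's `/` does when both operands are ints.
floor_ratio = done // total

# Forcing one operand to float gives a true ratio in both Python 2 and 3.
true_ratio = done / float(total)

# Multiplying by 100 first keeps an integer percentage even with floor division.
scaled_first = 100 * done // total

print(floor_ratio)   # 0
print(true_ratio)    # 0.999
print(scaled_first)  # 99
```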

answered Sep 19 '22 23:09

Leonid Mednikov

One possible solution with format, if the index is the default unique, monotonic one (0, 1, 2, ...):

for i, row in testDF.iterrows():
    print("Currently on row: {}; Currently iterated {}% of rows".format(i, (i + 1)/len(testDF.index) * 100))

Sample:

import numpy as np
import pandas as pd

np.random.seed(1332)
testDF = pd.DataFrame(np.random.randint(10, size=(10, 3)))
print(testDF)
   0  1  2
0  8  1  9
1  4  3  5
2  0  1  3
3  1  8  6
4  7  4  7
5  7  5  3
6  7  9  9
7  0  1  2
8  1  3  4
9  0  0  3

for i, row in testDF.iterrows():
    print("Currently on row: {}; Currently iterated {}% of rows".format(i, (i + 1)/len(testDF.index) * 100))
Currently on row: 0; Currently iterated 10.0% of rows
Currently on row: 1; Currently iterated 20.0% of rows
Currently on row: 2; Currently iterated 30.0% of rows
Currently on row: 3; Currently iterated 40.0% of rows
Currently on row: 4; Currently iterated 50.0% of rows
Currently on row: 5; Currently iterated 60.0% of rows
Currently on row: 6; Currently iterated 70.0% of rows
Currently on row: 7; Currently iterated 80.0% of rows
Currently on row: 8; Currently iterated 90.0% of rows
Currently on row: 9; Currently iterated 100.0% of rows
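
Since this only works when the index really is 0, 1, 2, ..., it may be worth checking that before relying on it. A minimal sketch, assuming pandas is installed; has_default_index and the two small frames are hypothetical names used only for this illustration:

```python
import pandas as pd


def has_default_index(df):
    """Return True when the index is exactly 0, 1, ..., len(df) - 1."""
    return df.index.equals(pd.RangeIndex(len(df)))


default_df = pd.DataFrame({"a": [10, 20, 30]})
custom_df = pd.DataFrame({"a": [10, 20, 30]}, index=[2, 4, 2])

print(has_default_index(default_df))  # True
print(has_default_index(custom_df))   # False
```

If the check fails, the index values cannot be used as row numbers and the position has to be counted separately.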

EDIT:

If the index holds custom values, use zip with numpy.arange over the length of the index, which is the same as the length of the DataFrame:

np.random.seed(1332)
testDF = pd.DataFrame(np.random.randint(10, size=(10, 3)), index=[2,4,5,6,7,8,2,1,3,5])
print(testDF)
   0  1  2
2  8  1  9
4  4  3  5
5  0  1  3
6  1  8  6
7  7  4  7
8  7  5  3
2  7  9  9
1  0  1  2
3  1  3  4
5  0  0  3

for i, (idx, row) in zip(np.arange(len(testDF.index)), testDF.iterrows()):
    print("Currently on row: {}; Currently iterated {}% of rows".format(idx, (i + 1)/len(testDF.index) * 100))

Currently on row: 2; Currently iterated 10.0% of rows
Currently on row: 4; Currently iterated 20.0% of rows
Currently on row: 5; Currently iterated 30.0% of rows
Currently on row: 6; Currently iterated 40.0% of rows
Currently on row: 7; Currently iterated 50.0% of rows
Currently on row: 8; Currently iterated 60.0% of rows
Currently on row: 2; Currently iterated 70.0% of rows
Currently on row: 1; Currently iterated 80.0% of rows
Currently on row: 3; Currently iterated 90.0% of rows
Currently on row: 5; Currently iterated 100.0% of rows

answered Sep 19 '22 23:09

jezrael
