
Get Row Position instead of Row Index from iterrows() in Pandas

I'm new to Stack Overflow. I have researched this but have not found a satisfying answer.

I understand that I can get a row's index label by using df.iterrows() to iterate through a DataFrame. But what if I want the row position (0, 1, 2, ...) instead of the index label? What method can I use?

Example code that I'm working on is below:

import pandas as pd

df = pd.DataFrame({'month': ['Jan', 'Feb', 'March', 'April'],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]})

df = df.set_index('month')

for idx, value in df.iterrows():
    print(idx)  # prints the index labels: Jan, Feb, March, April

How can I get an output of:

0
1
2
3

Thanks!

learner asked May 23 '18 09:05



1 Answer

If you need the row number (position) instead of the index label, you should:

  1. Use enumerate to get a counter within the loop.
  2. Discard the index label rather than using it; see the options below.

Option 1

In most situations, for performance reasons, you should try to use df.itertuples instead of df.iterrows. You can specify index=False so that the first element of each tuple is not the index.

for idx, row in enumerate(df.itertuples(index=False)):
    print(idx)  # do something; idx is the 0-based row position

df.itertuples returns a namedtuple for each row.
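As a minimal sketch of Option 1, the snippet below rebuilds the DataFrame from the question and reads each row's fields by attribute. The names pos and positions are chosen here for illustration only and do not come from the original answer.

```python
import pandas as pd

# Same data as in the question, with 'month' as the index.
df = pd.DataFrame({'month': ['Jan', 'Feb', 'March', 'April'],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]}).set_index('month')

positions = []  # illustrative: collects the positions seen in the loop
for pos, row in enumerate(df.itertuples(index=False)):
    # pos is the 0-based row position; row is a namedtuple
    # whose fields are the remaining columns (year, sale).
    positions.append(pos)
    print(pos, row.year, row.sale)
```

Because index=False is passed, the month labels are not part of the tuple at all, so there is nothing to discard.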

Option 2

Use df.iterrows. This is more cumbersome, as you need to unpack and discard the unused index label. It is also slower than itertuples, because iterrows builds a Series for every row.

for idx, (_, row) in enumerate(df.iterrows()):
    print(idx)  # do something; idx is the 0-based row position
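If both the position and the index label are needed, the same pattern can keep the label instead of discarding it. This is a sketch on the question's data; the names pos, label, and pairs are illustrative and not part of the original answer.

```python
import pandas as pd

# Same data as in the question, with 'month' as the index.
df = pd.DataFrame({'month': ['Jan', 'Feb', 'March', 'April'],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]}).set_index('month')

pairs = []  # illustrative: (position, label, sale) for each row
for pos, (label, row) in enumerate(df.iterrows()):
    # pos is the 0-based position, label is the index value (the month),
    # and row is a Series holding that row's column values.
    pairs.append((pos, label, int(row['sale'])))

print(pairs)
```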
jpp answered Oct 13 '22 10:10