
Iterating over each element in pandas DataFrame

I have a pandas DataFrame with a single column and a lot of data.

I need to access each of the elements, not to change it (with apply()) but to parse it into another function.

When I loop through the DataFrame, it always stops after the first one.

If I convert it to a list first, my numbers are all in brackets (e.g. [12] instead of 12), which breaks my code.

Does anyone see what I am doing wrong?

import pandas as pd

def go_through_list(df):
  for number in df:
    print(number)

df = pd.read_csv("my_ids.csv")
go_through_list(df)

df looks like:

   1
0  2
1  3
2  4
dtype: object

Edit: I found one mistake. My first value is recognized as a header. So I changed my code to:

df = pd.read_csv("my_ids.csv",header=None)

But with

for ix in df.index:
    print(df.loc[ix])

I get:

0    1
Name: 0, dtype: int64
0    2
Name: 1, dtype: int64
0    3
Name: 2, dtype: int64
0    4
Name: 3, dtype: int64

Edit: Here is my solution, thanks to jezrael and Nick!

First I added header=None because my data has no header. Then I changed my function to:

def go_through_list(df):
    new_list = df[0].apply(my_function, parameter=par1)
    return new_list

And it works perfectly! Thank you again guys, problem solved.
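
For anyone who wants to run the pattern from the solution above, here is a minimal sketch. The question never shows my_function or par1, so the function body, the sample values, and passing par1 in as an argument are hypothetical stand-ins. The one thing it relies on from pandas is that Series.apply forwards extra keyword arguments to the function it calls.

```python
import pandas as pd


# Hypothetical stand-in: the real my_function is not shown in the question.
def my_function(value, parameter):
    return value * parameter


def go_through_list(df, par1):
    # df[0] selects the single column, which is labelled 0 when the file
    # is read with header=None. Extra keyword arguments given to apply()
    # are passed on to my_function for every element.
    return df[0].apply(my_function, parameter=par1)


# Same shape as a headerless one-column CSV read with header=None.
df = pd.DataFrame([1, 2, 3, 4])
new_list = go_through_list(df, par1=10)
print(new_list.tolist())
```

Note that apply() returns a new Series and leaves df untouched, so it also covers the "access but do not change" requirement.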

asked Mar 02 '16 21:03 by Ali


People also ask

How do I iterate through all elements in a pandas DataFrame?

To iterate over rows, you can use itertuples(), which returns a named tuple for each row in the DataFrame. The first element of the tuple is the row's index value, and the remaining elements are the row's values.
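
A small sketch of what that looks like in practice. The column names a and b and the values are made up for illustration.

```python
import pandas as pd

# Illustrative data; the column names are assumptions for this example.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

rows = []
for row in df.itertuples():
    # row.Index is the row's index label; row.a and row.b are the
    # values in that row, accessed by column name.
    rows.append((row.Index, row.a, row.b))

print(rows)
```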

How do I iterate over a pandas DataFrame column?

One simple way to iterate over the columns of a pandas DataFrame is a for loop over the column labels, selecting each column with the get-item syntax ([]). The values attribute returns a column's data as a NumPy array, and tolist() converts it to a plain Python list.
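
As a rough sketch of looping over the column labels (the DataFrame contents here are made up for illustration):

```python
import pandas as pd

# Illustrative data; the column names are assumptions for this example.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

columns = {}
for label in df.columns:
    # df[label] is a Series; tolist() turns it into a plain Python list.
    columns[label] = df[label].tolist()

print(columns)
```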

How do I iterate over pandas DataFrame index?

DataFrame.iterrows() iterates over DataFrame rows. It yields (index, Series) pairs, where index is the row's index label and the Series holds the row's data. To get a value from the Series, use the column name, for example row["Fee"].

How do you use Iterrows in pandas?

The iterrows() method returns an iterator over the rows of the DataFrame. Each iteration produces the row's index label and a row object (a pandas Series).


2 Answers

You can use the index as in other answers, and also iterate through the df and access the row like this:

for index, row in df.iterrows():
    print(row['column'])

However, I suggest solving the problem differently if performance is of any concern. Also, if there is only one column, a pandas Series is the more appropriate structure.
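
To illustrate the Series point, here is a minimal sketch with made-up data. Iterating over a DataFrame directly yields its column labels, not its values, which would explain a loop over a one-column DataFrame ending after a single item. Iterating over the Series yields the values themselves.

```python
import pandas as pd

# One column, labelled 0, like a headerless CSV read with header=None.
df = pd.DataFrame([1, 2, 3, 4])

# Looping over the DataFrame gives the column labels: just one item here.
labels = [item for item in df]

# Looping over the column as a Series gives the actual values.
series = df[0]
values = [item for item in series]

print(labels)
print(values)
```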

What do you mean by "parse it into another function"? Perhaps take the value, do something to it, and put the result into another column?

You wrote: "I need to access each of the element, not to change it (with apply()) but to parse it into another function."

Perhaps this example will help:

import pandas as pd
df = pd.DataFrame([20, 21, 12])
def square(x):
    return x**2
df['new_col'] = df[0].apply(square)  # can use a lambda here nicely

answered Sep 28 '22 11:09 by Nick Brady


You can select the column as a Series and convert it to a list with tolist():

for x in df['Colname'].tolist():
    print(x)

Sample:

import pandas as pd

df = pd.DataFrame({'a': pd.Series([1, 2, 3]),
                   'b': pd.Series([4, 5, 6])})
print(df)
   a  b
0  1  4
1  2  5
2  3  6

for x in df['a'].tolist():
    print(x)
1
2
3

If you have only one column, use iloc to select the first column by position:

for x in df.iloc[:, 0].tolist():
    print(x)

Sample:

import pandas as pd

df = pd.DataFrame({1: pd.Series([2, 3, 4])})
print(df)
   1
0  2
1  3
2  4

for x in df.iloc[:, 0].tolist():
    print(x)
2
3
4

This can work too, but it is not the recommended approach: the label 1 can be a number or a string, and using the wrong type raises a KeyError:

for x in df[1].tolist():
    print(x)
2
3
4
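
A short sketch of how that KeyError can come about. It assumes a file like the one in the question, simulated here with an in-memory string in place of my_ids.csv. When read_csv takes the first line as a header, the column label is the string '1', not the integer 1, while selecting by position with iloc works either way.

```python
import io

import pandas as pd

# Simulated file contents; stands in for the asker's my_ids.csv.
csv_text = "1\n2\n3\n4\n"

# Without header=None the first line becomes the column label.
df = pd.read_csv(io.StringIO(csv_text))
label_types = [type(label).__name__ for label in df.columns]
print(df.columns.tolist(), label_types)

# Selecting by position does not depend on the label's type.
by_position = df.iloc[:, 0].tolist()
print(by_position)

# Selecting by the integer label fails because the label is a string.
try:
    df[1]
    raised_key_error = False
except KeyError:
    raised_key_error = True
print(raised_key_error)
```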

answered Oct 01 '22 11:10 by jezrael