One simple way to iterate over columns of pandas DataFrame is by using for loop. You can use column-labels to run the for loop over the pandas DataFrame using the get item syntax ([]) . Yields below output. The values() function is used to extract the object elements as a list.
Vectorization is always the first and best choice. You can convert the data frame to NumPy array or into dictionary format to speed up the iteration workflow. Iterating through the key-value pair of dictionaries comes out to be the fastest way with around 280x times speed up for 20 million records.
Dataframe class provides a member function iteritems() which gives an iterator that can be utilized to iterate over all the columns of a data frame. For every column in the Dataframe it returns an iterator to the tuple containing the column name and its contents as series. Code : Python3.
DataFrame. iterrows() method is used to iterate over DataFrame rows as (index, Series) pairs. Note that this method does not preserve the dtypes across rows due to the fact that this method will convert each row into a Series .
for column in df:
print(df[column])
You can use iteritems()
:
for name, values in df.iteritems():
print('{name}: {value}'.format(name=name, value=values[0]))
This answer is to iterate over selected columns as well as all columns in a DF.
df.columns
gives a list containing all the columns' names in the DF. Now that isn't very helpful if you want to iterate over all the columns. But it comes in handy when you want to iterate over columns of your choosing only.
We can use Python's list slicing easily to slice df.columns according to our needs. For eg, to iterate over all columns but the first one, we can do:
for column in df.columns[1:]:
print(df[column])
Similarly to iterate over all the columns in reversed order, we can do:
for column in df.columns[::-1]:
print(df[column])
We can iterate over all the columns in a lot of cool ways using this technique. Also remember that you can get the indices of all columns easily using:
for ind, column in enumerate(df.columns):
print(ind, column)
You can index dataframe columns by the position using ix
.
df1.ix[:,1]
This returns the first column for example. (0 would be the index)
df1.ix[0,]
This returns the first row.
df1.ix[:,1]
This would be the value at the intersection of row 0 and column 1:
df1.ix[0,1]
and so on. So you can enumerate()
returns.keys():
and use the number to index the dataframe.
A workaround is to transpose the DataFrame
and iterate over the rows.
for column_name, column in df.transpose().iterrows():
print column_name
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With