I have a dataframe with 13 columns, and I have split the column names into two lists. I now want to perform different operations on the columns in each list.
Is it possible to pass a column name into pandas as a variable? My code can loop through the list fine, but I am having trouble passing the column name into the function.
Code:
CONT = ['age','fnlwgt','capital-gain','capital-loss']
#loops through columns
for column_name, column in df.transpose().iterrows():
    if column_name in CONT:
        X = column_name
        print(df.X.count())
    else:
        print('')
The values property returns a NumPy representation of the DataFrame. Only the values are returned; the axis labels are dropped. A DataFrame in which all columns have the same type (e.g., int64) produces an array of that same type.
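As a small illustration of the point about values, the sketch below builds a throwaway DataFrame (the column names and numbers are made up for the example) and checks what comes back. The dtype is set explicitly so the result does not depend on the platform default.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; every column is int64.
same_type = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, dtype="int64")
arr = same_type.values  # plain NumPy array, no row or column labels

# Mixing integers and strings forces a common dtype of object.
mixed = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
mixed_arr = mixed.values

print(type(arr), arr.dtype)
print(mixed_arr.dtype)
```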
When you'd like to access just one value in a pandas DataFrame, both the loc and at indexers will work fine. However, when you'd like to access a group of rows and columns, only loc is able to do so.
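A minimal sketch of that difference, using a made-up DataFrame (the row labels r0 to r2 and the numbers are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical sample data with explicit row labels.
df = pd.DataFrame(
    {"age": [25, 32, 47], "fnlwgt": [100, 200, 300]},
    index=["r0", "r1", "r2"],
)

# A single value: both indexers accept one row label and one column label.
single_at = df.at["r1", "age"]
single_loc = df.loc["r1", "age"]

# A group of rows and columns: loc accepts lists of labels.
block = df.loc[["r0", "r2"], ["age", "fnlwgt"]]

print(single_at, single_loc)
print(block)
```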
Try this:
for column_name, column in df.transpose().iterrows():
    if column_name in CONT:
        print(df[column_name].count())
    else:
        print('')
Edit:
To answer your question more precisely:
You can use variables to select columns in two ways: df[list_of_columns] returns a DataFrame with the subset of columns in list_of_columns, and df[column_name] returns the Series for column_name.
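Both forms can be sketched as follows. The sample values are invented for the example; only the column names come from the question. Attribute access such as df.X looks for a column literally named "X", and it cannot work at all for a name like capital-gain, which is why bracket indexing is used here.

```python
import pandas as pd

# Hypothetical sample data using column names from the question.
df = pd.DataFrame({
    "age": [39, 50, 38],
    "fnlwgt": [77516, 83311, 215646],
    "capital-gain": [2174, 0, 0],
})

# A single column name held in a variable returns a Series.
column_name = "capital-gain"
series = df[column_name]

# A list of column names returns a DataFrame with just those columns.
list_of_columns = ["age", "fnlwgt"]
subset = df[list_of_columns]

print(series.count())
print(subset.columns.tolist())
```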
I think you can use a subset created from the list CONT:
print(df)
  age fnlwgt  capital-gain
0   a    9th             5
1   b    9th             6
2   c    8th             3

CONT = ['age','fnlwgt']
print(df[CONT])
  age fnlwgt
0   a    9th
1   b    9th
2   c    8th

print(df[CONT].count())
age       3
fnlwgt    3
dtype: int64

print(df[['capital-gain']])
   capital-gain
0             5
1             6
2             3
A dictionary, created with to_dict, may be more convenient than a list:
d = df[CONT].count().to_dict()
print(d)
{'age': 3, 'fnlwgt': 3}
print(d['age'])
3
print(d['fnlwgt'])
3
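To tie this back to the original question about two lists of columns, here is a rough sketch that applies a different operation to each list. The second list (CAT), the workclass column, the sample values, and the choice of nunique for the second list are all assumptions made for illustration; substitute whatever operations you need.

```python
import pandas as pd

# Hypothetical sample data; "workclass" is an assumed non-continuous column.
df = pd.DataFrame({
    "age": [39, 50, None],
    "fnlwgt": [77516, 83311, 215646],
    "workclass": ["Private", "Private", "State-gov"],
})

CONT = ["age", "fnlwgt"]
# Every column not listed in CONT goes into the second list.
CAT = [c for c in df.columns if c not in CONT]

# Count non-missing values for the first list of columns.
cont_counts = df[CONT].count().to_dict()
# Count distinct values for the second list of columns.
cat_unique = df[CAT].nunique().to_dict()

print(cont_counts)
print(cat_unique)
```

Selecting with the list directly avoids transposing the DataFrame and looping over every column.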