I find myself doing repetitive tasks to various pandas DataFrames, so I made a function to do the processing. How do I modify `df` in the function `process_df(df)` so that the caller sees all changes (without assigning a return value)?

A simplified version of the code:

    import pandas as pd

    def process_df(df):
        df.columns = map(str.lower, df.columns)

    df = pd.DataFrame({'A': [1], 'B': [2]})
    process_df(df)
    print(df)

       A  B
    0  1  2

EDIT new code:

    import pandas as pd

    def process_df(df):
        df = df.loc[:, 'A']

    df = pd.DataFrame({'A': [1], 'B': [2]})
    process_df(df)
    print(df)

       A  B
    0  1  2

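Indexing a DataFrame using `loc`, `iloc`, etc. (or the older `ix`, which was removed in pandas 1.0) is a read operation: it returns a new object that is a view or a copy of the underlying data. Assigning that result to `df` inside the function only rebinds the local name, so the caller's frame is left unchanged. In order to modify the contents of the frame you will need to use in-place transforms. For example,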

    import pandas as pd

    def process_df(df):
        # drop all columns except for A
        df.drop(df.columns[df.columns != 'A'], axis=1, inplace=True)

    df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 2, 3]})
    process_df(df)

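The difference between rebinding and mutating can be checked side by side. This is a minimal sketch, not part of the original answer; the function names `rebind` and `drop_in_place` and the variable `frame` are made up for illustration.

```python
import pandas as pd


def rebind(df):
    # Hypothetical helper: this only rebinds the local name `df`;
    # the object the caller passed in is not touched.
    df = df.loc[:, ['A']]


def drop_in_place(df):
    # Hypothetical helper: this mutates the object the caller passed in.
    df.drop(columns=[c for c in df.columns if c != 'A'], inplace=True)


frame = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 2, 3]})

rebind(frame)
cols_after_rebind = list(frame.columns)

drop_in_place(frame)
cols_after_drop = list(frame.columns)

print(cols_after_rebind)  # ['A', 'B']
print(cols_after_drop)    # ['A']
```

The first call leaves both columns in place because the selection was only ever bound to a local variable; the second call removes `B` from the caller's frame.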
To change the order of columns, you can do something like this:

    def process_df(df):
        # swap A and B
        df.columns = ['B', 'A']
        df[['B', 'A']] = df[['A', 'B']]

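The same relabel-then-reassign idea can probably be generalized to any column order. The sketch below assumes unique column names and that `new_order` is a permutation of the existing columns; `reorder_in_place` is a hypothetical helper, not a pandas function.

```python
import pandas as pd


def reorder_in_place(df, new_order):
    # Hypothetical helper. Assumes unique column names and that
    # new_order is a permutation of df.columns.
    # Snapshot the data in the target order before relabelling.
    data = df[new_order].copy()
    # Relabel the existing columns so the labels appear in the new order.
    df.columns = new_order
    # Write each column's original data back under its own label.
    df[new_order] = data


frame = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
reorder_in_place(frame, ['C', 'A', 'B'])
print(frame)
```

After the call, `frame` should list its columns as `C`, `A`, `B`, with each label still holding its original values, and no return value is needed because the caller's object was modified.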