I currently have a list of Pandas DataFrames. I'm trying to perform an operation on each list element (i.e. each DataFrame contained in the list) and then save that DataFrame to a CSV file.
I assigned a name attribute to each DataFrame, but I realized that in some cases the program throws an error: AttributeError: 'DataFrame' object has no attribute 'name'.
Here's the code that I have.
# raw_og contains the file names for each CSV file.
# df_og is the list containing the DataFrame of each file.
for idx, file in enumerate(raw_og):
    df_og.append(pd.read_csv(os.path.join(data_og_dir, 'raw', file)))
    df_og[idx].name = file
# I'm basically checking if the DataFrame is in reverse-chronological order using the
# check_reverse function. If it is then I simply reverse the order and save the file.
for df in df_og:
    if (check_reverse(df)):
        df = df[::-1]
        df.to_csv(os.path.join(data_og_dir, 'raw_new', df.name), index=False)
    else:
        continue
The program is throwing an error in the second for loop where I used df.name.
This is especially strange because when I run print(df.name) it prints out the file name. Would anybody happen to know what I'm doing wrong?
Thank you.
The solution is to assign the values in place through an indexer, rather than rebinding df to a new object.
Slicing returns a new DataFrame, and that new object does not carry the name attribute:
df = df[::-1] # new object, name is lost
Assigning the values in place keeps the original object intact, along with name:
df.iloc[:] = df.iloc[:, ::-1] # reversal maintaining the original object
Example code that reverses values along the column axis:
df = pd.DataFrame([[6,10]], columns=['a','b'])
df.name='t'
print(df.name)
print(df)
df.iloc[:] = df.iloc[:,::-1]
print(df)
print(df.name)
Output:
t
   a   b
0  6  10
    a  b
0  10  6
t
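The example above reverses along the column axis, while the question is about reversing rows. A minimal sketch of the same in-place idea applied to rows follows; the column names and values are made up for illustration. It assumes all columns share one numeric dtype, and it moves only the values: the index labels stay where they are, which is harmless when saving with index=False.

```python
import pandas as pd

# Made-up sample data, newest row first.
df = pd.DataFrame({"day": [3, 2, 1], "value": [30, 20, 10]})
df.name = "prices.csv"  # ad-hoc attribute, as in the question

# Reverse the rows in place. to_numpy(copy=True) gives a plain array that is
# independent of df, so the assignment is positional and df stays the same object.
df.iloc[:] = df.iloc[::-1].to_numpy(copy=True)

days = list(df["day"])
kept_name = df.name
```

Because df is never rebound to a new object, df.name is still available afterwards.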
A workaround is to set columns.name and use it when needed.
Example:
df = pd.DataFrame()
df.columns.name = 'name'
print(df.columns.name)
name
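As a rough check that this workaround fits the reversal case, the sketch below stores a made-up file name in columns.name and reads it back from the reversed frame. The label lives on the column Index, and a row slice keeps that Index.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})
df.columns.name = "prices.csv"  # label stored on the column Index (made-up value)

reversed_df = df[::-1]          # new DataFrame object, same columns
label = reversed_df.columns.name
```

One thing to keep in mind is that columns.name also shows up as a header label when the DataFrame is printed.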
I suspect it's the reversal that loses the custom .name attribute.
In [11]: df = pd.DataFrame()
In [12]: df.name = 'empty'
In [13]: df.name
Out[13]: 'empty'
In [14]: df[::-1].name
AttributeError: 'DataFrame' object has no attribute 'name'
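If you want to keep using .name anyway, one minimal fix (a sketch, not the only option) is to copy the attribute onto the reversed frame by hand, since the slice is a new object that starts without it. The sample data here is made up.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})
df.name = "prices.csv"  # ad-hoc attribute set on this one object

rev = df[::-1]                      # a different DataFrame object
lost = not hasattr(rev, "name")     # the ad-hoc attribute did not come along

rev.name = df.name                  # copy it over explicitly
```

This works, but it has to be repeated after every operation that returns a new DataFrame, which is why keeping the name outside the DataFrame tends to be less fragile.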
You'll be better off storing a dict of dataframes rather than using .name:
df_og = {fn: pd.read_csv(os.path.join(data_og_dir, 'raw', fn)) for fn in raw_og}
Then you could iterate through this and reverse the values that need reversing...
for fn, df in df_og.items():
    if (check_reverse(df)):
        df = df[::-1]
        df.to_csv(os.path.join(data_og_dir, 'raw_new', fn), index=False)
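For completeness, here is a self-contained sketch of the dict approach that can be run as is. The check_reverse below is a hypothetical stand-in (the question does not show the real one), and the 'date' column, the file names and the temporary directory are all assumptions made for the demo.

```python
import os
import tempfile

import pandas as pd


def check_reverse(df):
    """Hypothetical stand-in: True if the assumed 'date' column runs newest-first."""
    dates = pd.to_datetime(df["date"])
    return dates.is_monotonic_decreasing and not dates.is_monotonic_increasing


# Temporary directory layout mirroring the question: raw/ for input, raw_new/ for output.
data_og_dir = tempfile.mkdtemp()
raw_dir = os.path.join(data_og_dir, "raw")
new_dir = os.path.join(data_og_dir, "raw_new")
os.makedirs(raw_dir)
os.makedirs(new_dir)

# Two made-up sample files: one newest-first, one already oldest-first.
desc = pd.DataFrame({"date": ["2020-01-03", "2020-01-02", "2020-01-01"], "value": [3, 2, 1]})
asc = pd.DataFrame({"date": ["2020-01-01", "2020-01-02", "2020-01-03"], "value": [1, 2, 3]})
desc.to_csv(os.path.join(raw_dir, "desc.csv"), index=False)
asc.to_csv(os.path.join(raw_dir, "asc.csv"), index=False)

raw_og = sorted(os.listdir(raw_dir))

# The file name is the dict key, so it never depends on the DataFrame object.
df_og = {fn: pd.read_csv(os.path.join(raw_dir, fn)) for fn in raw_og}

for fn, df in df_og.items():
    if check_reverse(df):
        df[::-1].to_csv(os.path.join(new_dir, fn), index=False)

written = sorted(os.listdir(new_dir))
fixed = pd.read_csv(os.path.join(new_dir, "desc.csv"))
```

Only the newest-first file gets rewritten, and the output name comes from the key fn rather than from an attribute that a slice can drop.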