Consider the following two lists, one of 3 dicts and one of 3 empty DataFrames:
dict0={'actual': {'2013-02-20 13:30:00': 0.93}}
dict1={'actual': {'2013-02-20 13:30:00': 0.85}}
dict2={'actual': {'2013-02-20 13:30:00': 0.98}}
dicts=[dict0, dict1, dict2]
df0=pd.DataFrame()
df1=pd.DataFrame()
df2=pd.DataFrame()
dfs=[df0, df1, df2]
I want to modify the 3 DataFrames iteratively within a loop, using the following lines:
for df, dikt in zip(dfs, dicts):
    df = df.from_dict(dikt, orient='columns', dtype=None)
However, when trying to retrieve for instance 1 of the df outside of the loop, it is still empty
print(df0)
prints
Empty DataFrame
Columns: []
Index: []
When printing the df from within the for loop, we can see the data is correctly appended though.
How can I write the loop so that all 3 DataFrames still show their changes when printed outside of the loop?
In your loop, df is just a local name bound to each list element in turn; assigning to it rebinds the name and leaves the list unchanged. If you want to replace the elements of the list while iterating over it, you have to assign through the list index. You can get the index using Python's enumerate:
for i, (df, dikt) in enumerate(zip(dfs, dicts)):
    dfs[i] = df.from_dict(dikt, orient='columns', dtype=None)
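To see why assigning to the loop variable has no effect on the list, here is a minimal sketch with plain Python lists standing in for the DataFrames. The names items, sources and after_rebinding are illustrative only and do not come from the question.

```python
# Plain lists stand in for the DataFrames; all names here are illustrative.
items = [[], [], []]
sources = [[1], [2], [3]]

# Rebinding the loop variable: the list `items` is untouched.
for item, src in zip(items, sources):
    item = list(src)  # `item` now names a new list; the slot in `items` is unchanged

after_rebinding = [list(x) for x in items]  # snapshot of the list at this point

# Assigning through the index: each list slot now refers to the new object.
for i, src in enumerate(sources):
    items[i] = list(src)

print(after_rebinding)  # [[], [], []]
print(items)            # [[1], [2], [3]]
```

The same rule applies to DataFrames: only the assignment to dfs[i] changes what the list holds.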
This will get it done in place!!!
Please note the 3 exclamations
one liner
[dfs[i].set_value(r, c, v)
 for i, dn in enumerate(dicts)
 for r, dr in dn.items()
 for c, v in dr.items()];
somewhat more intuitive
for d, df in zip(dicts, dfs):
    temp = pd.DataFrame(d).stack()
    for (r, c), v in temp.iteritems():
        df.set_value(r, c, v)
df0
                     actual
2013-02-20 13:30:00    0.93
equivalent alternative without the pd.DataFrame construction
for i, dn in enumerate(dicts):
    for r, dr in dn.items():
        for c, v in dr.items():
            dfs[i].set_value(r, c, v)
Why is this different?
All the other answers, so far, reassign a new dataframe to the requisite position in the list of dataframes. They clobber the dataframe that was there. The original dataframe is left empty while a new non-empty one rests in the list.
This solution edits the dataframe in place ensuring the original dataframe is updated with new information.
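The difference can be checked with an identity test. The sketch below assumes a recent pandas release, where DataFrame.set_value and Series.iteritems are no longer available, so it uses label-based assignment with .loc as a stand-in for set_value. The names data, a0, a_list, b0 and b_list are illustrative only.

```python
import pandas as pd

data = {'actual': {'2013-02-20 13:30:00': 0.93}}

# Reassignment: the list slot gets a new DataFrame, the original stays empty.
a0 = pd.DataFrame()
a_list = [a0]
a_list[0] = pd.DataFrame.from_dict(data, orient='columns')
print(a_list[0] is a0, a0.empty)  # False True

# In-place edit: .loc enlarges the existing object, so b0 itself gets the data.
b0 = pd.DataFrame()
b_list = [b0]
for col, rows in data.items():
    for row, value in rows.items():
        b_list[0].loc[row, col] = value
print(b_list[0] is b0, b0.empty)  # True False
```

Which variant fits depends on whether other code still holds a reference to the original objects (here a0 and b0) or only reads them back through the list.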
Per OP:
However, when trying to retrieve for instance 1 of the df outside of the loop, it is still empty
timing
It's also considerably faster
setup
dict0={'actual': {'2013-02-20 13:30:00': 0.93}}
dict1={'actual': {'2013-02-20 13:30:00': 0.85}}
dict2={'actual': {'2013-02-20 13:30:00': 0.98}}
dicts=[dict0, dict1, dict2]
df0=pd.DataFrame()
df1=pd.DataFrame()
df2=pd.DataFrame()
dfs=[df0, df1, df2]