Loop through a dictionary of dataframes

I have a set of dataframes that represent demand scenarios, which I have put into a dictionary. I need to loop through each dataframe in the dictionary to reindex, resample, etc., and then return it to the dictionary. The code below works perfectly when I loop through a list of dataframes, but I need to maintain the identity of each scenario, hence the dictionary.

This is the code that works with a list of dataframes:

demand_dfs_list = [low_demand_df, med_low_demand_df, bc_demand_df, med_high_demand_df, high_demand_df]
dates = pd.date_range(start='2020-10-01', end='2070-09-30', freq='D')

demand_dfs_datetime = []
for df in demand_dfs_list:
    df.index = pd.to_datetime(df.index, format='%Y')
    df = df.tshift(-92, 'D')
    df = df.resample('D').ffill()
    df = df.reindex(dates)
    demand_dfs_datetime.append(df)

This is what I have tried in dictionary form:

demand_scenarios = {'low': low_demand_df, 'medium_low': med_low_demand_df, 'medium': bc_demand_df, 'medium_high': med_high_demand_df, 'high': high_demand_df}
dates = pd.date_range(start='2020-10-01', end='2070-09-30', freq='D')

demand_dict = {}
    for df in demand_scenarios:
        [df].index = pd.to_datetime([df].index, format='%Y')
        [df] = [df].tshift(-92, 'D')
        [df] = [df].resample('D').ffill()
        [df] = [df].reindex(dates)
        demand_dict[df] = df

Follow-up question: I passed the demand_dict dictionary above into an xarray Dataset using the following:

demand_xarray = xr.Dataset(demand_dict, coords = {'customers': customers, 'time': dates})

However my dataset looks like the following:

<xarray.Dataset>
Dimensions:      (customers: 28, dim_0: 18262, dim_1: 28, time: 18262)
Coordinates:
  * dim_0        (dim_0) datetime64[ns] 2020-10-01 2020-10-02 ... 2070-09-30
  * dim_1        (dim_1) object 'Customer_1' ... 'Customer_x'
  * customers    (customers) <U29 'Customer_1' ... 'Customer_x'
  * time         (time) datetime64[ns] 2020-10-01 2020-10-02 ... 2070-09-30
Data variables:
    low          (dim_0, dim_1) float64 0.52 0.528 3.704 ... 7.744 0.92 64.47
    medium_low   (dim_0, dim_1) float64 0.585 0.594 4.167 ... 8.712 1.035 72.53
    medium       (dim_0, dim_1) float64 0.65 0.66 4.63 12.6 ... 9.68 1.15 80.59
    medium_high  (dim_0, dim_1) float64 0.715 0.726 5.093 ... 10.65 1.265 88.65
    high         (dim_0, dim_1) float64 0.78 0.792 5.556 ... 11.62 1.38 96.71

When I try to use the drop_dims method like so:

demand_xarray = xr.Dataset(demand_dict, coords = {'customers': customers, 'time': dates}).drop_dims(dim_0, dim_1)

I get the error:

AttributeError: 'Dataset' object has no attribute 'drop_dims'

Any idea why I am getting this error?

AlexaB asked Mar 28 '19 01:03


1 Answer

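import pandas as pd

demand_scenarios = {'low': low_demand_df, 'medium_low': med_low_demand_df, 'medium': bc_demand_df, 'medium_high': med_high_demand_df, 'high': high_demand_df}
dates = pd.date_range(start='2020-10-01', end='2070-09-30', freq='D')

demand_dict = {}
for key, df in demand_scenarios.items():
    # Parse the year labels into datetimes (January 1 of each year)
    df.index = pd.to_datetime(df.index, format='%Y')
    # Move the index back 92 days; shift(freq=...) replaces tshift, which was removed in pandas 2.0
    df = df.shift(-92, freq='D')
    df = df.resample('D').ffill()  # upsample to daily, forward-filling values
    df = df.reindex(dates)  # align to the full daily date range
    demand_dict[key] = df  # store under the original scenario name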

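items() returns each key of the dictionary together with its value, so inside the loop key is the scenario name and df is the dataframe itself. In your attempt, iterating over the dictionary directly yields only the keys (strings), and [df] just wraps that string in a list, which has no index.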
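As a rough, self-contained sketch of the same pattern, the snippet below runs the loop on two small made-up dataframes. The names low, high, scenarios and to_daily are hypothetical, as are the values and the shortened date range; they only stand in for the real scenario data. It also works on a copy so the input dataframes are left unchanged, which is a design choice and not something the original code does.

```python
import pandas as pd

# Hypothetical toy data: one customer, annual values labelled by year
low = pd.DataFrame({"Customer_1": [1.0, 2.0, 3.0]}, index=[2021, 2022, 2023])
high = pd.DataFrame({"Customer_1": [10.0, 20.0, 30.0]}, index=[2021, 2022, 2023])
scenarios = {"low": low, "high": high}

# Shortened daily range for the illustration
dates = pd.date_range(start="2020-10-01", end="2022-09-30", freq="D")


def to_daily(df, dates):
    """Return a daily copy of an annual dataframe, aligned to `dates`."""
    out = df.copy()  # leave the caller's dataframe untouched
    # Year labels -> datetimes (January 1 of each year)
    out.index = pd.to_datetime(out.index.astype(str), format="%Y")
    # Move each timestamp back 92 days (January 1 -> previous October 1)
    out = out.shift(periods=-92, freq="D")
    # Upsample to daily and carry each annual value forward
    out = out.resample("D").ffill()
    # Keep exactly the requested dates
    return out.reindex(dates)


demand_dict = {}
for key, df in scenarios.items():
    # key is the scenario name, df is its dataframe
    demand_dict[key] = to_daily(df, dates)

print(list(demand_dict))
print(demand_dict["low"].head(3))
```

A dict comprehension such as {key: to_daily(df, dates) for key, df in scenarios.items()} would build the same dictionary in one line.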

Mauricio Cortazar answered Oct 17 '22 15:10