
Elegant way to work with multiple dataframes in Pandas

I have a bit of code which currently looks like this:

if os.path.isfile('D:\\df_1'):
    df_1 = pd.read_pickle('D:\\df_1')
else:
    df_1 = pd.DataFrame(columns = ['Date', 'Location', 'Product'])
if os.path.isfile('D:\\df_2'):
    df_2 = pd.read_pickle('D:\\df_2')
else:
    df_2 = pd.DataFrame(columns = ['Date', 'Location', 'Product'])

[...]

if os.path.isfile('D:\\df_20'):
    df_20 = pd.read_pickle('D:\\df_20')
else:
    df_20 = pd.DataFrame(columns = ['Date', 'Location', 'Product'])

Basically, I check whether each dataframe already exists on disk: if it does, I load it; otherwise I create an empty dataframe. I need this because the code then appends new data to each of the dataframes. So I will have something like:

[retrieve new data and clean it]
df_1 = pd.concat([df_1, df_1_new_data])

I do this for all 20 dataframes (they contain different things, so I want to keep them separate), and then save them so that I can load them again the next day and add new data to them:

df_1.to_pickle('D:\\df_1')
df_2.to_pickle('D:\\df_2')
[...]
df_20.to_pickle('D:\\df_20')

This is already quite heavy with 20 dataframes, and I will probably need to add more. Is there a way to read the different dataframes, and then write them back to pickle, in a for loop or something similar, so that the many lines I have now shrink to a simple two-line loop? Thank you!

asked Sep 13 '25 22:09 by giga


1 Answer

DRY (don't repeat yourself): you shouldn't write the same thing many times (more than once, really).

Use functions, loops, and other basic language tools.

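import os

import pandas as pd

def create_df(path):
    # Load the pickled dataframe if the file exists, otherwise start with an empty one
    if os.path.isfile(path):
        df = pd.read_pickle(path)
    else:
        df = pd.DataFrame(columns=['Date', 'Location', 'Product'])
    return df

all_paths = (...)

# dict where the key is your path and the value is the dataframe
all_df = {p: create_df(p) for p in all_paths}

# [retrieve new data and append it to each dataframe here]

for p in all_paths:
    all_df[p].to_pickle(p)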
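
For completeness, here is a rough end-to-end sketch of the same idea, covering the load, append and save steps in one loop. It is only an illustration under some assumptions: the names `load_or_create`, `update_all`, `COLUMNS`, `paths` and `new_rows` are made up for this example, a temporary folder stands in for the real `D:\` location, and the sample row is dummy data. The check for an empty dataframe is there because recent pandas versions may warn when concatenating empty frames.

```python
import tempfile
from pathlib import Path

import pandas as pd

COLUMNS = ["Date", "Location", "Product"]


def load_or_create(path):
    """Return the pickled dataframe at `path`, or an empty one if the file is missing."""
    path = Path(path)
    if path.is_file():
        return pd.read_pickle(path)
    return pd.DataFrame(columns=COLUMNS)


def update_all(paths, new_data):
    """Load every dataframe, append its new rows (if any), and save it back."""
    frames = {}
    for name, path in paths.items():
        df = load_or_create(path)
        extra = new_data.get(name)
        if extra is not None and not extra.empty:
            # Skip the concat when nothing is stored yet, so no empty frame is concatenated.
            df = extra.copy() if df.empty else pd.concat([df, extra], ignore_index=True)
        df.to_pickle(path)
        frames[name] = df
    return frames


# Hypothetical demo: a temporary folder stands in for the real storage location.
data_dir = Path(tempfile.mkdtemp())
paths = {f"df_{i}": data_dir / f"df_{i}" for i in range(1, 4)}
new_rows = pd.DataFrame([["2025-09-13", "Paris", "Widget"]], columns=COLUMNS)

update_all(paths, {"df_1": new_rows})           # first run creates the files
frames = update_all(paths, {"df_1": new_rows})  # second run appends to them
```

Keying the dict by a short name rather than by the full path keeps lookups like `frames["df_1"]` readable, and adding a 21st dataframe only means adding one more entry to `paths`.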

answered Sep 16 '25 13:09 by Grail Finder