In R there is a function called assign, which assigns a value to a name in the environment.
E.g.:
assign("Hello", 2)
> Hello
[1] 2
In Python I can't seem to do the same. I initially tried:
import numpy as np
import pandas as pd
import os
for file in os.listdir('C:\\Users\\Olivia\\Documents'):
    if file.endswith(".csv"):
        os.path.splitext(file)[0] = pd.read_csv('C:\\Users\\Olivia\\Documents\\' + file)
But I can see this is trying to assign a data frame to a string, which doesn't work.
I managed to get all the files in a list by doing:
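import glob
import os
import pandas as pd
# Full paths of every CSV file in the folder
dl = glob.glob(r'C:\Users\Olivia\Documents\*.csv')
# File names without the ".csv" extension
nl = []
for i in dl:
    pl = i.split(os.sep)
    name = pl[5][:-4]
    nl.append(name)
# Map each name to its path
ddict = {}
for k, v in zip(nl, dl):
    ddict[k] = ddict.get(k, "") + v
# Read every file into a list of data frames
dfl = []
for k, v in ddict.items():
    dfl.append(pd.read_csv(v))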
But now how do I get each data frame out of the list and named after its file, without the extension? There must be a way to assign each data frame in the list to a name from the file list.
Honestly, you were on the right track with your first method. Unfortunately, Python doesn't give you a clean way to create a "variable number of variables" dynamically, as you have already tried and realised. However, you can create a dictionary and assign dataframes to string keys as you like. Here's how.
root = 'C:\\Users\\Olivia\\Documents'
ddict = {}
for file in os.listdir(root):
    if file.endswith(".csv"):
        name = os.path.splitext(file)[0]
        ddict[name] = pd.read_csv(os.path.join(root, file))
Another way of building this dictionary is using a dict comprehension:
ddict = {
    os.path.splitext(file)[0]: pd.read_csv(os.path.join(root, file))
    for file in os.listdir(root)
    if file.endswith('.csv')
}
Now, referring to a single dataframe is as easy as
ddict['your_file_name']
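To make the lookup concrete, here is a minimal sketch that can be run anywhere. It assumes two made-up files, sales.csv and costs.csv, written to a temporary folder purely for illustration; in practice root would be your own Documents folder.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, only here so the example runs on its own.
root = tempfile.mkdtemp()
pd.DataFrame({"x": [1, 2]}).to_csv(os.path.join(root, "sales.csv"), index=False)
pd.DataFrame({"x": [3, 4, 5]}).to_csv(os.path.join(root, "costs.csv"), index=False)

# Build the name -> DataFrame dictionary with the same loop as before.
ddict = {}
for file in os.listdir(root):
    if file.endswith(".csv"):
        name = os.path.splitext(file)[0]
        ddict[name] = pd.read_csv(os.path.join(root, file))

# Look up one frame by its file name (without the extension).
sales = ddict["sales"]

# Or loop over every frame together with its name.
for name, df in sorted(ddict.items()):
    print(name, df.shape)
```

The keys play the role that the variable names would have played in R, so anything you would have done with a named variable you can do with ddict[name] instead.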
Another thing to note: the safest way to join path components is os.path.join. It's safer than a plain +, because it inserts the correct separator for you.
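A small sketch of the difference, using a hypothetical relative folder and file name (neither comes from your machine):

```python
import os

# Hypothetical folder and file name, for illustration only.
root = os.path.join("Users", "Olivia", "Documents")
file = "data.csv"

# os.path.join puts the platform's separator between the parts.
joined = os.path.join(root, file)

# Plain + glues the strings together with no separator at all,
# so the result ends in "Documentsdata.csv".
concatenated = root + file

# os.path.join also copes with a root that already ends in a separator.
joined_trailing = os.path.join(root + os.sep, file)

print(joined)
print(concatenated)
print(joined == joined_trailing)
```

With + you have to remember the trailing separator yourself, as in your first attempt; os.path.join gives the same result whether or not it is there.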