I'm trying to write a script that imports a file, then does something with the file and outputs the result into another file.
df = pd.read_csv('somefile2018.csv')
The above code works perfectly fine. However, I'd like to avoid hardcoding the file name in the code.
The script will be run in a folder (directory) that contains script.py and several CSV files.
I've tried the following:
somefile_path = glob.glob('somefile*.csv')
df = pd.read_csv(somefile_path)
But I get the following error:
ValueError: Invalid file path or buffer object type: <class 'list'>
glob.glob returns a list of matching paths, not a single string, while read_csv expects one path (or a file-like object). Loop over the matches instead:
for f in glob.glob('somefile*.csv'):
    df = pd.read_csv(f)
    ...
    # the rest of your script
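If exactly one file is expected to match, as in the question, a small helper can unwrap the list and fail loudly when the pattern matches nothing or more than one file. This is only a minimal sketch: `single_match` is a hypothetical name chosen for illustration, not part of glob or pandas.

```python
import glob


def single_match(pattern):
    """Return the one path matching pattern; raise if there are zero or several."""
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"no file matches {pattern!r}")
    if len(matches) > 1:
        raise ValueError(f"{pattern!r} is ambiguous: {matches}")
    return matches[0]


# Usage (assumes pandas is imported as pd):
# df = pd.read_csv(single_match('somefile*.csv'))
```

Raising on an ambiguous match is a design choice; picking the first or the newest file would also be reasonable, depending on what the script is for.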
To read all of the files that follow a certain pattern, so long as they share the same schema, use this function:
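import glob
import pandas as pd

def pd_read_pattern(pattern):
    # Sort the matches so the row order is the same on every run
    files = sorted(glob.glob(pattern))
    if not files:
        return pd.DataFrame()
    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
    # and avoids copying the accumulated frame on every iteration
    return pd.concat([pd.read_csv(f) for f in files], ignore_index=True)

df = pd_read_pattern('somefile*.csv')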
This will work with either an absolute or relative path.
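Note that a relative pattern is resolved against the current working directory, which is not necessarily the folder holding script.py. If the files should always be found next to the script, one option is to build the search from the script's own location. The sketch below uses pathlib; `csvs_next_to` is a hypothetical helper name, and the default pattern is simply the one from the question.

```python
from pathlib import Path


def csvs_next_to(script_file, pattern="somefile*.csv"):
    """List files matching pattern in the directory that contains script_file."""
    folder = Path(script_file).resolve().parent
    # Path.glob yields Path objects; sort them for a stable order
    return sorted(folder.glob(pattern))


# Usage inside script.py (assumes pandas is imported as pd):
# for path in csvs_next_to(__file__):
#     df = pd.read_csv(path)
```

pd.read_csv accepts Path objects as well as strings, so the results can be passed to it directly.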
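You can list the CSV files in the current working directory and loop over them.
import os
from os import listdir
from os.path import isfile, join
import pandas as pd

mypath = os.getcwd()
# endswith avoids matching names that merely contain '.csv', such as 'data.csv.bak'
csvfiles = [f for f in listdir(mypath) if isfile(join(mypath, f)) and f.endswith('.csv')]
for f in csvfiles:
    df = pd.read_csv(join(mypath, f))
    # the rest of your script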