I get the error "No objects to concatenate". I cannot import the .csv files from main and its subdirectories and concatenate them into one DataFrame. I am using pandas. Older answers did not help me, so please do not mark this as a duplicate.
The folder structure looks like this:
main/*.csv
main/name1/name1/*.csv
main/name1/name2/*.csv
main/name2/name1/*.csv
main/name3/*.csv
import pandas as pd
import os
import glob
folder_selected = 'C:/Users/jacob/Documents/csv_files'
frame = pd.concat(map(pd.read_csv, glob.iglob(os.path.join(folder_selected, "/*.csv"))))
csv_paths = glob.glob('*.csv')
dfs = [pd.read_csv(folder_selected) for folder_selected in csv_paths]
df = pd.concat(dfs)
all_files = []
all_files = glob.glob (folder_selected + "/*.csv")
file_path = []
for file in all_files:
    df = pd.read_csv(file, index_col=None, header=0)
    file_path.append(df)
frame = pd.concat(file_path, axis=0, ignore_index=False)
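You need to search the subdirectories recursively. A ** in the pattern, combined with recursive=True, matches .csv files in folder itself and in any nested subfolder.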
folder = 'C:/Users/jacob/Documents/csv_files'
path = folder+"/**/*.csv"
glob.iglob
df = pd.concat(map(pd.read_csv, glob.iglob(path, recursive=True)))
glob.glob
csv_paths = glob.glob(path, recursive=True)
dfs = [pd.read_csv(csv_path) for csv_path in csv_paths]
df = pd.concat(dfs)
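pd.concat raises "No objects to concatenate" whenever the list it receives is empty, so it can help to check the match list before reading anything. Below is a minimal sketch using only the standard library. The helper name find_csv_files is made up for this example and is not part of glob or pandas.

```python
import glob
import os


def find_csv_files(folder):
    """Return all .csv paths under folder, including nested subfolders."""
    # '**' plus recursive=True matches zero or more directory levels,
    # so files directly inside folder are included as well.
    pattern = os.path.join(folder, '**', '*.csv')
    paths = sorted(glob.glob(pattern, recursive=True))
    if not paths:
        # Fail early with a message that names the pattern that was searched.
        raise FileNotFoundError(f'no .csv files matched {pattern!r}')
    return paths
```

The returned list can be used in place of csv_paths in the list comprehension above. If the search comes back empty, the error then points at the pattern instead of at pd.concat.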
os.walk
import fnmatch

file_paths = []
for base, dirs, files in os.walk(folder):
    # keep only the .csv files found in the current directory
    for file in fnmatch.filter(files, '*.csv'):
        file_paths.append(os.path.join(base, file))
df = pd.concat([pd.read_csv(file) for file in file_paths])
pathlib
from pathlib import Path
files = Path(folder).rglob('*.csv')
df = pd.concat(map(pd.read_csv, files))
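As a side note on why the attempts in the question probably come back empty, here is a small sketch that is not part of the original answer. It assumes Windows path rules for the first case, so it calls ntpath directly (the Windows flavour of os.path) to behave the same on any OS. posixpath is shown only for comparison.

```python
import ntpath      # Windows flavour of os.path, importable on any OS
import posixpath   # POSIX flavour, shown for comparison

folder_selected = 'C:/Users/jacob/Documents/csv_files'

# A leading slash marks the second argument as rooted, so join()
# throws away the directory part of the first argument.
broken_win = ntpath.join(folder_selected, '/*.csv')
broken_posix = posixpath.join(folder_selected, '/*.csv')

# Without the leading slash the folder is kept.
fixed_win = ntpath.join(folder_selected, '*.csv')

print(broken_win)
print(broken_posix)
print(fixed_win)
```

Under Windows rules the first join yields C:/*.csv, so the search runs at the drive root instead of inside csv_files. The second attempt, glob.glob('*.csv'), is relative to the current working directory rather than to folder_selected. The third attempt uses a correct pattern but only looks at the top-level folder.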
Have a look at the Dask library, which can read many files into one DataFrame:
>>> import dask.dataframe as dd
>>> df = dd.read_csv('data*.csv')
See their docs: https://examples.dask.org/dataframes/01-data-access.html#Read-CSV-files