I want to read multiple files located in the same directory and then merge them into a single pandas data frame.
It works if I do it this way:
import pandas as pd
df1 = pd.read_csv("data/12015.csv")
df2 = pd.read_csv("data/22015.csv")
df3 = pd.read_csv("data/32015.csv")
df = pd.concat([df1, df2, df3])
However, I want a more elegant solution, one that would be especially useful when there are more than 3 files.
I tried this approach, but I don't know how to apply concat inside the for loop.
import pandas as pd
import os
from os import path
files = [x for x in os.listdir("data") if path.isfile("data"+os.sep+x)]
for f in files:
    df = pd.read_csv("data/"+f)
You can use a list comprehension to build the list of DataFrames and then call pd.concat() on that list. Example:
import pandas as pd
import os
from os import path
dfs = [pd.read_csv(path.join('data',x)) for x in os.listdir("data") if path.isfile(path.join("data",x))]
df = pd.concat(dfs)
You should also consider using os.path.join() to build the paths, as I have done here, rather than concatenating the strings yourself.
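If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the function name read_csv_dir is made up for illustration, and sorting the file names and passing ignore_index=True are my own additions rather than something the question asked for. It assumes every regular file in the directory is a CSV with the same columns.

```python
import os
from os import path

import pandas as pd


def read_csv_dir(directory):
    """Hypothetical helper: read every regular file in `directory` as CSV
    and stack the rows into one DataFrame."""
    # os.listdir() returns names in arbitrary order, so sort them to get
    # a reproducible row order in the result.
    names = sorted(os.listdir(directory))
    paths = [path.join(directory, name) for name in names]
    # Skip subdirectories, as in the list comprehension above.
    dfs = [pd.read_csv(p) for p in paths if path.isfile(p)]
    # ignore_index=True renumbers the rows 0..n-1 instead of repeating
    # each file's own 0-based index.
    return pd.concat(dfs, ignore_index=True)
```

Usage would then be df = read_csv_dir("data"). Note that pd.concat() raises a ValueError when given an empty list, so a directory with no files needs to be handled separately.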