I have multiple CSV files with values like this in a folder:
The GroupID.csv is the filename. There are multiple files like this, but the value ranges are defined in the same XML file. I'm trying to group them. How can I do that?
UPDATE1: Based on BobHaffner's comments, I've done this:

    import glob
    import os

    import pandas as pd

    path = r'path/to/files'
    allFiles = glob.glob(path + "/*.csv")

    list_ = []
    for file_ in allFiles:
        df = pd.read_csv(file_, index_col=None, header=None)
        # Record which file each row came from
        df['file'] = os.path.basename(file_)
        list_.append(df)

    frame = pd.concat(list_)
    print(frame)
to get something like this:
I need to group the values based on the bins from the XML file. I'd truly appreciate any help.
To bucket your series, use the pd.cut() function, like this:

    df['bin'] = pd.cut(df['1'], [0, 50, 100, 200])

             0    1        file         bin
    0  person1   24     age.csv     (0, 50]
    1  person2   17     age.csv     (0, 50]
    2  person3   98     age.csv   (50, 100]
    3  person4    6     age.csv     (0, 50]
    4  person2  166  Height.csv  (100, 200]
    5  person3  125  Height.csv  (100, 200]
    6  person5  172  Height.csv  (100, 200]
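For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame by hand instead of reading the CSV files, and it assumes integer column names (0 and 1), which is what pd.read_csv(..., header=None) would normally give you; if your columns are the strings '0' and '1', index with df['1'] as in the snippet above. It also shows how values on or outside the outer edges are handled, which is easy to trip over.

```python
import pandas as pd

# Sample frame mirroring the output above. Integer column names are an
# assumption based on loading with header=None.
df = pd.DataFrame({
    0: ["person1", "person2", "person3", "person4",
        "person2", "person3", "person5"],
    1: [24, 17, 98, 6, 166, 125, 172],
    "file": ["age.csv"] * 4 + ["Height.csv"] * 3,
})

edges = [0, 50, 100, 200]

# By default each interval is open on the left and closed on the right,
# e.g. (0, 50] contains 50 but not 0.
df["bin"] = pd.cut(df[1], edges)

# Count rows per file and bin; observed=True drops bins with no rows.
counts = df.groupby(["file", "bin"], observed=True).size()
print(counts)

# Values outside the edges, or exactly equal to the lowest edge, become NaN
# unless include_lowest=True is passed to pd.cut.
outside = pd.cut(pd.Series([0, 250]), edges)
print(outside.isna().tolist())
```

If a value of exactly 0 should land in the first bin, passing include_lowest=True to pd.cut is the usual way to get that.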
If you want to name the bins yourself, you can use the labels= argument, like this:

    df['bin'] = pd.cut(df['1'], [0, 50, 100, 200], labels=['0-50', '50-100', '100-200'])

             0    1        file      bin
    0  person1   24     age.csv     0-50
    1  person2   17     age.csv     0-50
    2  person3   98     age.csv   50-100
    3  person4    6     age.csv     0-50
    4  person2  166  Height.csv  100-200
    5  person3  125  Height.csv  100-200
    6  person5  172  Height.csv  100-200
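Since the question says the ranges live in an XML file, here is a rough sketch of how the edges could be read from it and applied per file. The XML layout below (a group element per CSV file with bin children carrying low and high attributes) is entirely made up, as are the Height.csv ranges and the helper names load_edges and bin_per_file; the real schema isn't shown in the question, so the parsing part will need adapting.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical XML layout; the real file's schema is not shown in the question.
XML_TEXT = """
<groups>
  <group file="age.csv">
    <bin low="0" high="50"/>
    <bin low="50" high="100"/>
  </group>
  <group file="Height.csv">
    <bin low="100" high="150"/>
    <bin low="150" high="200"/>
  </group>
</groups>
"""


def load_edges(xml_text):
    """Return {filename: sorted list of bin edges} from the assumed layout."""
    edges = {}
    for group in ET.fromstring(xml_text.strip()).iter("group"):
        bounds = set()
        for b in group.iter("bin"):
            bounds.update((float(b.get("low")), float(b.get("high"))))
        edges[group.get("file")] = sorted(bounds)
    return edges


def bin_per_file(df, edges, value_col=1, file_col="file"):
    """Apply pd.cut to each file's rows using that file's own edges."""
    parts = []
    for name, part in df.groupby(file_col, sort=False):
        part = part.copy()
        # Store bins as strings so files with different edges share one column.
        part["bin"] = pd.cut(part[value_col], edges[name]).astype(str)
        parts.append(part)
    # Restore the original row order after processing file by file.
    return pd.concat(parts).sort_index()


# Same sample data as above, with integer column names (header=None).
df = pd.DataFrame({
    0: ["person1", "person2", "person3", "person4",
        "person2", "person3", "person5"],
    1: [24, 17, 98, 6, 166, 125, 172],
    "file": ["age.csv"] * 4 + ["Height.csv"] * 3,
})

binned = bin_per_file(df, load_edges(XML_TEXT))
print(binned)
```

Binning each file separately avoids forcing one set of edges onto columns with very different scales, at the cost of the bin column holding plain strings rather than a single categorical.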