Bin values based on ranges with pandas [duplicate]

Question

I have multiple CSV files with values like this in a folder:

The GroupID.csv is the filename. There are multiple files like this, but the value ranges are all defined in the same XML file. I'm trying to group the values by those ranges. How can I do that?

UPDATE1: Based on BobHaffner's comments, I've done this:

    import glob
    import os

    import pandas as pd

    path = r'path/to/files'
    all_files = glob.glob(os.path.join(path, '*.csv'))

    frames = []
    for file_ in all_files:
        # header=None: the files have no header row
        df = pd.read_csv(file_, index_col=None, header=None)
        # Record which file each row came from
        df['file'] = os.path.basename(file_)
        frames.append(df)

    frame = pd.concat(frames)
    print(frame)

to get something like this:

I need to group the values based on the bins from the XML file. I'd truly appreciate any help.

firelynx · Accepted Answer

In order to bucket your series, you should use the pd.cut() function, like this:

    df['bin'] = pd.cut(df['1'], [0, 50, 100, 200])

             0    1        file         bin
    0  person1   24     age.csv     (0, 50]
    1  person2   17     age.csv     (0, 50]
    2  person3   98     age.csv   (50, 100]
    3  person4    6     age.csv     (0, 50]
    4  person2  166  Height.csv  (100, 200]
    5  person3  125  Height.csv  (100, 200]
    6  person5  172  Height.csv  (100, 200]
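To try this without the CSV files, here is a minimal sketch that rebuilds the sample frame by hand from the output shown above. The integer column labels 0 and 1 are an assumption: they are what read_csv produces with header=None, whereas df['1'] in the answer presumes a string label. The extra `outside` series is only there to illustrate what happens at and beyond the bin edges.

```python
import pandas as pd

# Sample data rebuilt by hand; integer column labels 0 and 1 are an
# assumption matching read_csv(..., header=None).
df = pd.DataFrame({
    0: ["person1", "person2", "person3", "person4",
        "person2", "person3", "person5"],
    1: [24, 17, 98, 6, 166, 125, 172],
    "file": ["age.csv"] * 4 + ["Height.csv"] * 3,
})

# Edges 0, 50, 100, 200 define the intervals (0, 50], (50, 100], (100, 200].
# By default each interval is open on the left and closed on the right.
df["bin"] = pd.cut(df[1], [0, 50, 100, 200])

# Values that fall in no interval (here 0 and 250) become NaN;
# 50 sits on a closed right edge, so it lands in (0, 50].
outside = pd.cut(pd.Series([0, 50, 250]), [0, 50, 100, 200])

print(df)
print(outside)
```

If the lowest edge itself should be included, pd.cut accepts include_lowest=True.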

If you want to name the bins yourself, you can use the labels= argument, like this:

    df['bin'] = pd.cut(df['1'], [0, 50, 100, 200], labels=['0-50', '50-100', '100-200'])

             0    1        file      bin
    0  person1   24     age.csv     0-50
    1  person2   17     age.csv     0-50
    2  person3   98     age.csv   50-100
    3  person4    6     age.csv     0-50
    4  person2  166  Height.csv  100-200
    5  person3  125  Height.csv  100-200
    6  person5  172  Height.csv  100-200
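Once the values are labeled, grouping is one more step. The following is a rough sketch rather than part of the original answer: the column names `name` and `value` are hypothetical stand-ins for the unlabeled columns, and the edges are hard-coded here instead of being read from the XML file, whose structure is not shown in the question.

```python
import pandas as pd

# Hand-built sample; "name" and "value" are hypothetical column names.
df = pd.DataFrame({
    "name": ["person1", "person2", "person3", "person4",
             "person2", "person3", "person5"],
    "value": [24, 17, 98, 6, 166, 125, 172],
    "file": ["age.csv"] * 4 + ["Height.csv"] * 3,
})

edges = [0, 50, 100, 200]
# One label per interval, so len(labels) must equal len(edges) - 1.
labels = ["0-50", "50-100", "100-200"]
df["bin"] = pd.cut(df["value"], edges, labels=labels)

# Count rows per (file, bin); observed=True skips bins with no rows.
counts = df.groupby(["file", "bin"], observed=True).size()
print(counts)
```

Swapping .size() for another aggregation on the grouped `value` column, such as a mean, gives per-bin statistics instead of counts.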
