I have a column of age values that I need to convert to age ranges of 18-29, 30-39, 40-49, 50-59, 60-69, and 70+:
For an example of some of the data in df 'file', I have:
and would like to get to:
I tried the following:
file['agerange'] = file[['age']].apply(lambda x: "18-29" if (x[0] > 16
or x[0] < 30) else "other")
I would prefer not to just do a groupby since the bucket sizes aren't uniform but I'd be open to that as a solution if it works.
Thanks in advance!
It looks like you are using the pandas library. It includes a function for exactly this, pandas.cut: http://pandas.pydata.org/pandas-docs/version/0.16.0/generated/pandas.cut.html
Here's my attempt:
import pandas as pd
ages = pd.DataFrame([81, 42, 18, 55, 23, 35], columns=['age'])
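# Bin edges; 120 is an assumed upper cap for the open-ended 70+ bucket
bins = [18, 30, 40, 50, 60, 70, 120]
labels = ['18-29', '30-39', '40-49', '50-59', '60-69', '70+']
# right=False closes each bin on the left, e.g. [18, 30), so age 30 lands in '30-39'
ages['agerange'] = pd.cut(ages.age, bins, labels=labels, right=False)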
print(ages)
   age agerange
0   81      70+
1   42    40-49
2   18    18-29
3   55    50-59
4   23    18-29
5   35    30-39
Wouldn't a nested loop be the simplest solution here?
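import random
ages = [random.randint(18, 100) for _ in range(100)]
# (low, high) bounds are inclusive; the last bucket has no upper bound
age_ranges = [(18, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70,)]
for a in ages:
    for r in age_ranges:
        # <= so that boundary ages such as 29 or 39 fall in their own bucket
        if a >= r[0] and (len(r) == 1 or a <= r[1]):
            print(a, r)
            break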
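If you go the plain-Python route, the inner loop can probably be avoided. Here is a minimal sketch using the standard library's bisect module, assuming the buckets are contiguous and sorted by their lower bound. The names lower_bounds and age_range are made up for illustration; they are not from the question.

```python
from bisect import bisect_right

# Lower bound of each bucket, ascending; labels line up by position.
lower_bounds = [18, 30, 40, 50, 60, 70]
labels = ['18-29', '30-39', '40-49', '50-59', '60-69', '70+']

def age_range(age):
    """Return the bucket label for age, or None if age is below 18."""
    # bisect_right counts how many lower bounds are <= age.
    i = bisect_right(lower_bounds, age)
    return labels[i - 1] if i else None

for a in [18, 29, 30, 69, 70, 81]:
    print(a, age_range(a))
```

Each lookup is a binary search over the lower bounds, so it stays cheap even with many buckets. Anything under 18 comes back as None rather than being forced into a bucket.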