How do I calculate the average of a range from a series within a dataframe?

I'm new to Python and working with data manipulation.

I have a dataframe:

df3
Out[22]: 
                           Breed Lifespan
0         New Guinea Singing Dog       18
1                      Chihuahua       17
2                     Toy Poodle       16
3           Jack Russell Terrier       16
4                       Cockapoo       16
..                           ...      ...
201                      Whippet   12--15
202  Wirehaired Pointing Griffon   12--14
203               Xoloitzcuintle       13
204                  Yorkie--Poo       14
205            Yorkshire Terrier   14--16

As you can see above, some of the lifespans are given as a range, such as 14--16. The type of the Lifespan column is

type(df3['Lifespan'])
Out[24]: pandas.core.series.Series
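
Note that type() only reports the container class, not what the column holds. A minimal sketch for inspecting the element dtype instead, assuming a hypothetical two-row stand-in for df3 (the real frame is not reproduced here):

```python
import pandas as pd

# Hypothetical two-row frame standing in for df3 from the question.
df3 = pd.DataFrame({'Breed': ['Chihuahua', 'Whippet'],
                    'Lifespan': ['17', '12--15']})

# type() gives the container class (a pandas Series).
print(type(df3['Lifespan']))

# .dtype gives the element dtype; a column holding ranges such as
# '12--15' is stored as text, not as a numeric dtype.
print(df3['Lifespan'].dtype)

# is_numeric_dtype confirms whether arithmetic such as mean() would apply.
is_numeric = pd.api.types.is_numeric_dtype(df3['Lifespan'])
print(is_numeric)
```

A non-numeric dtype here is why the ranges have to be parsed before an average can be taken.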

I want it to reflect the average of these two numbers, i.e. 15. I do not want any ranges, just the average as a single number. How do I do this?


1 Answer

Use str.split with expand=True to split each value into separate columns, then take the row-wise mean:

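import pandas as pd

df = pd.DataFrame({'Breed': ['Dog1', 'Dog2'],
                   'Lifespan': [12, '14--15']})

# Split each value on '--' into columns, then average across each row.
# Single values leave a NaN in the second column, which mean() skips.
df['Lifespan'] = (df['Lifespan']
 .astype(str).str.split('--', expand=True)
 .astype(float).mean(axis=1)
)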

df
#   Breed   Lifespan
# 0 Dog1    12.0
# 1 Dog2    14.5
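
The chained expression above can be unpacked into its intermediate steps. This is a sketch only: the three-row sample is a hypothetical subset modelled on the question's data, and the names parts and bounds are introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample mixing single values and ranges, as in the question.
df = pd.DataFrame({'Breed': ['Chihuahua', 'Whippet', 'Yorkshire Terrier'],
                   'Lifespan': ['17', '12--15', '14--16']})

# Step 1: split on the range separator. expand=True returns a DataFrame
# with one column per part; rows without a range get a missing value
# in the second column.
parts = df['Lifespan'].astype(str).str.split('--', expand=True)

# Step 2: convert the text parts to floats; missing parts become NaN.
bounds = parts.astype(float)

# Step 3: take the row-wise mean. NaN is skipped by default, so single
# values pass through unchanged and ranges become their midpoint.
df['Lifespan'] = bounds.mean(axis=1)

print(df)
```

If whole numbers are required, the result can be passed through .round(), bearing in mind that halves are rounded to the nearest even value.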
stevemo answered Dec 04 '25 14:12
