
Extracting number of days from timedelta column in pandas

I have a DataFrame that stores aging values as shown below:

Aging
-84 days +11:36:15.000000000
-46 days +12:25:48.000000000
-131 days +20:53:45.000000000
-131 days +22:22:50.000000000
-130 days +01:02:03.000000000
-80 days +17:02:55.000000000

I am trying to extract the text before "days" in the above column. I tried the following:

df['new'] = df.Aging.split('days')[0]

The above raises:

AttributeError: 'Series' object has no attribute 'split'

Expected output:

-84
-46
-131
-131
-130
-80
asked Dec 21 '18 at 05:12 by hello kee



1 Answer

IMO, a better idea would be to convert to timedelta and extract the days component.

pd.to_timedelta(df.Aging, errors='coerce').dt.days

0    -84
1    -46
2   -131
3   -131
4   -130
5    -80
Name: Aging, dtype: int64
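
For anyone who wants to try the timedelta approach end to end, here is a minimal sketch. It assumes the Aging column holds strings in the format shown in the question; the sample values are copied from the question, and the intermediate name `aging` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question; the column is assumed to hold strings.
df = pd.DataFrame({
    "Aging": [
        "-84 days +11:36:15.000000000",
        "-46 days +12:25:48.000000000",
        "-131 days +20:53:45.000000000",
        "-131 days +22:22:50.000000000",
        "-130 days +01:02:03.000000000",
        "-80 days +17:02:55.000000000",
    ]
})

# Parse the strings into Timedelta values; entries that cannot be parsed become NaT.
aging = pd.to_timedelta(df["Aging"], errors="coerce")

# .dt.days is the whole-day component of each Timedelta.
df["new"] = aging.dt.days

print(df["new"].tolist())
```

Note that .dt.days is the floored day component, so "-84 days +11:36:15" gives -84 even though the duration itself is roughly -83.5 days. If the column already has a timedelta64 dtype, the pd.to_timedelta call should be unnecessary and df.Aging.dt.days can be used directly.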

If you insist on using string methods, you can use str.extract.

pd.to_numeric(
    df.Aging.str.extract('(.*?) days', expand=False),
    errors='coerce')

0    -84
1    -46
2   -131
3   -131
4   -130
5    -80
Name: Aging, dtype: int64

Or, using str.split:

pd.to_numeric(df.Aging.str.split(' days').str[0], errors='coerce')

0    -84
1    -46
2   -131
3   -131
4   -130
5    -80
Name: Aging, dtype: int64
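
The AttributeError in the question comes from calling split on the Series itself; string methods on a Series are reached through the .str accessor. Below is a minimal sketch of both string-based variants, again assuming the column holds strings. The column names `new_split` and `new_extract` and the stricter pattern `(-?\d+) days` are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question; the column is assumed to hold strings.
df = pd.DataFrame({
    "Aging": [
        "-84 days +11:36:15.000000000",
        "-46 days +12:25:48.000000000",
        "-131 days +20:53:45.000000000",
        "-131 days +22:22:50.000000000",
        "-130 days +01:02:03.000000000",
        "-80 days +17:02:55.000000000",
    ]
})

# df.Aging.split(...) fails because Series has no split method;
# the element-wise string version is df.Aging.str.split(...).
days_text = df["Aging"].str.split(" days").str[0]
df["new_split"] = pd.to_numeric(days_text, errors="coerce")

# Variant with str.extract: capture an optionally signed integer before " days".
# This pattern is an assumption, slightly stricter than '(.*?) days'.
extracted = df["Aging"].str.extract(r"(-?\d+) days", expand=False)
df["new_extract"] = pd.to_numeric(extracted, errors="coerce")

print(df[["new_split", "new_extract"]])
```

These variants only work on string data; if Aging is already a timedelta64 column, the .str accessor is not available and the .dt.days route is the one to use.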
answered Oct 12 '22 at 23:10 by cs95