Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to split a panda's column with conditional

Tags:

python

pandas

I have a Log df, in that df i have the column Description. Looks like.

Description
Machine x : Turn off
Another action here
Another action here
Machine y : Turn off
Machine x : Turn on
Another action here

And I need split just rows with ":"

Like:

Description               Machine           Action
Machine x : Turn off      Machine x         Turn off
Another action here
Another action here
Machine y : Turn off      Machine y         Turn off
Machine x : Turn on       Machine x         Turn on
Another action here

I already tried:

s = df["Description"].apply(lambda x:x.split(":"))
df["Action"] = s.apply(lambda x: x[1])
df["Machine"] = s.apply(lambda x: x[0])

And something with "startswith".

like image 974
Miguel Salas Avatar asked Oct 28 '25 08:10

Miguel Salas


2 Answers

You can use str.extract with a suitable regex. This will find all values around the : (also stripping the spaces around the colon):

df[['Machine', 'Action']] = df.Description.str.extract('(.*) : (.*)',expand=True)

>>> df
            Description    Machine    Action
0  Machine x : Turn off  Machine x  Turn off
1   Another action here        NaN       NaN
2   Another action here        NaN       NaN
3  Machine y : Turn off  Machine y  Turn off
4   Machine x : Turn on  Machine x   Turn on
5   Another action here        NaN       NaN

# df[['Machine', 'Action']] = df.Description.str.extract('(.*) : (.*)',expand=True).fillna('')
like image 97
sacuL Avatar answered Oct 29 '25 21:10

sacuL


Given a dataframe

>>> df
            Description
0  Machine x : Turn off
1   Another action here
2   Another action here
3  Machine y : Turn off
4   Machine x : Turn on
5   Another action here

I'd approach this via Series.str.split(splitter, expand=True).

>>> has_colon = df['Description'].str.contains(':')
>>> df[['Machine', 'Action']] = df.loc[has_colon, 'Description'].str.split('\s*:\s*', expand=True)
>>> df
            Description    Machine    Action
0  Machine x : Turn off  Machine x  Turn off
1   Another action here        NaN       NaN
2   Another action here        NaN       NaN
3  Machine y : Turn off  Machine y  Turn off
4   Machine x : Turn on  Machine x   Turn on
5   Another action here        NaN       NaN

If you prefer empty strings, you can replace the NaN cells via

>>> df.fillna('')
            Description    Machine    Action
0  Machine x : Turn off  Machine x  Turn off
1   Another action here                     
2   Another action here                     
3  Machine y : Turn off  Machine y  Turn off
4   Machine x : Turn on  Machine x   Turn on
5   Another action here 
like image 43
timgeb Avatar answered Oct 29 '25 23:10

timgeb