Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to split based on multiple delimiter pandas

I have a df column which contains

Phone number
12399422/930201021
5451354;546325642
789888744,656313214
123456654

I would like to separate it into two columns

Phone number1      Phone number2
12399422           930201021
5451354            546325642
789888744          656313214
123456654

I have tried to use apply,

    df['TELEPHONE1'] = df['TELEPHONE'].str.split(',').str.get(0)
    df['TELEPHONE2'] = df['TELEPHONE'].str.split(',').str.get(1)
    df['TELEPHONE1'] = df['TELEPHONE'].str.split(';').str.get(0)
    df['TELEPHONE2'] = df['TELEPHONE'].str.split(';').str.get(1)
    df['TELEPHONE1'] = df['TELEPHONE'].str.split('/').str.get(0)
    df['TELEPHONE2'] = df['TELEPHONE'].str.split('/').str.get(1)

but it can only split '/' Thank you for your help

like image 339
Platalea Minor Avatar asked Dec 14 '22 13:12

Platalea Minor


2 Answers

Using str.split

df.Phonenumber.str.split('/|;|,',expand=True).add_prefix('Number_')
    Number_0   Number_1
0   12399422  930201021
1    5451354  546325642
2  789888744  656313214
3  123456654       None
like image 190
BENY Avatar answered Dec 22 '22 17:12

BENY


Just split on \D (non-digit):

(df['Phone number'].str.split(r'\D', expand=True)
    .rename(columns=lambda x: f'num{x+1}'))

        num1       num2
0   12399422  930201021
1    5451354  546325642
2  789888744  656313214
3  123456654       None

...Under the assumption that phone numbers only contain digits.


Alternatively, use extractall and match digits only:

df['Phone number'].str.extractall(r'(\d+)')[0].unstack()

match          0          1
0       12399422  930201021
1        5451354  546325642
2      789888744  656313214
3      123456654        NaN
like image 34
cs95 Avatar answered Dec 22 '22 17:12

cs95