Extracting particular characters/text from a DataFrame column

I am trying to extract the email provider from the mail column of a DataFrame and store it in a new column named "Mail_Provider". For example, taking gmail from [email protected] and storing it in the "Mail_Provider" column. I would also like to extract the country ISD code from the Phone column and create a new column for it. Is there a simpler, more direct method than regex?

import pandas as pd

data = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "mail": ["[email protected]", "[email protected]", "[email protected]"],
    "Adress": ["Adress1", "Adress2", "Adress3"],
    "Phone": ["+91-1234567890", "+88-0987654321", "+27-2647589201"],
})

Table

Name  mail               Adress   Phone
A     [email protected]  Adress1  +91-1234567890
B     [email protected]  Adress2  +88-0987654321
C     [email protected]  Adress3  +27-2647589201

Expected result:

Name  mail               Adress   Phone           Mail_Provider  ISD
A     [email protected]  Adress1  +91-1234567890  gmail          91
B     [email protected]  Adress2  +88-0987654321  yahoo          88
C     [email protected]  Adress3  +27-2647589201  gmail          27

asked Jul 30 '19 14:07

Devesh

2 Answers

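Regex is fairly simple for both cases:

# capture the word characters between "@" and the next "."
data['Mail_Provider'] = data['mail'].str.extract(r'@(\w+)\.', expand=False)

# capture the digits between "+" and "-"
data['ISD'] = data['Phone'].str.extract(r'\+(\d+)-', expand=False)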

If you really want to avoid regex, @Eva's answer would be the way to go.
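The answer referred to above is not reproduced on this page. As a rough sketch of what a regex-free version might look like (not necessarily that author's method), plain string splitting through the `.str` accessor is enough for data shaped like the question's. The mail addresses below are made up for illustration, since the originals are not recoverable.

```python
import pandas as pd

# Hypothetical sample data: the mail values are invented for illustration.
data = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "mail": ["a@gmail.com", "b@yahoo.com", "c@gmail.com"],
    "Phone": ["+91-1234567890", "+88-0987654321", "+27-2647589201"],
})

# Keep the part after "@", then the part before the first "."
data["Mail_Provider"] = data["mail"].str.split("@").str[1].str.split(".").str[0]

# Keep the part before the first "-", then drop the leading "+"
data["ISD"] = data["Phone"].str.split("-").str[0].str.lstrip("+")

print(data[["Name", "Mail_Provider", "ISD"]])
```

This assumes every address contains exactly one "@" and every phone number has the form "+<code>-<number>". Rows that break that shape would need separate handling.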

answered Sep 21 '22 07:09

Quang Hoang

Mixed approach (regex and simple slicing):

In [693]: df['Mail_Provider'] = df['mail'].str.extract('@([^.]+)')

In [694]: df['ISD'] = df['Phone'].str[1:3]

In [695]: df
Out[695]: 
  Name         mail   Adress           Phone Mail_Provider ISD
0    A  [email protected]  Adress1  +91-1234567890         gmail  91
1    B  [email protected]  Adress2  +88-0987654321         yahoo  88
2    C  [email protected]  Adress3  +27-2647589201         gmail  27
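Note that the slice `str[1:3]` works here because every country code in the sample has exactly two digits. The sketch below, using made-up phone numbers with one-, two- and three-digit codes, compares the fixed slice with a pattern that reads up to the "-" separator.

```python
import pandas as pd

# Hypothetical phone numbers with 1-, 2- and 3-digit country codes
phones = pd.Series(["+1-2025550123", "+91-1234567890", "+880-1712345678"])

# Fixed slice: always the two characters after the "+"
fixed_slice = phones.str[1:3]

# Pattern: all digits between "+" and the first "-"
by_pattern = phones.str.extract(r"\+(\d+)-", expand=False)

print(fixed_slice.tolist())  # ['1-', '91', '88']
print(by_pattern.tolist())   # ['1', '91', '880']
```

If the data is known to contain only two-digit codes, the slice is the simpler choice. Otherwise, reading up to the separator is safer.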

answered Sep 23 '22 07:09

RomanPerekhrest

Tags: python, string, pandas, dataframe
