Python pandas Removing UserWarning and looping efficiently

Let's say I have code similar to this:

import pandas as pd

df=pd.DataFrame({'Name': [ 'Jay Leno', 'JayLin', 'Jay-Jameson', 'LinLeno', 'Lin Jameson', 'Python Leno', 'Python Lin', 'Python Jameson', 'Lin Jay', 'Python Monte'],
                 'Class': ['Rat','L','H','L','L','H', 'H','L','L','Circus']})
df['status']=''

pattern1=['^Jay(\s|-)?(Leno|Lin|Jameson)$','^Python(\s|-)?(Jay|Leno|Lin|Jameson|Monte)$','^Lin(\s|-)?(Leno|Jay|Jameson|Monte)$' ]
pattern2=['^Python(\s|-)?(Jay|Leno|Lin|Jameson|Monte)$' ]
pattern3=['^Lin(\s|-)?(Leno|Jay|Jameson|Monte)$' ]

for i in range(len(pattern1)):
    df.loc[df.Name.str.contains(pattern1[i]),'status'] = 'A'

for i in range(len(pattern2)):
    df.loc[df.Name.str.contains(pattern2[i]),'status'] = 'B'

for i in range(len(pattern3)):
    df.loc[df.Name.str.contains(pattern3[i]),'status'] = 'C'

print (df)

Which prints:

C:\Python33\lib\site-packages\pandas\core\strings.py:184: UserWarning: This pattern has match groups. To actually get the groups, use str.extract.
  " groups, use str.extract.", UserWarning)
    Class            Name status
0     Rat        Jay Leno      A
1       L          JayLin      A
2       H     Jay-Jameson      A
3       L         LinLeno      C
4       L     Lin Jameson      C
5       H     Python Leno      B
6       H      Python Lin      B
7       L  Python Jameson      B
8       L         Lin Jay      C
9  Circus    Python Monte      B

[10 rows x 3 columns]

My questions are: how do I remove the warning, and is there a way to loop through the patterns more efficiently with less code? I know there is something called list comprehensions, but I am confused about how to use them.

I know some pandas warnings can be suppressed with

pd.options.mode.chained_assignment = None
ccsv asked Dec 25 '22 12:12



1 Answer

Use non-capturing parentheses (?:...):

# Raw strings (r'...') keep Python from treating \s as a string escape
pattern1=[r'^Jay(?:\s|-)?(?:Leno|Lin|Jameson)$',r'^Python(?:\s|-)?(?:Jay|Leno|Lin|Jameson|Monte)$',r'^Lin(?:\s|-)?(?:Leno|Jay|Jameson|Monte)$' ]
pattern2=[r'^Python(?:\s|-)?(?:Jay|Leno|Lin|Jameson|Monte)$' ]
pattern3=[r'^Lin(?:\s|-)?(?:Leno|Jay|Jameson|Monte)$' ]

The warning comes from this code:

    if regex.groups > 0:
        warnings.warn("This pattern has match groups. To actually get the"
                      " groups, use str.extract.", UserWarning)
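
The `groups` attribute of a compiled pattern counts capturing groups only, which can be checked with the standard `re` module. This is a small sketch using one of the patterns above; the variable names are made up for illustration:

```python
import re

capturing = r'^Jay(\s|-)?(Leno|Lin|Jameson)$'
non_capturing = r'^Jay(?:\s|-)?(?:Leno|Lin|Jameson)$'

# Pattern.groups is the number of capturing groups in the pattern
n_capturing = re.compile(capturing).groups
n_non_capturing = re.compile(non_capturing).groups

# Both forms accept and reject the same names; only the group count differs
names = ['Jay Leno', 'JayLin', 'Jay-Jameson', 'LinLeno']
same = all(
    bool(re.search(capturing, n)) == bool(re.search(non_capturing, n))
    for n in names
)

print(n_capturing, n_non_capturing, same)  # 2 0 True
```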

So as long as the pattern has no capturing groups, there is no warning.
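
As for looping with less code, one possible approach (a sketch, not the only way) is to keep the status labels and their patterns in a single list and loop over that once. The name `rules` is hypothetical, and the patterns for each status are joined with `|` so that `str.contains` runs once per status. The `'|'.join(...)` call uses a generator expression, which is written the same way as a list comprehension:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jay Leno', 'JayLin', 'Jay-Jameson', 'LinLeno', 'Lin Jameson',
             'Python Leno', 'Python Lin', 'Python Jameson', 'Lin Jay',
             'Python Monte'],
    'Class': ['Rat', 'L', 'H', 'L', 'L', 'H', 'H', 'L', 'L', 'Circus'],
})
df['status'] = ''

# (status, patterns) pairs; later entries overwrite earlier ones,
# which mirrors the order of the three loops in the question
rules = [
    ('A', [r'^Jay(?:\s|-)?(?:Leno|Lin|Jameson)$',
           r'^Python(?:\s|-)?(?:Jay|Leno|Lin|Jameson|Monte)$',
           r'^Lin(?:\s|-)?(?:Leno|Jay|Jameson|Monte)$']),
    ('B', [r'^Python(?:\s|-)?(?:Jay|Leno|Lin|Jameson|Monte)$']),
    ('C', [r'^Lin(?:\s|-)?(?:Leno|Jay|Jameson|Monte)$']),
]

for status, patterns in rules:
    # Wrap each pattern in a non-capturing group and join with "or"
    combined = '|'.join('(?:{})'.format(p) for p in patterns)
    df.loc[df['Name'].str.contains(combined), 'status'] = status

print(df)
```

This should assign the same status to each row as the original three loops, since the rules are applied in the same order.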

unutbu answered Jan 19 '23 00:01
