Making the replace case insensitive does not seem to have an effect in the following example (I want to replace jr. or Jr. with jr):
In [0]: pd.Series('Jr. eng').str.replace('jr.', 'jr', regex=False, case=False)
Out[0]: 0 Jr. eng
Why? What am I misunderstanding?
The case argument is actually a convenience, an alternative to specifying flags=re.IGNORECASE. It has no bearing on replacement if the replacement is not regex-based.
So, when regex=True, these are your possible choices:
pd.Series('Jr. eng').str.replace(r'jr\.', 'jr', regex=True, case=False)
# On pandas < 2.0, where regex defaulted to True, regex=True could be omitted:
# pd.Series('Jr. eng').str.replace(r'jr\.', 'jr', case=False)
0 jr eng
dtype: object
Or,
import re
pd.Series('Jr. eng').str.replace(r'jr\.', 'jr', regex=True, flags=re.IGNORECASE)
# With the pre-2.0 default of regex=True:
# pd.Series('Jr. eng').str.replace(r'jr\.', 'jr', flags=re.IGNORECASE)
0 jr eng
dtype: object
You can also get cheeky and bypass both the case and flags keyword arguments by putting the case-insensitivity flag directly in the pattern as (?i):
pd.Series('Jr. eng').str.replace(r'(?i)jr\.', 'jr', regex=True)
0 jr eng
dtype: object
Note
You will need to escape the period (\.) in regex mode, because the unescaped dot is a metacharacter with a different meaning (match any character). If you want to dynamically escape metacharacters in patterns, you can use re.escape.
For more information on flags and anchors, see this section of the docs and the re HOWTO.
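As a minimal sketch of the re.escape suggestion above (the sample Series and the variable names literal, pattern, result and naive are made up for illustration), the literal text is escaped first and then passed as a regex with case=False:

```python
import re
import pandas as pd

# Illustrative data: the last entry has no period after "Jr".
s = pd.Series(['Jr. eng', 'JR. dev', 'Jrx eng'])

literal = 'jr.'                # the text we want to match literally
pattern = re.escape(literal)   # the dot is escaped, so it only matches a real period

# regex=True is passed explicitly so that case=False is honoured.
result = s.str.replace(pattern, 'jr', regex=True, case=False)
print(result.tolist())   # ['jr eng', 'jr dev', 'Jrx eng']

# For contrast: the unescaped dot matches any character, so "Jrx" is rewritten too.
naive = s.str.replace(literal, 'jr', regex=True, case=False)
print(naive.tolist())    # ['jr eng', 'jr dev', 'jr eng']
```

Escaping matters most when the search text comes from user input or another column, where metacharacters cannot be checked by eye.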
From the source code, it is clear that the "case" argument is ignored if regex=False. See:
# Check whether repl is valid (GH 13438, GH 15055)
if not (is_string_like(repl) or callable(repl)):
    raise TypeError("repl must be a string or callable")

is_compiled_re = is_re(pat)
if regex:
    if is_compiled_re:
        if (case is not None) or (flags != 0):
            raise ValueError("case and flags cannot be set"
                             " when pat is a compiled regex")
    else:
        # not a compiled regex
        # set default case
        if case is None:
            case = True

        # add case flag, if provided
        if case is False:
            flags |= re.IGNORECASE
    if is_compiled_re or len(pat) > 1 or flags or callable(repl):
        n = n if n >= 0 else 0
        compiled = re.compile(pat, flags=flags)
        f = lambda x: compiled.sub(repl=repl, string=x, count=n)
    else:
        f = lambda x: x.replace(pat, repl, n)
You can see the case argument is only checked inside the if statement.
IOW, the only way is to ensure regex=True so that replacement is regex-based.
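To see that branching in isolation, here is a rough pure-Python sketch that mimics the quoted excerpt. build_replacer is a hypothetical stand-in written for this answer, not a pandas function, and it leaves out the compiled-pattern checks; it only reflects the source as quoted, so other pandas releases may behave differently.

```python
import re

def build_replacer(pat, repl, case=None, flags=0, regex=True, n=-1):
    """Hypothetical, simplified imitation of the dispatch quoted above."""
    if regex:
        # case is only looked at on the regex path
        if case is None:
            case = True
        if case is False:
            flags |= re.IGNORECASE
        if len(pat) > 1 or flags or callable(repl):
            count = n if n >= 0 else 0
            compiled = re.compile(pat, flags=flags)
            return lambda x: compiled.sub(repl, x, count=count)
    # Literal path: plain str.replace, case and flags are never consulted
    return lambda x: x.replace(pat, repl, n)

literal = build_replacer('jr.', 'jr', case=False, regex=False)
regex_based = build_replacer(r'jr\.', 'jr', case=False, regex=True)

print(literal('Jr. eng'))      # Jr. eng  (case=False silently ignored)
print(regex_based('Jr. eng'))  # jr eng
```

The first call reproduces the behaviour from the question: with regex=False the function falls straight through to str.replace, which is always case sensitive.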