
Regex - Replace \\n and \n in string by <br> but not \\\\n

Tags: python, regex

I am trying to replace all \\n and \n in a string with <br>, but I don't want \\\\n to be replaced.

I'm trying the following expression with a negative lookbehind, but it does not return the correct result because every "n" and "\" is replaced:

import re
string = "this is a sample\\nstring\\\\ncontaining\nall cases\n\\n\n!"
re.sub(r"(?<!\\)([\\n\n]+)", r"<br>", string)
>>'this is a sample<br>stri<br>g<br>co<br>tai<br>i<br>g<br>all cases<br>!'

Expected output:

"this is a sample<br>string\\\\ncontaining<br>all cases<br><br><br>!"
Below the Radar asked Feb 23 '18 14:02


2 Answers

This will do the magic:

re.sub(r"(?<!\\)\\n|\n", "<br>", string)

Note: this replaces line breaks ("\n") and escaped line breaks ("\\n" or r"\n"). It does not replace "\\\\n" (or r"\\n"). "\\\n" (backslash + newline) becomes "\\<br>".
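
As a quick check, the sketch below runs this pattern on the string from the question and on the backslash-plus-newline case from the note. The variable names are illustrative only.

```python
import re

# Pattern from the answer: a backslash + "n" that is not preceded by
# another backslash, or a real newline character.
pattern = re.compile(r"(?<!\\)\\n|\n")

# Regular (non-raw) literal, as in the question:
# "\\n" is backslash + n, "\\\\n" is two backslashes + n, "\n" is a newline.
string = "this is a sample\\nstring\\\\ncontaining\nall cases\n\\n\n!"
result = pattern.sub("<br>", string)
print(result)
# prints: this is a sample<br>string\\ncontaining<br>all cases<br><br><br>!

# Backslash followed by a real newline: only the newline is replaced.
edge = pattern.sub("<br>", "\\\n")
print(edge)
# prints: \<br>
```

Note that print shows the actual characters, so the two backslashes kept in the output correspond to "\\\\n" in the source literal.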

Maybe, what you really want is:

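re.sub(r"(?<!\\)((?:\\\\)*)\\n|\n", r"\1<br>", string)

This replaces all newlines and every escaped n (r"\n"). r"\\n" is not replaced, because there the backslash itself is escaped. r"\\\n" (escaped backslash + escaped n) is replaced again and becomes r"\\<br>". The replacement has to be a raw string, since a plain "\1" is the control character "\x01" rather than a backreference, and the group wraps the whole run of escaped backslashes so that \1 restores all of them.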
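
The cases listed above can be checked with a small sketch. The helper name to_br is hypothetical, and Python 3.5 or later is assumed, where a group that did not take part in the match (the real-newline branch) is substituted as an empty string. The pattern here wraps the whole run of escaped backslashes in the capturing group so that \1 restores all of them.

```python
import re

# Escaped backslashes (pairs) are captured as a whole and kept;
# the backslash + "n" that follows them, or a real newline, becomes <br>.
pattern = re.compile(r"(?<!\\)((?:\\\\)*)\\n|\n")

def to_br(text):
    # The replacement is a raw string: a plain "\1" would be the control
    # character "\x01" rather than a backreference to group 1.
    return pattern.sub(r"\1<br>", text)

print(to_br(r"\n"))     # escaped n                       -> <br>
print(to_br(r"\\n"))    # escaped backslash, plain n      -> \\n (unchanged)
print(to_br(r"\\\n"))   # escaped backslash, escaped n    -> \\<br>
print(to_br("a\nb"))    # real newline                    -> a<br>b
```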

dercolamann answered Oct 28 '22 10:10


Your regex uses the character class [\\n\n], which matches any single \, n, or newline character. Your lookbehind logic is correct; you just need to replace the character class with a different subpattern: \\{1,2}n.


(?<!\\)\\{1,2}n
  • (?<!\\) Negative lookbehind ensuring what precedes is not \
  • \\{1,2} Match \ once or twice
  • n Match this literally

Replacement: <br>

Alternative: (?<!\\)\\\\?n as provided by @revo in the comments below the question
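
A minimal sketch of the two points made so far: a character class matches one character at a time, and the two suggested patterns behave the same on the sample string. The variable names are illustrative, and the sample is written as a raw string, so every backslash in it is a literal character.

```python
import re

sample = r"this is a sample\\nstring\\\\ncontaining\nall cases\n\\n\n!"

# A character class matches ONE character out of a set, so [\\n\n] accepts
# a lone backslash, a lone "n", or a lone newline.
singles = re.findall(r"[\\n\n]", "an\\")
print(singles)  # ['n', '\\']

# Both suggested patterns mean: one backslash, an optional second
# backslash, then "n", with no backslash directly before the match.
quantified = re.sub(r"(?<!\\)\\{1,2}n", "<br>", sample)
optional = re.sub(r"(?<!\\)\\\\?n", "<br>", sample)
print(quantified == optional)  # True
```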


Usage in code


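import re

# One or two backslashes followed by "n", not preceded by another backslash.
r = re.compile(r"(?<!\\)\\{1,2}n")
# Raw string: every backslash below is a literal character, so it contains no real newlines.
s = r"this is a sample\\nstring\\\\ncontaining\nall cases\n\\n\n!"
print(r.sub("<br>", s))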

Result: this is a sample<br>string\\\\ncontaining<br>all cases<br><br><br>!
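
The result above depends on s being a raw string, whereas the question defines its string with a regular literal. The sketch below applies the same pattern to both spellings of a shortened sample to show the difference. The variable names are illustrative.

```python
import re

pattern = re.compile(r"(?<!\\)\\{1,2}n")

# Raw literal (as in this answer): "\n" is a backslash followed by "n".
raw = r"sample\\nstring\\\\ncontaining\nall"
# Regular literal (as in the question): "\\n" is one backslash + "n",
# "\\\\n" is two backslashes + "n", and "\n" is a real newline character.
regular = "sample\\nstring\\\\ncontaining\nall"

from_raw = pattern.sub("<br>", raw)
from_regular = pattern.sub("<br>", regular)

print(from_raw)            # sample<br>string\\\\ncontaining<br>all
print(repr(from_regular))  # 'sample<br>string<br>containing\nall'
```

With the regular literal, the two-backslash sequence is replaced and the real newline is left alone. If the input really contains newline characters, a pattern with an explicit \n alternative that allows only a single backslash, such as (?<!\\)\\n|\n, appears closer to the expected output in the question.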

ctwheels answered Oct 28 '22 10:10