Regular expression with carriage return

I'm trying to write a regular expression to extract the value that follows the word 'Total', but I'm not sure how to handle the carriage return, which means I'm searching across two separate lines. Does anyone know the best way to approach this?

Taxes&Charges↵ ↵ £ 35.97↵ ↵ Total↵ £ 198.98↵ ↵ £ 35.97↵ ↵ ↵ Total↵ £ 333.98 
cuthbert asked Apr 03 '11 22:04



2 Answers

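In a regex, \r matches a carriage return and \n matches a line feed. A Windows-style line break is the two-character sequence \r\n, so put \r\n between 'Total' and the value (or \r?\n, to also accept a bare \n).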
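As a rough illustration, here is a minimal Python sketch (the question does not name a language, so Python is an assumption). The sample string is also an assumption: it is modeled on the question's data, with \r\n standing in for each ↵ and no extra spaces around the line breaks.

```python
import re

# Assumed sample: the question's text, with "\r\n" standing in for each line break.
text = (
    "Taxes&Charges\r\n\r\n£ 35.97\r\n\r\nTotal\r\n£ 198.98\r\n\r\n"
    "£ 35.97\r\n\r\n\r\nTotal\r\n£ 333.98"
)

# "Total", an optional carriage return, a line feed, then the amount.
# The capture group keeps only the number after the pound sign.
pattern = re.compile(r"Total\r?\n£\s?([\d.]+)")

totals = pattern.findall(text)
print(totals)  # ['198.98', '333.98']

# The same pattern also works if the text uses bare "\n" line breaks.
unix_totals = pattern.findall(text.replace("\r\n", "\n"))
print(unix_totals)  # ['198.98', '333.98']
```

Making the \r optional is a design choice that keeps the pattern independent of where the text came from.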

pcofre answered Oct 13 '22 04:10


You should use the regex option that makes the dot match newlines (if your engine supports it).

For example, in .NET you can use RegexOptions.Singleline. It enables single-line mode, which changes the meaning of the dot (.) so that it matches every character, instead of every character except \n.
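To see the effect of a dot-matches-newline option, here is a small Python sketch, where the equivalent flag is re.DOTALL. The pattern and the two-line sample are illustrative assumptions, not taken from the question.

```python
import re

# Assumed two-line sample with a Windows-style line break.
text = "Total\r\n£ 198.98"

# Without DOTALL, "." cannot match the line feed, so the pattern cannot span lines.
no_flag = re.search(r"Total.+?([\d.]+)", text)

# With DOTALL, "." also matches "\n", so the match crosses the line break.
with_flag = re.search(r"Total.+?([\d.]+)", text, re.DOTALL)

print(no_flag)             # None
print(with_flag.group(1))  # 198.98
```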

The following expression:

Regex ex = new Regex(@"(?<=Total\r\n)£\s?[\d.]+", RegexOptions.Singleline);

will match the values £ 198.98 and £ 333.98 from your test example, assuming the line breaks in the text are \r\n.
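For comparison, the same lookbehind pattern can be tried in Python. This is only a sketch: the sample string is an assumption modeled on the question's data, with \r\n for each line break and no extra spaces around it.

```python
import re

# Assumed sample: the question's text, with "\r\n" standing in for each line break.
text = (
    "Taxes&Charges\r\n\r\n£ 35.97\r\n\r\nTotal\r\n£ 198.98\r\n\r\n"
    "£ 35.97\r\n\r\n\r\nTotal\r\n£ 333.98"
)

# Python's re module only allows fixed-width lookbehind; "Total\r\n" qualifies.
pattern = re.compile(r"(?<=Total\r\n)£\s?[\d.]+")
matches = pattern.findall(text)
print(matches)  # ['£ 198.98', '£ 333.98']

# The pattern contains no ".", so a dot-matches-newline flag does not change the result.
matches_dotall = re.findall(r"(?<=Total\r\n)£\s?[\d.]+", text, re.DOTALL)
print(matches_dotall)  # ['£ 198.98', '£ 333.98']
```

The lookbehind keeps 'Total' out of the match itself, so each result is just the amount. The £ 35.97 entries are skipped because they are not directly preceded by 'Total' and a line break.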

Oleks answered Oct 13 '22 04:10