I'm trying to use a regexp with sed. I've tested my regex with kiki, a GNOME application for testing regexps, and it works in kiki.
date: 2010-10-29 14:46:33 -0200; author: 00000000000; state: Exp; lines: +5 -2; commitid: bvEcb00aPyqal6Uu;
I want to replace author: 00000000000; with nothing. So I created this regexp, which works when I test it in kiki:
author:\s[0-9]{11};
But it doesn't work when I test it in sed.
sed -i "s/author:\s[0-9]{11};//g" /tmp/test_regex.txt
I know regex has different implementations, and this could be the issue. My question is: how do I at least try to "debug" what's happening with sed? Why is it not working?
My version of sed doesn't like the {11} bit. Processing the line with:
sed 's/author: [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9];//g'
works fine.
And the way I debug it is exactly what I did here. I just constructed a command:
echo 'X author: 00000000000; X' | sed ...
and removed the more advanced regex things one at a time:
<space> instead of \s, didn't fix it.
[0-9]{11} replaced with 11 copies of [0-9], that worked.
It pretty much had to be one of those, since I've used every other feature of your regex before with sed successfully.
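The elimination process described above can be sketched as a small script that runs each candidate expression against the same sample line and reports whether anything changed. The `try` helper and the list of candidates are illustrative, not part of the original answer; `\s` is a GNU sed extension, so that candidate may behave differently on other implementations.

```shell
#!/bin/sh
# Sample input, modeled on the test line used in this answer.
sample='X author: 00000000000; X'

# Hypothetical helper: run one sed expression against the sample
# and report whether the output differs from the input.
try() {
    result=$(printf '%s\n' "$sample" | sed "$1")
    if [ "$result" = "$sample" ]; then
        printf 'no match: %s\n' "$1"
    else
        printf 'matched:  %s -> %s\n' "$1" "$result"
    fi
}

# Start from the original expression, then swap out one feature at a time.
try 's/author:\s[0-9]{11};//g'      # original: \s and unescaped braces
try 's/author: [0-9]{11};//g'       # literal space instead of \s
try 's/author: [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9];//g'
try 's/author: [0-9]\{11\};//g'     # escaped braces
```

The first candidate that reports a match points at the feature that was removed just before it.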
But, in fact, this will actually work without the hideous 11 copies of [0-9]; you just have to escape the braces: [0-9]\{11\}. I have to admit I didn't get around to trying that since it worked okay with the multiples and I generally don't concern myself too much with brevity in sed since I tend to use it more for quick'n'dirty jobs :-)
But the brace method is a lot more concise and adaptable and it's good to know how to do it.
In sed you need to escape the curly braces. "s/author:\s[0-9]\{11\};//g" should work.
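For comparison, here is a sketch of both spellings. By default sed uses POSIX basic regular expressions, where interval braces are written `\{11\}`; with the `-E` option (supported by GNU and BSD sed) it uses extended regular expressions, where a bare `{11}` works as it does in most other tools. The sample line is shortened from the one in the question, and `[[:space:]]` stands in for `\s`, which is a GNU extension.

```shell
#!/bin/sh
line='date: 2010-10-29 14:46:33 -0200; author: 00000000000; state: Exp;'

# Basic regular expressions (the default): braces are escaped.
bre=$(printf '%s\n' "$line" | sed 's/author:[[:space:]][0-9]\{11\};//g')

# Extended regular expressions (-E): braces are written bare.
ere=$(printf '%s\n' "$line" | sed -E 's/author:[[:space:]][0-9]{11};//g')

printf '%s\n' "$bre"
printf '%s\n' "$ere"
```

Both variables should hold the line with the author field removed. Once the expression behaves as expected on a pipe, it can go back into the `sed -i` command.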
Sed has traditionally had no debug capability (recent GNU sed versions add a --debug option). To test, simplify the expression at the command line iteratively until you get something to work, and then build back up.
command line input:
$ echo 'xx a: 00123 b: 5432' | sed -e 's/a:\s[0-9]\{5\}//'
command line output:
xx b: 5432