I've got hundreds of files with thousands of lines each, and I need to delete the lines that follow a pattern, so I turned to sed with a regex. The structure of the files is something like this:
A,12121212121212,foo,bar,lorem
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
A,21212121212121,foo,bar,lorem
C,32JL,JL
C,32JL,JL
C,32JL,JL
A,9999,88888,77777
I need to delete all the lines that start with "A" and end with "lorem".
Expected output:
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
A,9999,88888,77777
I've written this regex:
^(A).*(lorem)
It matches in my text editors (Sublime, UltraEdit).
In the UNIX shell I ran:
sed '/^(A).*(lorem)/d' file.txt
But somehow it doesn't work: it prints the whole file, and I can't figure out why.
Can someone help me, please?
Sed Command to Delete Lines: The sed command can delete lines that match a given pattern or that sit at a particular position in a file. Here we will see how to delete lines using the sed command with various examples.
To delete a line, we'll use the sed “d” command. Note that you have to give an address that says which lines to delete. Otherwise, sed will delete all the lines.
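As a small illustration of that point, the sketch below runs d with no address, with a line number, and with a pattern. The three-line input is made up for the example.

```shell
# No address: d applies to every line, so nothing is printed.
all_deleted=$(printf 'one\ntwo\nthree\n' | sed 'd')

# Line-number address: only line 2 is deleted.
second_deleted=$(printf 'one\ntwo\nthree\n' | sed '2d')

# Pattern address: every line starting with "t" is deleted.
pattern_deleted=$(printf 'one\ntwo\nthree\n' | sed '/^t/d')

printf 'no address:   [%s]\n' "$all_deleted"
printf 'line 2:       [%s]\n' "$second_deleted"
printf 'pattern /^t/: [%s]\n' "$pattern_deleted"
```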
The sed command has a long list of supported operations that ease the process of editing text files. It lets users apply expressions that are commonly used in programming languages; one of the core supported kinds is the regular expression (regex).
The N command appends the next input line to the pattern space. A following d deletes the entire pattern space, which then contains both the current and the next line. Using the substitution command s instead, we delete from the newline character to the end, which effectively deletes only the line after the one containing the pattern Unix.
$ sed '/^A.*lorem$/d' file.txt
^A : starts with an A
.* : stuff in the middle
lorem$ : ends with lorem
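To check this against the data in the question, here is a minimal sketch that recreates a shortened version of the sample in a temporary file and runs the command on it. The temporary file and the variable names are only for the example.

```shell
# Recreate a shortened version of the sample from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
A,12121212121212,foo,bar,lorem
C,32JL,JL
C,32JL,JL
A,21212121212121,foo,bar,lorem
C,32JL,JL
A,9999,88888,77777
EOF

# Delete every line that starts with A and ends with lorem.
result=$(sed '/^A.*lorem$/d' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The last line starts with A but does not end with lorem, so it is kept along with the C lines.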
The others gave you correct solutions but didn't explain why your regex didn't work. The () were certainly unnecessary, but if you had used the regex with other tools or languages, you might very well have gotten the expected result.
It didn't work with sed because sed uses POSIX basic regular expressions by default, where the characters for grouping are \( and \), while ( and ) match literal characters. There were no such parentheses in your input text, so it didn't match.
Your regular expression would have worked if you had used GNU's sed -r or BSD's sed -E, the flag that switches to POSIX extended regular expressions, where ( and ) are used to group and \( and \) match the literal parentheses.
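You can see the difference for yourself with a quick experiment. The sketch below feeds made-up one-line inputs to the original regex, first in the default mode and then with -E, assuming a sed that accepts -E (BSD sed and current GNU sed do).

```shell
# Default (basic) mode: ( and ) are literal, so a line without
# parentheses does not match and is printed unchanged.
bre_plain=$(printf 'A,1,lorem\n' | sed '/^(A).*(lorem)/d')

# Same regex, but the input really contains parentheses: it matches
# and the line is deleted.
bre_literal=$(printf '(A),1,(lorem)\n' | sed '/^(A).*(lorem)/d')

# Extended mode: ( and ) group, so the plain line matches and is deleted.
ere_plain=$(printf 'A,1,lorem\n' | sed -E '/^(A).*(lorem)/d')

printf 'basic, plain input:         [%s]\n' "$bre_plain"
printf 'basic, parentheses in input: [%s]\n' "$bre_literal"
printf 'extended, plain input:      [%s]\n' "$ere_plain"
```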
In conclusion, the following commands will do the same thing :
sed '/^A.*lorem$/d' file.txt
sed -r '/^(A).*(lorem)$/d' file.txt (with GNU sed)
sed -E '/^(A).*(lorem)$/d' file.txt (with BSD sed and modern GNU sed)
sed '/^\(A\).*\(lorem\)$/d' file.txt
Remove the parentheses.
Using your code, the appropriate one-liner becomes:
sed '/^A.*lorem/d' file.txt
If you want to be more rigorous, you can look at James's answer, which more correctly anchors the end of the regex:
sed '/^A.*lorem$/d' file.txt
Both will work on your sample, but only the anchored version requires lorem to be the last thing on the line.
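Since the question mentions hundreds of files, here is one possible sketch for editing them in place with a loop. It assumes your sed supports the non-POSIX -i option with an attached backup suffix (GNU and BSD sed both accept the -i.bak form); the temporary directory, file names, and contents are made up for the demonstration.

```shell
# Build a throwaway directory with two small example files.
workdir=$(mktemp -d)
printf 'A,1,foo,bar,lorem\nC,32JL,JL\n' > "$workdir/one.txt"
printf 'C,32JL,JL\nA,9999,88888,77777\n' > "$workdir/two.txt"

# Edit each file in place, keeping the original as <name>.bak.
for f in "$workdir"/*.txt; do
    sed -i.bak '/^A.*lorem$/d' "$f"
done

one=$(cat "$workdir/one.txt")
two=$(cat "$workdir/two.txt")
one_backup=$(cat "$workdir/one.txt.bak")
printf 'one.txt after edit:\n%s\n' "$one"
printf 'two.txt after edit:\n%s\n' "$two"

rm -rf "$workdir"
```

Keeping the .bak copies until you have checked a few results is a cheap safeguard when the same command runs over many files.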