
SED to remove a Line with REGEX Pattern

Tags: regex, bash, unix, sed

I have hundreds of files with thousands of lines each, and I need to delete the lines that follow a certain pattern, so I turned to sed with a regex. The files are structured something like this:

A,12121212121212,foo,bar,lorem
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
A,21212121212121,foo,bar,lorem
C,32JL,JL
C,32JL,JL
C,32JL,JL
A,9999,88888,77777

I need to delete all the lines that start with "A" and end with "lorem".

Expected output:

C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
C,32JL,JL
A,9999,88888,77777

I wrote this regex:

^(A).*(lorem)

It matches in my text editors (Sublime, UltraEdit).

In the UNIX shell I ran:

sed '/^(A).*(lorem)/d' file.txt

But somehow it doesn't work: it prints the whole file, and I can't figure out why.

Can someone help me please?

Imkls asked Oct 25 '16 13:10



3 Answers

$ sed '/^A.*lorem$/d' file.txt
  • ^A: starts with an A
  • .*: stuff in the middle
  • lorem$: ends with lorem
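Since the question mentions hundreds of files, here is a minimal sketch of applying the same command to several files at once. The temporary directory, the file names one.txt and two.txt, and their contents are made up for illustration. It writes to a temporary file and renames it instead of using sed -i, because the -i option takes different arguments in GNU and BSD sed.

```shell
# Hypothetical sample files in a throwaway directory (not from the question)
dir=$(mktemp -d)
printf 'A,1,foo,bar,lorem\nC,32JL,JL\n' > "$dir/one.txt"
printf 'C,32JL,JL\nA,9999,88888,77777\n' > "$dir/two.txt"

# Filter each file into a temporary copy, then replace the original
# only if sed succeeded
for f in "$dir"/*.txt; do
    sed '/^A.*lorem$/d' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

# Capture the results for inspection, then clean up
one=$(cat "$dir/one.txt")
two=$(cat "$dir/two.txt")
rm -rf "$dir"

printf 'one.txt:\n%s\n' "$one"
printf 'two.txt:\n%s\n' "$two"
```

On real data it is probably worth running the command without the mv step first, or keeping backups, until the output looks right.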
James Brown answered Oct 18 '22 16:10


The other answers give correct solutions but don't explain why your regex didn't work. The parentheses were unnecessary, but with other tools or languages the same regex might well have produced the expected result.

It didn't work with sed because sed uses POSIX basic regular expressions (BRE) by default, where grouping is written \( and \), while ( and ) match literal characters. There are no parentheses in your input text, so nothing matched.
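To see this for yourself, the small sketch below feeds sed two made-up lines (neither is from the question's data) and uses -n with the p command to print only the lines the original pattern matches.

```shell
# Hypothetical input line that really contains parentheses
with_parens='(A),1,(lorem)'
# Line shaped like the question's data, without parentheses
without_parens='A,1,lorem'

# Default (BRE) mode: ( and ) are literal, so only the first line matches
bre_literal=$(printf '%s\n' "$with_parens" | sed -n '/^(A).*(lorem)/p')
bre_plain=$(printf '%s\n' "$without_parens" | sed -n '/^(A).*(lorem)/p')

printf 'line with parentheses matched:    [%s]\n' "$bre_literal"
printf 'line without parentheses matched: [%s]\n' "$bre_plain"
```

The second pair of square brackets comes out empty, which is why the original command deleted nothing.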

Your regular expression would have worked with GNU sed -r or BSD sed -E. Both flags switch to POSIX extended regular expressions (ERE), where ( and ) group and \( and \) match literal parentheses.

In conclusion, the following commands all do the same thing:

  • sed '/^A.*lorem$/d' file.txt
  • sed -r '/^(A).*(lorem)$/d' file.txt (with GNU sed)
  • sed -E '/^(A).*(lorem)$/d' file.txt (with BSD sed and modern GNU sed)
  • sed '/^\(A\).*\(lorem\)$/d' file.txt
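As a quick check, the sketch below runs three of these variants on a shortened copy of the question's sample and captures each result. The -r form is left out only because it is not available in every sed. The temporary file and the variable names are just for illustration.

```shell
# Shortened copy of the sample input from the question
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
A,12121212121212,foo,bar,lorem
C,32JL,JL
A,21212121212121,foo,bar,lorem
C,32JL,JL
A,9999,88888,77777
EOF

# BRE without groups, ERE with groups, BRE with escaped groups
out_plain=$(sed '/^A.*lorem$/d' "$tmp")
out_ere=$(sed -E '/^(A).*(lorem)$/d' "$tmp")
out_bre=$(sed '/^\(A\).*\(lorem\)$/d' "$tmp")
rm -f "$tmp"

printf '%s\n' "$out_plain"
[ "$out_plain" = "$out_ere" ] && [ "$out_plain" = "$out_bre" ] && echo "all three agree"
```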
Aaron answered Oct 18 '22 17:10


Remove the parentheses.

Using your code, the appropriate one-liner becomes:

sed '/^A.*lorem/d' file.txt

If you want to be more rigorous, see James's answer, which anchors the end of the regex with $:

sed '/^A.*lorem$/d' file.txt

Both will work.
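The two only differ when "lorem" appears somewhere other than the end of a line. A small sketch with made-up input (the second line is not from the question) shows the difference:

```shell
# Hypothetical input: the second line contains "lorem" but does not end with it
input='A,1,foo,lorem
A,2,lorem,extra
C,32JL,JL'

# Without $, any line starting with A and containing lorem is deleted
unanchored=$(printf '%s\n' "$input" | sed '/^A.*lorem/d')
# With $, only lines that actually end with lorem are deleted
anchored=$(printf '%s\n' "$input" | sed '/^A.*lorem$/d')

printf 'unanchored:\n%s\n' "$unanchored"
printf 'anchored:\n%s\n' "$anchored"
```

With the anchored pattern the "A,2,lorem,extra" line survives, which matches the stated requirement of deleting only lines that end with "lorem".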

Chem-man17 answered Oct 18 '22 15:10