
How to remove a word prefix using grep?

How can I remove the beginning of a word using grep? For example, I have a file that contains:

www.abc.com

I only need this part:

abc.com

Sorry for the basic question, but I have no experience with Linux.

Jury A asked Jul 26 '12 15:07


1 Answer

You don't edit strings with grep in a Unix shell; grep is normally used to find lines in a text or to filter them out. Use sed instead:

$ echo www.example.com | sed 's/^[^.]\+\.//'
example.com

You'll need to learn regular expressions to use it effectively.
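
As a rough breakdown of how that pattern works, here is a minimal sketch assuming GNU sed (the `\+` repetition is a GNU extension in basic regular expressions). The function name `strip_first_label` is hypothetical and only used for illustration; the bracket expression is written as `[^.]` because a dot needs no escaping inside brackets.

```shell
# Hypothetical helper: removes everything up to and including the first dot.
strip_first_label() {
  # ^       anchor the match at the start of the line
  # [^.]\+  one or more characters that are not a dot (GNU sed syntax)
  # \.      the literal dot that ends the first label
  sed 's/^[^.]\+\.//'
}

printf '%s\n' 'www.abc.com' | strip_first_label   # → abc.com
printf '%s\n' 'localhost' | strip_first_label     # → localhost (no dot, line unchanged)
```

Lines that contain no dot pass through unchanged, because the substitution simply does not match.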

sed can also edit a file in place (modifying the file itself) if you pass the -i option, but be careful: a wrong sed command combined with -i can easily destroy your data.
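
One way to reduce that risk is to give -i a backup suffix, so that sed keeps a copy of the original file. The following is only a sketch, assuming GNU sed; the file name `hosts.txt` is hypothetical, and the work happens in a temporary directory so nothing real is touched.

```shell
# Work on a throwaway copy so the sketch cannot damage real data.
workdir=$(mktemp -d)
printf '%s\n' 'www.abc.com' > "$workdir/hosts.txt"   # hypothetical file name

# -i.bak edits hosts.txt in place and saves the original as hosts.txt.bak
sed -i.bak 's/^[^.]\+\.//' "$workdir/hosts.txt"

cat "$workdir/hosts.txt"       # → abc.com      (edited file)
cat "$workdir/hosts.txt.bak"   # → www.abc.com  (untouched original)
```

If the result is wrong, the `.bak` file can simply be copied back over the edited one.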

An example

From your comments, I guess you have a TeX document and you want to remove the first part of all .com domain names. Suppose this is your document test.tex:

\documentclass{article}
\begin{document}
www.example.com
example.com www.another.domain.com
\end{document}

then you can transform it with this sed command (redirect the output to a file, or edit in place with -i):

$ sed 's/\([a-z0-9-]\+\.\)\(\([a-z0-9-]\+\.\)\+com\)/\2/gi' test.tex
\documentclass{article}
\begin{document}
example.com
example.com another.domain.com
\end{document}

Please note that:

  • A sequence of allowed characters followed by a dot is matched by [a-z0-9-]\+\.
  • I used groups in the regular expression (the parts within \( and \)) to mark the first and the second part of the domain name, and I replace the entire match with its second group (\2 in the substitution pattern)
  • The domain must be at least a third-level .com domain (each \+ repetition requires at least one match)
  • The search is case-insensitive (i flag at the end)
  • It can replace more than one match per line (g flag at the end)
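
The points above can be checked with a small sketch. It assumes GNU sed and rewrites the same substitution in extended-regex syntax (-E), where groups and repetition need no backslashes; this is an equivalent spelling for readability, not the command from the answer. The function name `strip_com_prefix` is hypothetical.

```shell
# Same substitution in extended-regex syntax (-E): groups are ( ) and
# repetition is + without backslashes. The i flag is a GNU sed extension.
strip_com_prefix() {   # hypothetical helper name
  # group 1: ([a-z0-9-]+\.)         the first label and its dot (dropped)
  # group 2: (([a-z0-9-]+\.)+com)   one or more further labels, then "com" (kept as \2)
  sed -E 's/([a-z0-9-]+\.)(([a-z0-9-]+\.)+com)/\2/gi'
}

printf '%s\n' 'www.example.com' | strip_com_prefix
# → example.com
printf '%s\n' 'example.com www.another.domain.com' | strip_com_prefix
# → example.com another.domain.com   (second-level name untouched, g handles the rest of the line)
printf '%s\n' 'WWW.Example.COM' | strip_com_prefix
# → Example.COM                      (i flag: matching ignores case)
```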

sastanin answered Sep 20 '22 03:09