Remove one-character words

I'm looking for a regexp to remove one-character words. I don't mind whether the solution uses perl, awk, sed, or bash built-ins.

Test case:

$ echo "a b c d e f g h ijkl m n opqrst u v" | $COMMAND

Desired output:

ijkl opqrst

What I've tried so far:

$ echo "a b c d e f g h ijkl m n opqrst u v" | sed 's/ . //g'
acegijkln opqrstv

I'm guessing that:

  • the a isn't removed because there is no white space before it
  • the c remains because once the b has been removed, there is no more whitespace before it
  • and so on...
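These guesses can be checked on a shorter input. This is only a small sketch; the function name `demo_consumed_space` is made up for illustration:

```shell
# The match " b " consumes both spaces around b, so the scan resumes
# at "c", which no longer has a space in front of it. The leading "a"
# never has a space before it, so it is never matched either.
demo_consumed_space() {
    printf '%s\n' "a b c" | sed 's/ . //g'
}

demo_consumed_space   # prints "ac"
```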

Attempt #2:

$ echo "a b c d e f g h ijkl m n opqrst u v" | sed 's/\w.\w//g'
     s v

Here I don't understand what's happening at all.

Any help and explanations are welcome; I want to learn.

nicoco asked Dec 08 '22 19:12


1 Answer

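You have to use a word boundary. In GNU sed, \b matches the empty string at either edge of a word, while \< and \> match the empty string at the beginning and at the end of a word, respectively. Anchoring a single character between boundaries matches only one-character words, and the trailing " \?" also removes the space that follows, if there is one. In your attempt #2, \w.\w matches a word character, any character, and another word character, so it deletes three-character runs such as "a b" or "ijk" instead of single-character words.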

echo "a b c d e f g h ijkl m n opqrst u v" | sed 's/\b\w\b \?//g'

(OR)

echo "a b c d e f g h ijkl m n opqrst u v" | sed 's/\<.\> \?//g'
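Both sed commands rely on GNU extensions (\b, \w, \<, \>, \?), and they leave a trailing space when the line ends with a one-character word, as the test case does. Since awk was also acceptable in the question, here is a rough alternative sketch that avoids both issues. The helper name `strip_short_words` is hypothetical, and the sketch assumes a "word" means a whitespace-separated field, so a token such as "a," counts as two characters and is kept, and runs of whitespace are collapsed to single spaces:

```shell
# Hypothetical helper (not part of the original answer): keep only the
# whitespace-separated fields that are longer than one character.
strip_short_words() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++)
            if (length($i) > 1)
                out = (out == "" ? $i : out " " $i)   # join kept fields with single spaces
        print out
    }'
}

echo "a b c d e f g h ijkl m n opqrst u v" | strip_short_words   # prints "ijkl opqrst"
```

If the exact \w definition of a word matters (letters, digits, and underscore), the sed versions above are the closer fit.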
sat answered Jan 07 '23 18:01