
Remove leading and trailing numbers from string, while leaving 2 numbers, using sed or awk

Tags: regex, sed, awk

I have a file containing lines like:

353451word2423157
anotherword
7412yetanother1
3262andherese123anotherline4359013
5342512354325324523andherese123anotherline45913
532453andherese123anotherline413

I'd like to strip most of the leading and trailing digits (0-9), while leaving up to 2 leading and 2 trailing digits in place, if any...

To clarify, for the list above, the expected output would be:

51word24
anotherword
12yetanother1
62andherese123anotherline43
23andherese123anotherline45
53andherese123anotherline41

Preferred tools would be sed or awk, but any other suggestions are welcome...

I've tried something like sed 's/[0-9]\+$//' | sed 's/^[0-9]\+//', but obviously this strips all leading and trailing numbers...

Asked by user16343284 on Dec 23 '22 at 15:12


1 Answer

You may try this sed:

sed -E 's/^[0-9]+([0-9]{2})|([0-9]{2})[0-9]+$/\1\2/g' file

51word24
anotherword
12yetanother1
62andherese123anotherline43
23andherese123anotherline45
53andherese123anotherline41

Command Details:

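  • ^[0-9]+([0-9]{2}): Matches a run of 3 or more digits at the start of the line. The last 2 digits of the run are captured in group #1, and the whole run is replaced with them.
  • ([0-9]{2})[0-9]+$: Matches a run of 3 or more digits at the end of the line. The first 2 digits of the run are captured in group #2, and the whole run is replaced with them.
  • \1\2 with the g flag: Only one alternative matches at a time, so the other group is empty and the replacement is just the 2 captured digits. The g flag lets both the leading and the trailing match be replaced on the same line.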
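
As an illustration of the two rules above, here is a minimal sketch that applies them as two separate steps instead of one alternation. The function names strip_sed and strip_awk are made up for this example and are not part of the original answer. It assumes a sed that supports -E (GNU and BSD sed both do) and any POSIX awk. The awk variant avoids backreferences and uses match() with RSTART and RLENGTH instead.

```shell
#!/bin/sh
# Hypothetical helper: the same two rules as separate sed expressions.
# Each expression only fires on a run of 3 or more digits.
strip_sed() {
  sed -E -e 's/^[0-9]+([0-9]{2})/\1/' -e 's/([0-9]{2})[0-9]+$/\1/' "$@"
}

# Hypothetical helper: awk equivalent built on match() and substr().
strip_awk() {
  awk '{
    # Leading run longer than 2 digits: keep only its last 2.
    if (match($0, /^[0-9]+/) && RLENGTH > 2)
      $0 = substr($0, RLENGTH - 1)
    # Trailing run longer than 2 digits: keep only its first 2.
    if (match($0, /[0-9]+$/) && RLENGTH > 2)
      $0 = substr($0, 1, RSTART + 1)
    print
  }' "$@"
}

# Usage: both helpers read the named files, or stdin when none are given.
printf '%s\n' 353451word2423157 anotherword 7412yetanother1 | strip_sed
# prints:
# 51word24
# anotherword
# 12yetanother1
printf '%s\n' 3262andherese123anotherline4359013 | strip_awk
# prints:
# 62andherese123anotherline43
```

Splitting the work into two steps trades a slightly longer command for rules that can be read and tested one at a time. Both variants should give the same result as the one-liner on the sample input from the question.
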
Answered by anubhava on Jun 02 '23 at 20:06