
Change any number of delimiters in found pattern with sed

Tags: regex, sed, awk, perl

I want to change every . to @.@ with sed, but only if the . sits between two numbers.
For example:

This sentence ends with a dot. 1.2.3
Dot. 1.2.3.4.5 Dot.

The goal:

This sentence ends with a dot. 1 @.@ 2 @.@ 3
Dot. 1 @.@ 2 @.@ 3 @.@ 4 @.@ 5 Dot.

The pattern could contain any number of integers.

I tried:

sed -E 's/([0-9]+)\.([0-9]+)/\1 @\.@ \2/g'

but it only works for the first two numbers in the pattern.

sedsed asked Mar 01 '21 08:03




2 Answers

For a repeated pattern (number-dot-number-dot-number...) that substitution doesn't work because the number following the dot is "consumed" by the match: the engine has already moved past it, so the next character it sees is a dot, not the num-dot-num pattern it needs.
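To see the "consumed" digit in action, here is a small sketch that runs the attempt from the question on one of its sample lines. The function name first_attempt is made up for this illustration; the sed command is the one from the question, with the replacement written as @.@ instead of @\.@.

```shell
# Hypothetical wrapper around the substitution tried in the question.
first_attempt() {
  sed -E 's/([0-9]+)\.([0-9]+)/\1 @.@ \2/g'
}

# The digit after each matched dot is part of that match, so the
# dot right behind it is skipped: only every other dot is replaced.
printf '%s\n' 'Dot. 1.2.3.4.5 Dot.' | first_attempt
# prints: Dot. 1 @.@ 2.3 @.@ 4.5 Dot.
```

The dots left untouched (2.3 and 4.5) are exactly those whose left-hand digit was already used as the right-hand side of the previous match.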

One solution is to use lookarounds, which are "zero-width" assertions: the engine doesn't consume what they match and doesn't move along, it merely "looks" from its spot between characters to assert that the pattern (ahead or behind) matches. So

s/ (?<=[0-9]) \. (?=[0-9]) / @.@ /gx;

For a testable example (in Perl, as tagged)

perl -wE'$_=q(Dot. 1.2.3.4.5 Dot.); say; s/(?<=[0-9])\.(?=[0-9])/ @.@ /g; say'

which prints

Dot. 1.2.3.4.5 Dot.
Dot. 1 @.@ 2 @.@ 3 @.@ 4 @.@ 5 Dot.

A lookbehind can't match a whole multi-digit "number", since that would need [0-9]+, which has variable and unlimited length, and that is something lookbehinds can't (yet) do. Here that isn't strictly a problem, because checking the single digit right before the . is enough to know that a number precedes it.

If you'd rather match the whole number before the . in the multi-digit case, capture it and put it back; the lookahead for the digit after the dot stays as it was

s/([0-9]+)\.(?=[0-9])/$1 @.@ /g;

This can be done anyway, of course, even if it's always single digits; I used the lookbehind originally only for symmetry with the other side (which needs a lookahead).


Note that lookarounds need a tool that supports them, which to my understanding sed doesn't. (Thanks to comments by potong and Ed Morton for pointing that out.) I still offer this solution since Perl is one of the tagged languages.

zdim answered Sep 22 '22 16:09


As for the 1st line, the regex first matches 1.2. The next match attempt starts at the . just after that match, so the 2 is no longer available as the left-hand number and 2.3 is never matched.
With sed, please try:

sed -E '
:l
s/([[:digit:]])\.([[:digit:]])/\1 @.@ \2/
t l
' file

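The t l command jumps back to the label l whenever the substitution succeeded, so the match is retried from the start of the line until no digit-dot-digit sequence is left.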
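For a quick test without a file, the same loop can be fed the sample lines from the question on standard input. This is only a sketch: the function name dotify_loop is made up here, and the script is split across -e options so the label and the branch each end cleanly.

```shell
# Hypothetical wrapper; same label, substitution and branch as the script above.
dotify_loop() {
  sed -E -e ':l' -e 's/([[:digit:]])\.([[:digit:]])/\1 @.@ \2/' -e 't l'
}

printf '%s\n' 'This sentence ends with a dot. 1.2.3' 'Dot. 1.2.3.4.5 Dot.' | dotify_loop
# prints:
# This sentence ends with a dot. 1 @.@ 2 @.@ 3
# Dot. 1 @.@ 2 @.@ 3 @.@ 4 @.@ 5 Dot.
```

Each pass replaces only the first remaining digit-dot-digit, and the loop ends once a pass makes no substitution.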

Since the question is also tagged perl, here is an alternative with Perl:

perl -pe 's/(?<=\d)\.(?=\d)/ @.@ /g' file
tshiono answered Sep 21 '22 16:09