Remove everything after 2nd occurrence in a string in unix

Question

I would like to remove everything after the 2nd occurrence of a particular pattern in a string. What is the best way to do it in Unix? What is the most elegant and simple method to achieve this: sed, awk, or plain Unix commands like cut?

My input would be

After-u-math-how-however

Output should be

After-u

Everything from the 2nd - onward should be stripped out. The solution should also handle inputs with fewer than two occurrences of the pattern: with zero or one occurrence the string should be left unchanged, and from the 2nd occurrence onward everything should be removed.

So if the input is as follows

After

Output should be

After

Evan Purkhiser · Accepted Answer

Something like this would do it.

echo "After-u-math-how-however" | cut -f1,2 -d'-'

This will split up (cut) the string into fields, using a dash (-) as the delimiter. Once the string has been split into fields, cut will print the 1st and 2nd fields.
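To check the zero- and one-occurrence cases from the question, here is a small sketch that wraps the same cut call in a function (the name keep_two_fields is made up for illustration). Unless -s is given, cut passes lines that contain no delimiter through unchanged, which is why a bare "After" survives.

```shell
# keep_two_fields is a hypothetical helper name; it wraps the cut call shown above.
keep_two_fields() {
  cut -d'-' -f1,2
}

# Two or more dashes, exactly one dash, and no dash at all.
printf '%s\n' 'After-u-math-how-however' 'After-u' 'After' | keep_two_fields
# After-u
# After-u
# After
```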

potong · Answer

This might work for you (GNU sed):

sed 's/-[^-]*//2g' file
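For readers wondering how that works: the pattern -[^-]* matches a dash plus the text up to the next dash, and the GNU-specific flag combination 2g deletes the 2nd match and every match after it. A minimal sketch reading from stdin instead of a file (the function name strip_after_second is an assumption for illustration, and GNU sed is required):

```shell
# strip_after_second is a hypothetical name; needs GNU sed for the combined "2g" flag.
strip_after_second() {
  # Delete the 2nd and all later occurrences of "-" followed by non-dash characters.
  sed 's/-[^-]*//2g'
}

printf '%s\n' 'After-u-math-how-however' 'After-u' 'After' | strip_after_second
# After-u
# After-u
# After
```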

Steve · Answer

You could use the following regex to select what you want:

^[^-]*-\?[^-]*

For example:

echo "After-u-math-how-however" | grep -o "^[^-]*-\?[^-]*"

Results:

After-u
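Note that \? is a GNU extension to basic regular expressions. A possible variant, sketched here rather than taken from the answer, uses extended regular expressions via -E, where the optional dash is written as -?. The -o option is still not part of POSIX grep, though GNU and BSD grep both provide it. The function name first_two_grep is made up.

```shell
# first_two_grep is a hypothetical name; assumes a grep that supports -o (GNU or BSD).
first_two_grep() {
  # ERE: non-dashes, an optional dash, then more non-dashes, anchored at line start.
  grep -oE '^[^-]*-?[^-]*'
}

printf '%s\n' 'After-u-math-how-however' 'After' | first_two_grep
```

One caveat: grep -o prints nothing for an empty match, so an empty input line produces no output line at all.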

Ed Morton · Answer

@EvanPurkhiser's cut -f1,2 -d'-' solution is IMHO the best one, but since you asked about sed and awk:

With GNU sed for -r:

$ echo "After-u-math-how-however" | sed -r 's/([^-]+-[^-]*).*/\1/'
After-u

With GNU awk for gensub():

$ echo "After-u-math-how-however" | awk '{$0=gensub(/([^-]+-[^-]*).*/,"\\1",1)}1'
After-u

This can also be done with non-GNU sed, using \( \) for grouping and * in place of +, and with non-GNU awk, using match() and substr(), if necessary.
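The portable variants mentioned above are not spelled out in the answer, so the following is only one possible sketch of them. The function names are hypothetical. In the sed version, [^-][^-]* stands in for [^-]+ because basic regular expressions have no + operator. In the awk version, match() sets RSTART and RLENGTH, and substr() keeps everything up to the end of the match.

```shell
# Hypothetical helper names; both aim to behave like the GNU one-liners above.
first_two_sed() {
  # BRE: \( \) for grouping, and X X* written out instead of X+.
  sed 's/\([^-][^-]*-[^-]*\).*/\1/'
}

first_two_awk() {
  # If the pattern matches, truncate the line at the end of the match; always print.
  awk 'match($0, /[^-]+-[^-]*/) { $0 = substr($0, 1, RSTART + RLENGTH - 1) } 1'
}

echo 'After-u-math-how-however' | first_two_sed
echo 'After-u-math-how-however' | first_two_awk
```

Lines without a dash do not match in either version and are printed unchanged.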

mklement0 · Answer

awk -F - '{print $1 (NF>1? FS $2 : "")}' <<<'After-u-math-how-however'

Split the line into fields based on field separator - (option spec. -F -) - accessible as special variable FS inside the awk program.
Always print the 1st field (print $1), followed by:
- If there's more than 1 field (NF>1), append FS (i.e., -) and the 2nd field ($2)
- Otherwise: append "", i.e.: effectively only print the 1st field (which in itself may be empty, if the input is empty).
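
The two branches described above can be exercised with a short sketch that feeds the same awk program several lines through a pipe instead of a here-string. The function name first_two_fields is an assumption made for illustration.

```shell
# first_two_fields is a hypothetical name wrapping the awk program from this answer.
first_two_fields() {
  awk -F - '{ print $1 (NF > 1 ? FS $2 : "") }'
}

# Many dashes, one dash, a trailing dash, and no dash.
printf '%s\n' 'After-u-math-how-however' 'After-u' 'After-' 'After' | first_two_fields
# After-u
# After-u
# After-
# After
```

The 'After-' line shows that an empty 2nd field still counts as a field, so the dash is kept.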

Tags: regex, bash, unix, sed, awk

Jose
