I have a string, for example home/JOHNSMITH-4991-common-task-list, and I want to extract the uppercase part and the number, with the hyphen between them. I echo the string and pipe it to sed like so, but I keep getting all the hyphens I don't want, e.g.:
echo home/JOHNSMITH-4991-common-task-list | sed 's/[^A-Z0-9-]//g'
gives me:
JOHNSMITH-4991---
I need:
JOHNSMITH-4991
How do I ignore all but the first hyphen?
You can use
sed 's,.*/\([^-]*-[^-]*\).*,\1,'
POSIX BRE regex details:
.* - any zero or more chars
/ - a / char
\([^-]*-[^-]*\) - Group 1: any zero or more chars other than -, a hyphen, and then again zero or more chars other than -
.* - any zero or more chars
The replacement is the Group 1 placeholder, \1, to restore just the captured text.
See the online demo:
#!/bin/bash
s="home/JOHNSMITH-4991-common-task-list"
sed 's,.*/\([^-]*-[^-]*\).*,\1,' <<< "$s"
# => JOHNSMITH-4991
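The same pattern can be written without the backslashed parentheses by switching sed to extended regular expressions. The following is only a sketch: it assumes a sed that accepts -E (GNU and BSD sed both do), and the function name extract_id is a hypothetical helper, not something from the answer above.

```shell
#!/bin/bash
# extract_id is a hypothetical helper name used only for illustration.
extract_id() {
  # -E selects ERE, so the capture group is written ( ... ) instead of \( ... \).
  # Commas delimit the s command, which means the / needs no escaping.
  printf '%s\n' "$1" | sed -E 's,.*/([^-]*-[^-]*).*,\1,'
}

extract_id "home/JOHNSMITH-4991-common-task-list"
# => JOHNSMITH-4991
```

Because the leading .* is greedy, the match starts after the last slash, so a deeper path such as a/b/FOO-12-x-y should yield FOO-12 as well.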
1st solution: With awk this is much easier and we can keep it simple. Please try the following, written and tested with your shown samples.
echo "home/JOHNSMITH-4991-common-task-list" | awk -F'/|-' '{print $2"-"$3}'
Explanation: Set the field separator to / OR -, then print the 2nd field, a literal -, and the 3rd field of the current line.
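To see why fields 2 and 3 are the ones to print, it can help to list every field awk produces with that separator. This is a small illustrative sketch; show_fields is a hypothetical helper name.

```shell
#!/bin/bash
# show_fields is a hypothetical helper that prints each field with its index.
show_fields() {
  # A multi-character -F value is treated as a regex: split on / or -.
  printf '%s\n' "$1" | awk -F'/|-' '{ for (i = 1; i <= NF; i++) print i ": " $i }'
}

show_fields "home/JOHNSMITH-4991-common-task-list"
# => 1: home
# => 2: JOHNSMITH
# => 3: 4991
# => 4: common
# => 5: task
# => 6: list
```

Note that the field numbers depend on the shape of the input: with an extra leading directory, the wanted values would no longer be in fields 2 and 3.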
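2nd solution: Using the match function of awk. match sets RSTART and RLENGTH to the position and length of the matched text; because the match includes the leading /, substr starts at RSTART+1 and takes RLENGTH-1 characters.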
echo "home/JOHNSMITH-4991-common-task-list" |
awk '
match($0,/\/[^-]*-[^-]*/){
print substr($0,RSTART+1,RLENGTH-1)
}'
3rd solution: Using GNU grep. The -o option prints only the matched text, and the -P option enables PCRE (Perl-compatible regular expressions). In the regex, .*/ is followed by \K to discard the part matched so far, and then [^-]*-[^-]* matches everything up to, but not including, the 2nd occurrence of - in the line.
echo "home/JOHNSMITH-4991-common-task-list" | grep -oP '.*/\K[^-]*-[^-]*'
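Since -P is specific to GNU grep, a fallback that uses only shell parameter expansion may be handy where GNU grep is not available. This is a different technique from the grep answer, shown as a rough sketch: extract_id_pe is a hypothetical name, and the code assumes the part after the last slash contains at least one hyphen.

```shell
#!/bin/bash
# extract_id_pe is a hypothetical helper; it assumes at least one hyphen
# follows the last slash (otherwise the result is not meaningful).
extract_id_pe() {
  local rest="${1##*/}"       # drop everything up to the last slash
  local first="${rest%%-*}"   # text before the first hyphen
  local tail="${rest#*-}"     # text after the first hyphen
  local second="${tail%%-*}"  # text before the next hyphen, if any
  printf '%s-%s\n' "$first" "$second"
}

extract_id_pe "home/JOHNSMITH-4991-common-task-list"
# => JOHNSMITH-4991
```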