awk - split only by first occurrence

Question

I have a line like:

one:two:three:four:five:six seven:eight

and I want to use awk to get $1 to be one and $2 to be two:three:four:five:six seven:eight

I know I can get this by running sed first, that is, by changing the first occurrence of : with sed and then splitting in awk on the new delimiter.

However, replacing the delimiter with a new one would not help me, since I cannot guarantee that the new delimiter does not already appear somewhere in the text.

I want to know if there is an option to get awk to behave this way.

So something like:

awk -F: '{print $1,$2}'

will print:

one two:three:four:five:six seven:eight

I also want to do some manipulations on $1 and $2, so I don't want to just substitute the first occurrence of :.

Adrian · Accepted Answer

Without any substitutions

echo "one:two:three:four:five" | awk -F: '{ st = index($0,":");print $1 "  " substr($0,st+1)}'

The index function finds the first occurrence of ":" in the whole string, so in this case the variable st is set to 4. I then use the substr function to grab the rest of the string starting from position st+1; if no length is supplied, it goes to the end of the string. The output is

one  two:three:four:five

If you want to do further processing, you can assign the remainder to a variable first:

rem = substr($0,st+1)

Note this was tested on Solaris AWK but I can't see any reason why this shouldn't work on other flavours.
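
As a rough sketch of the "further processing" idea above, the same index/substr split can be wrapped in a small shell function that keeps both halves in variables. The names split_first, head and rem are my own, not part of the answer, and the toupper call is only a stand-in for whatever manipulation you need. It also treats a line without any ":" as "all head, empty remainder", which is an assumption about what you want in that case.

```shell
# split_first: hypothetical helper name, used only for this illustration
split_first() {
    awk '{
        st = index($0, ":")                # position of the first ":", 0 if there is none
        if (st == 0) {                     # assumption: no delimiter means the whole line is the head
            head = $0
            rem = ""
        } else {
            head = substr($0, 1, st - 1)   # text before the first ":"
            rem = substr($0, st + 1)       # everything after it, later colons untouched
        }
        print "head=" toupper(head)        # example manipulation of the first part
        print "rest=" rem                  # second part printed as is
    }'
}

printf '%s\n' 'one:two:three:four:five:six seven:eight' | split_first
```

Because nothing is substituted, there is no temporary delimiter that could clash with the data.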

Jotne · Answer

Something like this?

echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1' 
one two:three:four:five:six

This replaces the first : with a space. You can then get the two parts as $1 and $2:

echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1' | awk '{print $1,$2}'
one two:three:four:five:six

Or in the same awk, so even with the substitution you get $1 and $2 the way you want:

echo "one:two:three:four:five:six" | awk '{sub(/:/," ");$1=$1;print $1,$2}'
one two:three:four:five:six

EDIT: Using a different separator, you can get the first part as field $1 and the rest in $2 like this:

echo "one:two:three:four:five:six seven:eight" | awk -F\| '{sub(/:/,"|");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight

Or with a multi-character separator that is unlikely to appear in the text:

echo "one:two:three:four:five:six seven:eight" | awk -F"#;#." '{sub(/:/,"#;#.");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight
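
If the worry is that the temporary separator might already be in the data, one possible sketch is to use a control character and check for it before substituting. The choice of "\001" (SOH) and the function name split_with_temp_sep are assumptions of mine; pick whatever byte you believe cannot occur in your input. Lines that already contain it are reported rather than split wrongly.

```shell
# split_with_temp_sep: hypothetical helper name, used only for this illustration
split_with_temp_sep() {
    awk 'BEGIN { FS = "\001" }     # assumed-safe temporary separator (SOH control character)
    index($0, FS) {                # the separator is already in this line, so refuse to split it
        print "skipped: line already contains the temporary separator"
        next
    }
    {
        sub(/:/, FS)               # replace only the first ":", awk then re-splits the record
        print "$1=" $1
        print "$2=" $2
    }'
}

echo "one:two:three:four:five:six seven:eight" | split_with_temp_sep
```

This still relies on a substitution, so it is only as safe as the check; it just makes a clash visible instead of silent.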

Chris Seymour · Answer

The closest you can get is with GNU awk's FPAT:

$ awk '{print $1}' FPAT='(^[^:]+)|(:.*)' file
one

$ awk '{print $2}' FPAT='(^[^:]+)|(:.*)' file
:two:three:four:five:six seven:eight

Here $2 includes the leading delimiter, but you can use substr to strip it:

$ awk '{print substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
two:three:four:five:six seven:eight

So putting it all together:

$ awk '{print $1, substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight

Storing the results of the substr back in $2 will allow further processing on $2 without the leading delimiter:

$ awk '{$2=substr($2,2); print $1,$2}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight

A solution that should work with mawk 1.3.3:

awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1}' FS='\0'
one

awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $2}' FS='\0'
two:three:four five:six:seven

awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1,$2}' FS='\0'
one two:three:four five:six:seven
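
A variant of the same idea that does not depend on FS at all, as far as I can tell: clear the record first, then assign the two fields. This is a sketch rather than something tested on every awk; it assumes that assigning "" to $0 resets NF to 0 and that assigning to $1 and $2 afterwards does not trigger a re-split, which is how the awks I know behave. The function name split_into_fields is made up for the example.

```shell
# split_into_fields: hypothetical helper name, used only for this illustration
split_into_fields() {
    awk '{
        s = $0                          # keep the original line
        n = index(s, ":")               # position of the first ":", 0 if there is none
        $0 = ""                         # clear the record, so NF becomes 0
        if (n > 0) {
            $1 = substr(s, 1, n - 1)    # text before the first ":"
            $2 = substr(s, n + 1)       # everything after it
        } else {
            $1 = s                      # assumption: no ":" means one single field
        }
        print "$1=" $1
        print "$2=" $2
        print "NF=" NF
    }'
}

printf '%s\n' 'one:two:three:four five:six:seven' | split_into_fields
```

Since $1 and $2 are real fields here, later code in the same awk program can keep using them, and $0 is rebuilt as $1 OFS $2.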
