
awk - split only by first occurrence

I have a line like:

one:two:three:four:five:six seven:eight

and I want awk to set $1 to one and $2 to two:three:four:five:six seven:eight.

I know I could run sed first to change the first occurrence of : into another character, and then have awk split on that new delimiter.

However, replacing the delimiter does not help me, because I cannot guarantee that the new delimiter does not already appear somewhere in the text.

Is there an option that makes awk behave this way?

So something like:

awk -F: '{print $1,$2}'

will print:

one two:three:four:five:six seven:eight

I also want to manipulate $1 and $2 afterwards, so simply substituting the first occurrence of : is not enough.

Udy asked Oct 03 '13 09:10


3 Answers

Without any substitutions

echo "one:two:three:four:five" | awk -F: '{ st = index($0,":");print $1 "  " substr($0,st+1)}'

The index function finds the first occurrence of ":" in the whole line, so in this case the variable st is set to 4. The substr function then returns the rest of the line starting at position st+1; when no length is supplied, it runs to the end of the string. The output is:

one  two:three:four:five

If you need further processing, you can assign the remainder to a variable:

rem = substr($0,st+1)

Note: this was tested on Solaris awk, but index and substr are standard awk functions, so I can't see any reason why it shouldn't work on other flavours.
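One thing the one-liner above does not cover is a line with no ":" at all: index returns 0, so substr($0,st+1) returns the whole line a second time. A minimal sketch of how both parts could be kept in variables with that case guarded; the function name split_first, the variable names head and rem, and the bracketed output format are only illustrative:

```shell
# Hypothetical helper: prints "[head] [rest]" for every input line.
split_first() {
  awk '{
    st = index($0, ":")                 # 0 when the line contains no ":"
    if (st == 0) {
      head = $0; rem = ""               # no delimiter: everything is the head
    } else {
      head = substr($0, 1, st - 1)      # text before the first ":"
      rem  = substr($0, st + 1)         # text after the first ":"
    }
    print "[" head "] [" rem "]"
  }'
}

echo "one:two:three:four:five:six seven:eight" | split_first
echo "no delimiter here" | split_first
```

From there, head and rem can be processed like any other awk variables.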

Adrian answered Oct 19 '22 05:10


Something like this?

echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1' 
one two:three:four:five:six

This replaces the first : with a space. You can then get the two parts as $1 and $2, provided the rest of the line contains no other whitespace:

echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1' | awk '{print $1,$2}'
one two:three:four:five:six

Or in the same awk, so that even with the substitution you get $1 and $2 the way you like:

echo "one:two:three:four:five:six" | awk '{sub(/:/," ");$1=$1;print $1,$2}'
one two:three:four:five:six

EDIT: Using a different separator, you can get the first part as field $1 and the rest in $2 like this:

echo "one:two:three:four:five:six seven:eight" | awk -F\| '{sub(/:/,"|");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight

Using a multi-character separator that is unlikely to appear in the text:

echo "one:two:three:four:five:six seven:eight" | awk -F"#;#." '{sub(/:/,"#;#.");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight
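A variation on the same idea, as a rough sketch only: use a control character as the temporary separator. This assumes the input is ordinary text that never contains the byte \001, which is still an assumption about the data and not a guarantee. The function name split_ctrl is made up for the example:

```shell
# Sketch: split on a temporary \001 separator.
# Assumption: the input never contains the control character \001.
split_ctrl() {
  awk 'BEGIN { FS = "\001" }
       {
         sub(/:/, "\001")               # modifying $0 re-splits it using FS
         print "$1=" $1 "\n$2=" $2
       }'
}

echo "one:two:three:four:five:six seven:eight" | split_ctrl
```

Setting FS in a BEGIN block avoids relying on how a particular awk interprets escape sequences passed to -F.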

Jotne answered Oct 19 '22 05:10


The closest you can get is with GNU awk's FPAT:

$ awk '{print $1}' FPAT='(^[^:]+)|(:.*)' file
one

$ awk '{print $2}' FPAT='(^[^:]+)|(:.*)' file
:two:three:four:five:six seven:eight

$2 will include the leading delimiter, but you can use substr to strip it:

$ awk '{print substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
two:three:four:five:six seven:eight

So putting it all together:

$ awk '{print $1, substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight

Storing the results of the substr back in $2 will allow further processing on $2 without the leading delimiter:

$ awk '{$2=substr($2,2); print $1,$2}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight

A solution that should work with mawk 1.3.3 (setting FS to the NUL character is meant to keep the whole line in a single field):

$ awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1}' FS='\0' file
one

$ awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $2}' FS='\0' file
two:three:four:five:six seven:eight

$ awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1,$2}' FS='\0' file
one two:three:four:five:six seven:eight
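If you would rather not depend on how a given awk treats FS='\0', here is a sketch of the same index/substr idea that clears the record before rebuilding it, so $1 and $2 are real fields you can keep manipulating. The function name split_fields and the toupper step are just illustrative:

```shell
# Hypothetical helper: rebuilds each line as exactly two fields,
# then upper-cases $1 as a stand-in for "further manipulation".
split_fields() {
  awk '{
    s = $0
    n = index(s, ":")
    if (n > 0) {
      $0 = ""                           # drop the fields awk split on whitespace
      $1 = substr(s, 1, n - 1)          # text before the first ":"
      $2 = substr(s, n + 1)             # assigning a field does not re-split it
    }
    $1 = toupper($1)
    print NF, $1, $2
  }'
}

echo "one:two:three:four:five:six seven:eight" | split_fields
```

For the sample line this reports NF as 2, with the space-containing remainder held intact in $2.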

Chris Seymour answered Oct 19 '22 07:10