I have this (demo) text in the variable ArtTEXT.
{1}: Reporting Problems and Bugs.
{2}: Other freely available awk implementations.
{3}: Summary of installation.
{4}: How to disable certain gawk extensions.
{5}: Making Additions To gawk.
{6}: Accessing the Git repository.
{7}: Adding code to the main body of gawk.
{8}: Porting gawk to a new operating system.
{9}: Why derived files are kept in the Git repository.
It is a single variable in which the lines are delimited by an indent.
indent = "\n\t\t\t";
I want to loop through the lines and check something in each line.
So I split it into an array, using the indent as the separator:
split(ArtTEXT,lin, indent);
Then I loop through the array lin:
l = 0;
for (l in lin) {
    print "l -- ", l, " lin[l] -- " ,lin[l] ;
}
What I get are the lines of ArtTEXT, but the output begins at line #4:
l -- 4 lin[l] -- {3}: Summary of installation.
l -- 5 lin[l] -- {4}: How to disable certain gawk extensions.
l -- 6 lin[l] -- {5}: Making Additions To gawk.
l -- 7 lin[l] -- {6}: Accessing the Git repository.
l -- 8 lin[l] -- {7}: Adding code to the main body of gawk.
l -- 9 lin[l] -- {8}: Porting gawk to a new operating system.
l -- 10 lin[l] -- {9}: Why derived files are kept in the Git repository.
l -- 1 lin[l] --
l -- 2 lin[l] -- {1}: Reporting Problems and Bugs.
l -- 3 lin[l] -- {2}: Other freely available awk implementations.
(The original text has an empty line at the beginning.)
The manual says about the split function:
The first piece is stored in array[1], the second piece in array[2], and so forth.
How do I avoid this problem?
Why is this happening?
Thanks.
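In awk, arrays are unordered. split does store the pieces in lin[1], lin[2], and so on, but the order in which for (l in lin) visits the indices is unspecified. If they happen to come out in order, it is accidental.
In GNU awk, it is possible to control the traversal order. For example, to get numerical ordering by indices, use PROCINFO["sorted_in"]="@ind_num_asc":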
$ awk -v ArtTEXT="$(cat file)" 'BEGIN{PROCINFO["sorted_in"]="@ind_num_asc"; indent="\n\t\t\t"; split(ArtTEXT, lin, indent); for (l in lin) print "l -- ", l, " lin[l] -- " ,lin[l] ;}'
l -- 1 lin[l] -- {1}: Reporting Problems and Bugs.
l -- 2 lin[l] -- {2}: Other freely available awk implementations.
l -- 3 lin[l] -- {3}: Summary of installation.
l -- 4 lin[l] -- {4}: How to disable certain gawk extensions.
l -- 5 lin[l] -- {5}: Making Additions To gawk.
l -- 6 lin[l] -- {6}: Accessing the Git repository.
l -- 7 lin[l] -- {7}: Adding code to the main body of gawk.
l -- 8 lin[l] -- {8}: Porting gawk to a new operating system.
l -- 9 lin[l] -- {9}: Why derived files are kept in the Git repository.
Alternatively, since the array indices are numerical, we can loop numerically, using for (l=1;l<=length(lin);l++) print...:
$ awk -v ArtTEXT="$(cat file)" 'BEGIN{indent="\n\t\t\t"; split(ArtTEXT, lin, indent); for (l=1;l<=length(lin);l++) print "l -- ", l, " lin[l] -- " ,lin[l] ;}'
l -- 1 lin[l] -- {1}: Reporting Problems and Bugs.
l -- 2 lin[l] -- {2}: Other freely available awk implementations.
l -- 3 lin[l] -- {3}: Summary of installation.
l -- 4 lin[l] -- {4}: How to disable certain gawk extensions.
l -- 5 lin[l] -- {5}: Making Additions To gawk.
l -- 6 lin[l] -- {6}: Accessing the Git repository.
l -- 7 lin[l] -- {7}: Adding code to the main body of gawk.
l -- 8 lin[l] -- {8}: Porting gawk to a new operating system.
l -- 9 lin[l] -- {9}: Why derived files are kept in the Git repository.
Shown over multiple lines, the GNU awk code looks like:
awk -v ArtTEXT="$(cat file)" '
BEGIN{
    PROCINFO["sorted_in"]="@ind_num_asc"
    indent="\n\t\t\t"
    split(ArtTEXT, lin, indent)
    for (l in lin)
        print "l -- ", l, " lin[l] -- " ,lin[l]
}'
And the alternative code is:
awk -v ArtTEXT="$(cat file)" '
BEGIN{
    indent="\n\t\t\t"
    split(ArtTEXT, lin, indent)
    for (l=1;l<=length(lin);l++)
        print "l -- ", l, " lin[l] -- " ,lin[l]
}'
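A possible variation on the numeric loop, offered only as a sketch: split() returns the number of pieces it stored, so that count can drive the loop instead of length(lin), which some older awk implementations do not accept for an array argument. The variable name n and the two-entry sample text are assumptions for illustration. The sample starts with the indent to mimic the empty first line mentioned in the question, and the loop skips empty pieces.

```shell
out=$(awk 'BEGIN {
    # Hypothetical sample standing in for ArtTEXT; it begins with the
    # separator, so the first piece (lin[1]) is empty, as in the question.
    ArtTEXT = "\n\t\t\t{1}: first\n\t\t\t{2}: second"
    indent = "\n\t\t\t"

    # split() returns how many pieces it stored in lin[1] .. lin[n]
    n = split(ArtTEXT, lin, indent)

    # Counting from 1 to n visits the pieces in the order split() stored them
    for (l = 1; l <= n; l++) {
        if (lin[l] == "")
            continue    # skip the empty leading piece
        print l, lin[l]
    }
}')
printf '%s\n' "$out"
```

With this sample, the loop prints index 2 and then index 3, because index 1 holds the empty piece before the first separator.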