Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count the number of connected non-zeros along rows and columns but not diagonaly in a Matrix in shell script

I have a matrix. e.g.a 10 x 15 matrix

$ cat input.txt
2  3  4  5  10 0  2  2  0  1  0  0  0  1  0
0  3  4  6  2  0  2  0  0  0  0  1  2  3  40
0  0  0  2  3  0  3  0  3  1  2  3  1  0  0
1  2  0  4  0  3  4  0  4  1  2  0  0  1  1
0  0  0  0  0  0  10 3  4  12 4  5  12 3  10
26 3  4  5  10 0  2  2  4  1  0  0  0  1  0
0  30 4  6  2  0  2  0  0  0  0  1  2  3  40
0  0  0  2  3  0  0  0  0  1  0  3  1  0  0
1  2  10 4  0  0  4  0  4  1  2  0  0  1  1
0  0  0  0  0  0  0  3  0  0  4  5  0  3  10

I am looking for all connected non-zeros and their maximum value. Please see this figure.

enter image description here

So the desire output is:

outfile.txt
12 10    (12 is the yellow shaded connected non-zeros with maximum value 10)
42 40    (42 is the light-red shaded connected non-zeros with maximum value 40)
 1  1    (The light purple isolated non-zero)
 2  2    (The light green connected non-zeros)
15 30    and so on
 6  5
 1  4
 4 10 
 1  3

It is being difficult to develop a proper algorithm to write further it in fortran or in shell script. I am thinking of the following algorithm, but can't able to think what is next.

step 1: #Assign the entries with a[ij], i-row, j-column
        #Now make different non-zero connected cell arrays (e.g. c1[k],c2[k],c3[k],....etc) 
        for i in {1..10};do
          for j in {1..15};do
            if [ a(i,j) != 0 ];then c1[k]=a(i,j); a(i,j)="*" #(assign a(i,j) to c1[k] and 
                                                             #replace its original value to "*" 
                                                             #because it should not be considered further)
Step 2: #Now check the left-right-up-down elements of a(i,j), if non-zero
                if [ a(i,j-1) !=0 ] && [ a(i,j) !=* ]; then c1[k]=a(i,j-1); a(i,j-1)="*"
                if [ a(i,j+1) !=0 ] && [ a(i,j) !=* ]; then c1[k]=a(i,j+1); a(i,j+1)="*"
                if [ a(i-1,j) !=0 ] && [ a(i,j) !=* ]; then c1[k]=a(i-1,j); a(i-1,j)="*"
                if [ a(i+1,j) !=0 ] && [ a(i,j) !=* ]; then c1[k]=a(i+1,j); a(i+1,j)="*"

Step 3: #continue the same process for each non-zeros until a zero at all-end.

Step 4: #Count the number of elements in c1[k] and find it maximum
like image 696
Kay Avatar asked Nov 29 '19 04:11

Kay


1 Answers

Here is one in GNU awk:

awk '
function check(x,y,s,   v) {                        # recursively check neighbours
    v=m[x][y]                                       # store val
    delete m[x][y]                                  # del to avoid loops
    if(((x+1) in m) && (y in m[x+1]) && m[x+1][y])  # test if neighbour exists
        check(x+1,y,s)                              # and check it
    if((x in m) && ((y+1) in m[x]) && m[x][y+1])
        check(x,y+1,s)
    if(((x-1) in m) && (y in m[x-1]) && m[x-1][y])
        check(x-1,y,s)
    if((x in m) && ((y-1) in m[x]) && m[x][y-1])
        check(x,y-1,s)
    if(v>max[s])
        max[s]=v                                    # keep max
    set[s]++                                        # count set size
}
{
    for(x=1;x<=NF;x++)                              # hash values
        m[x][NR]=$x
}
END {
    for(x in m)                                     # in no particular order
        for(y in m[x])                              
            if(m[x][y])
                check(x,y,++s)                      # start checking
    for(i in set)                                   # output
        print set[i],max[i]
}' file

Output:

12 10
2 2
15 30
42 40
1 4
1 3
6 5
1 1
4 10
like image 66
James Brown Avatar answered Oct 11 '22 07:10

James Brown