
Decoding base64-encoded text with POSIX awk

I need to decode a lot of base64-encoded text strings in awk. Because I don't want to fork a non-portable base64 binary for every string, I wrote an awk function to do the decoding:

function base64_decode(str,    out,i,n,v) {
    out = ""
    if ( ! ("A" in _BASE64_DECODE_c2i) )
        for (i = 1; i <= 64; i++)
            _BASE64_DECODE_c2i[substr("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",i,1)] = i-1
    i = 0
    n = length(str)
    while (i <= n) {
        v = _BASE64_DECODE_c2i[substr(str,++i,1)] * 262144 + \
            _BASE64_DECODE_c2i[substr(str,++i,1)] * 4096 + \
            _BASE64_DECODE_c2i[substr(str,++i,1)] * 64 + \
            _BASE64_DECODE_c2i[substr(str,++i,1)]
        out = out sprintf("%c%c%c", int(v/65536), int(v/256), v)
    }
    return out
}

Which works fine:

printf '%s\n' SmFuZQ== amRvZQ== |
LANG=C command -p awk '
    { print base64_decode($0) }
    function base64_decode(...) {...} # placeholder for the real function
'
Jane
jdoe

CONTEXTUAL PROBLEM

I'm trying to get the givenName of the users that are members of GroupCode = 025496 from the output of ldapsearch -LLL -o ldif-wrap=no ... '(|(uid=*)(GroupCode=*))' uid givenName GroupCode memberUid, for example:

dn: uid=jsmith,ou=users,dc=example,dc=com
givenName: John
uid: jsmith

dn: uid=jdoe,ou=users,dc=example,dc=com
uid: jdoe
givenName:: SmFuZQ==

dn: cn=group1,ou=groups,dc=example,dc=com
GroupCode: 025496
memberUid:: amRvZQ==
memberUid: jsmith

Here is an awk script for doing so:

LANG=C command -p awk -F '\n' -v RS='' -v GroupCode=025496 '
    {
        delete attrs
        for (i = 2; i <= NF; i++) {
            match($i,/::? /)
            key = substr($i,1,RSTART-1)
            val = substr($i,RSTART+RLENGTH)
            if (RLENGTH == 3)
                val = base64_decode(val)
            attrs[key] = ((key in attrs) ? attrs[key] SUBSEP val : val)
        }
        if ( /\nuid:/ )
            givenName[ attrs["uid"] ] = attrs["givenName"]
        else
            memberUid[ attrs["GroupCode"] ] = attrs["memberUid"]
    }
    END {
        n = split(memberUid[GroupCode],uid,SUBSEP)
        for ( i = 1; i <= n; i++ )
            print givenName[ uid[i] ]
    }

    function base64_decode(...) { ... } # placeholder for the real function
'

On BSD and Solaris the result is:

Jane
John

While on Linux it is:


John

I don't understand where the issue is. Is there something wrong with the base64_decode function and/or the code that uses it?

Fravadona asked Jun 03 '26 14:06


2 Answers

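Your function generates NUL bytes when its argument (the encoded string) ends with padding characters (=). The padding characters are not in the lookup table, so they count as 0, and the i <= n test runs the loop one more time after the end of a string whose length is a multiple of 4. Below is a corrected version of your while loop: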

while (i < n) {
    # pack four 6-bit values into one 24-bit number ("=" is not in the table and counts as 0)
    v = _BASE64_DECODE_c2i[substr(str,1+i,1)] * 262144 + \
        _BASE64_DECODE_c2i[substr(str,2+i,1)] * 4096 + \
        _BASE64_DECODE_c2i[substr(str,3+i,1)] * 64 + \
        _BASE64_DECODE_c2i[substr(str,4+i,1)]
    i += 4
    # emit bytes only up to the last non-zero one, dropping the NULs that padding produces
    if (v%256 != 0)
        out = out sprintf("%c%c%c", int(v/65536), int(v/256), v)
    else if (int(v/256)%256 != 0)
        out = out sprintf("%c%c", int(v/65536), int(v/256))
    else
        out = out sprintf("%c", int(v/65536))
}

Note that if the decoded bytes contain an embedded NUL, this approach may not work properly, because zero bytes are what it uses to detect padding.
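One way around the zero-byte heuristic is to count the trailing = characters and let that decide how many bytes the last group carries. The following is only a sketch under some assumptions: the wrapper name b64dec is hypothetical, the input is well-formed base64 (length a multiple of 4, no line wrapping), and LC_ALL=C is set so that %c emits single bytes. Whether a genuine NUL byte survives inside an awk string still depends on the awk implementation.

```shell
# b64dec: hypothetical wrapper name; decodes one base64 string per input line.
b64dec() {
    LC_ALL=C awk '
    function base64_decode(str,    out,i,n,v,pad,bytes) {
        out = ""
        if ( ! ("A" in _BASE64_DECODE_c2i) )
            for (i = 1; i <= 64; i++)
                _BASE64_DECODE_c2i[substr("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",i,1)] = i-1
        n = length(str)
        # number of trailing "=" characters: 0, 1 or 2
        pad = (str ~ /==$/) ? 2 : ((str ~ /=$/) ? 1 : 0)
        for (i = 0; i < n; i += 4) {
            # "=" is absent from the table, so it contributes 0 to v
            v = _BASE64_DECODE_c2i[substr(str,i+1,1)] * 262144 + \
                _BASE64_DECODE_c2i[substr(str,i+2,1)] * 4096 + \
                _BASE64_DECODE_c2i[substr(str,i+3,1)] * 64 + \
                _BASE64_DECODE_c2i[substr(str,i+4,1)]
            # only the last group is shortened by padding
            bytes = (i + 4 >= n) ? 3 - pad : 3
            out = out sprintf("%c", int(v/65536))
            if (bytes > 1) out = out sprintf("%c", int(v/256) % 256)
            if (bytes > 2) out = out sprintf("%c", v % 256)
        }
        return out
    }
    { print base64_decode($0) }
    '
}

printf '%s\n' SmFuZQ== amRvZQ== | b64dec
```

The explicit % 256 on the middle and last byte avoids relying on how a given awk treats %c with a number above 255.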

M. Nejat Aydin answered Jun 06 '26 05:06


The problem is in the base64_decode function, which outputs some junk characters on GNU awk.
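To see those characters, the decoded value can be dumped byte by byte. This is a small diagnostic sketch, not part of the fix: show_bytes is a hypothetical helper name, and the function inside it is the one from the question, unchanged. On an awk that keeps NUL bytes inside strings, the dump is expected to show extra \0 entries after the name, which would explain why the decoded uid no longer matches the array key; an awk that treats strings as NUL-terminated would hide them.

```shell
# show_bytes: hypothetical helper; decodes its argument with the
# function from the question and dumps the result with od -c.
show_bytes() {
    printf '%s\n' "$1" |
    LC_ALL=C awk '
    function base64_decode(str,    out,i,n,v) {
        out = ""
        if ( ! ("A" in _BASE64_DECODE_c2i) )
            for (i = 1; i <= 64; i++)
                _BASE64_DECODE_c2i[substr("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",i,1)] = i-1
        i = 0
        n = length(str)
        while (i <= n) {
            v = _BASE64_DECODE_c2i[substr(str,++i,1)] * 262144 + \
                _BASE64_DECODE_c2i[substr(str,++i,1)] * 4096 + \
                _BASE64_DECODE_c2i[substr(str,++i,1)] * 64 + \
                _BASE64_DECODE_c2i[substr(str,++i,1)]
            out = out sprintf("%c%c%c", int(v/65536), int(v/256), v)
        }
        return out
    }
    # printf without a newline, so od shows only the decoded bytes
    { printf "%s", base64_decode($0) }
    ' | od -c
}

show_bytes SmFuZQ==
```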

As an alternative, you can use this awk code, which calls the system-provided base64 utility:

{
   delete attrs
   for (i = 2; i <= NF; i++) {
      match($i,/::? /)
      key = substr($i,1,RSTART-1)
      val = substr($i,RSTART+RLENGTH)
      if (RLENGTH == 3) {
         cmd = "echo " val " | base64 -di"
         cmd | getline val   # should also check exit code here
         close(cmd)          # so the same command string can be run again for a repeated value
      }
      attrs[key] = ((key in attrs) ? attrs[key] SUBSEP val : val)
   }
   if ( /\nuid:/ )
      givenName[ attrs["uid"] ] = attrs["givenName"]
   else
      memberUid[ attrs["GroupCode"] ] = attrs["memberUid"]
}
END {
   n = split(memberUid[GroupCode],uid,SUBSEP)
   for ( i = 1; i <= n; i++ )
      print givenName[ uid[i] ]
}

I have tested this on GNU and BSD awk versions, and I get the expected output in all cases.
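The exit-code check mentioned in the comment could look roughly like the sketch below. The names decode_lines and decode_via_base64 are hypothetical, and it assumes a base64 utility that accepts -d is on the PATH. It also assumes close() on a command pipe returns a non-zero value when the command fails; the exact value differs between awk implementations.

```shell
# decode_lines: hypothetical wrapper; decodes one base64 string per input line
# by calling an external base64 utility (assumed to support -d).
decode_lines() {
    awk '
    function decode_via_base64(val,    cmd, out, line, sep) {
        # refuse anything outside the base64 alphabet, so that the value
        # is safe to place inside a shell command line
        if (val !~ "^[A-Za-z0-9+/=]*$")
            return ""
        cmd = "printf \"%s\\n\" \"" val "\" | base64 -d"
        out = ""; sep = ""
        # read every output line, not only the first one
        while ((cmd | getline line) > 0) {
            out = out sep line
            sep = "\n"
        }
        # close() reports failure of the command and frees the pipe,
        # so an identical command string can be run again later
        if (close(cmd) != 0) {
            print "base64 failed for: " val | "cat 1>&2"
            return ""
        }
        return out
    }
    { print decode_via_base64($0) }
    '
}

printf '%s\n' SmFuZQ== amRvZQ== | decode_lines
```

Forking once per value is still slow for large inputs, which is the cost the question wanted to avoid.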

If you cannot use an external base64 utility, then I suggest you take a look here for an awk version of base64 decode.

anubhava answered Jun 06 '26 06:06


