 

KrbException connecting to Hadoop cluster with Zookeeper client - UNKNOWN_SERVER

My Zookeeper client is having trouble connecting to the Hadoop cluster.

This works fine from a Linux VM, but I am using a Mac.

I set the -Dsun.security.krb5.debug=true flag on the JVM and get the following output:

Found ticket for [email protected] to go to krbtgt/[email protected] expiring on Sat Apr 29 03:15:04 BST 2017
Entered Krb5Context.initSecContext with state=STATE_NEW
Found ticket for [email protected] to go to krbtgt/[email protected] expiring on Sat Apr 29 03:15:04 BST 2017
Service ticket not found in the subject
>>> Credentials acquireServiceCreds: same realm
Using builtin default etypes for default_tgs_enctypes
default etypes for default_tgs_enctypes: 17 16 23.
>>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
>>> EType: sun.security.krb5.internal.crypto.Aes128CtsHmacSha1EType
>>> KrbKdcReq send: kdc=oc-10-252-132-139.nat-ucfc2z3b.usdv1.mycloud.com UDP:88, timeout=30000, number of retries =3, #bytes=682
>>> KDCCommunication: kdc=oc-10-252-132-139.nat-ucfc2z3b.usdv1.mycloud.com UDP:88, timeout=30000,Attempt =1, #bytes=682
>>> KrbKdcReq send: #bytes read=217
>>> KdcAccessibility: remove oc-10-252-132-139.nat-ucfc2z3b.usdv1.mycloud.com
>>> KDCRep: init() encoding tag is 126 req type is 13
>>>KRBError:
     cTime is Thu Dec 24 11:18:15 GMT 2015 1450955895000
     sTime is Fri Apr 28 15:15:06 BST 2017 1493388906000
     suSec is 925863
     error code is 7
     error Message is Server not found in Kerberos database
     cname is [email protected]
     sname is zookeeper/[email protected]
     msgType is 30
KrbException: Server not found in Kerberos database (7) - UNKNOWN_SERVER
    at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
    at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:251)
    at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:262)
    at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:308)
    at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:126)
    at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
    at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:693)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
    at org.apache.zookeeper.client.ZooKeeperSaslClient$2.run(ZooKeeperSaslClient.java:366)
    at org.apache.zookeeper.client.ZooKeeperSaslClient$2.run(ZooKeeperSaslClient.java:363)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslToken(ZooKeeperSaslClient.java:362)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslToken(ZooKeeperSaslClient.java:348)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.sendSaslPacket(ZooKeeperSaslClient.java:420)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.initialize(ZooKeeperSaslClient.java:458)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1057)
Caused by: KrbException: Identifier doesn't match expected value (906)
    at sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
    at sun.security.krb5.internal.TGSRep.init(TGSRep.java:65)
    at sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:60)
    at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:55)
    ... 18 more
ERROR   2017-04-28 15:15:07,046 5539    org.apache.zookeeper.client.ZooKeeperSaslClient [main-SendThread(oc-10-252-132-160.nat-ucfc2z3b.usdv1.mycloud.com:2181)]    
An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed 
[Caused by GSSException: No valid credentials provided 
(Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)]) 
occurred when evaluating Zookeeper Quorum Member's  received SASL token. 
This may be caused by Java's being unable to resolve the Zookeeper Quorum Member's hostname correctly. 
You may want to try to adding '-Dsun.net.spi.nameservice.provider.1=dns,sun' to your client's JVMFLAGS environment. 
Zookeeper Client will go to AUTH_FAILED state.

I've tested Kerberos config as follows:

>kinit -kt /etc/security/keytabs/solr.headless.keytab solr
>klist
Credentials cache: API:3451691D-7D5E-49FD-A27C-135816F33E4D
        Principal: [email protected]

  Issued                Expires               Principal
Apr 28 16:58:02 2017  Apr 29 04:58:02 2017  krbtgt/[email protected]

Following the instructions from Hortonworks, I managed to get the Kerberos ticket stored in a file:

 >klist -c FILE:/tmp/krb5cc_501
Credentials cache: FILE:/tmp/krb5cc_501
    Principal: [email protected]

Issued                Expires               Principal
Apr 28 17:10:25 2017  Apr 29 05:10:25 2017  krbtgt/[email protected]

I also tried the JVM option suggested in the error message (-Dsun.net.spi.nameservice.provider.1=dns,sun), but this led to a different error along the lines of "Client session timed out", which suggests that this JVM parameter prevents the client from connecting correctly in the first place.

==EDIT==

Seems that the Mac version of Kerberos is not the latest:

> krb5-config --version
Kerberos 5 release 1.7-prerelease

I just tried brew install krb5 to install a newer version, then adjusted the PATH to point to the new version.

> krb5-config --version
Kerberos 5 release 1.15.1

This has had no effect whatsoever on the outcome.

NB: this works fine from a Linux VM on my Mac, using exactly the same jaas.conf, keytab files, and krb5.conf.

krb5.conf:

[libdefaults]
  renew_lifetime = 7d
  forwardable = true
  default_realm = DDA.MYCO.COM
  ticket_lifetime = 24h
  dns_lookup_realm = false
  dns_lookup_kdc = false


[realms]
  DDA.MYCO.COM = {
    admin_server = oc-10-252-132-139.nat-ucfc2z3b.usdv1.mycloud.com
    kdc = oc-10-252-132-139.nat-ucfc2z3b.usdv1.mycloud.com
  }

Reverse DNS: I checked that the FQDN hostname I'm connecting to can be found using a reverse DNS lookup:

> host 10.252.132.160
160.132.252.10.in-addr.arpa domain name pointer oc-10-252-132-160.nat-ucfc2z3b.usdv1.mycloud.com.

This is exactly the same as the response to the same command from the Linux VM.

===WIRESHARK ANALYSIS===

Wireshark, configured to use the system keytabs, gives a bit more detail.

Here I have found that a failed call looks like this:

client -> host  AS-REQ
host -> client  AS-REP
client -> host  AS-REQ
host -> client  AS-REP
client -> host  TGS-REQ  <-- this call is detailed below
host -> client  KRB error KRB5KDC_ERR_S_PRINCIPAL_UNKNOWN

The erroneous TGS-REQ call shows the following:

Kerberos
    tgs-req
        pvno: 5
        msg-type: krb-tgs-req (12)
        padata: 1 item
        req-body
            Padding: 0
            kdc-options: 40000000 (forwardable)
            realm: DDA.MYCO.COM
            sname
                name-type: kRB5-NT-UNKNOWN (0)
                sname-string: 2 items
                    SNameString: zookeeper
                    SNameString: oc-10-252-134-51.nat-ucfc2z3b.usdv1.mycloud.com
            till: 1970-01-01 00:00:00 (UTC)
            nonce: 797021964
            etype: 3 items
                ENCTYPE: eTYPE-AES128-CTS-HMAC-SHA1-96 (17)
                ENCTYPE: eTYPE-DES3-CBC-SHA1 (16)
                ENCTYPE: eTYPE-ARCFOUR-HMAC-MD5 (23)

Here is the corresponding successful call from the Linux box, which is followed by several more exchanges.

Kerberos
    tgs-req
        pvno: 5
        msg-type: krb-tgs-req (12)
        padata: 1 item
        req-body
            Padding: 0
            kdc-options: 40000000 (forwardable)
            realm: DDA.MYCO.COM
            sname
                name-type: kRB5-NT-UNKNOWN (0)
                sname-string: 2 items
                    SNameString: zookeeper
                    SNameString: d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com
            till: 1970-01-01 00:00:00 (UTC)
            nonce: 681936272
            etype: 3 items
                ENCTYPE: eTYPE-AES128-CTS-HMAC-SHA1-96 (17)
                ENCTYPE: eTYPE-DES3-CBC-SHA1 (16)
                ENCTYPE: eTYPE-ARCFOUR-HMAC-MD5 (23)

So it looks like the client is sending

oc-10-252-134-51.nat-ucfc2z3b.usdv1.mycloud.com

as the server host, when it should be sending:

d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com

So the question is: how do I fix that? Bear in mind that the client is Java code.
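To see which name the JVM itself resolves for the ZooKeeper server (as opposed to what the host command reports), a minimal sketch like the one below may help. The class name SaslPrincipalCheck and the helper servicePrincipal are hypothetical, and the idea that the client requests "zookeeper/" plus the reverse-resolved host name is an assumption inferred from the capture above, not something taken from ZooKeeper's source.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic, not part of ZooKeeper.
final class SaslPrincipalCheck {

    /** Joins a service name and a host into the "service/host" form seen in the sname field. */
    static String servicePrincipal(String serviceName, String host) {
        // DNS tools such as `host` print a trailing dot; drop it so pasted names compare cleanly.
        String normalized = host.endsWith(".") ? host.substring(0, host.length() - 1) : host;
        return serviceName + "/" + normalized;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Defaults to the server IP from the capture; pass a different address as args[0].
        String ip = args.length > 0 ? args[0] : "10.252.134.51";
        InetAddress addr = InetAddress.getByName(ip);
        // getCanonicalHostName() does a reverse lookup through the JVM's resolver
        // and falls back to the textual IP address if the lookup fails.
        String host = addr.getCanonicalHostName();
        System.out.println("JVM resolves " + ip + " to " + host);
        // Assumption: this mirrors the principal the client would ask the KDC for.
        System.out.println("Likely requested principal: " + servicePrincipal("zookeeper", host));
    }
}
```

Running it on both the Mac and the Linux VM should show whether the two JVMs get different names for the same address.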

My /etc/hosts has the following:

10.252.132.160 b3e073.ddapoc.ucfc2z3b.usdv1.mycloud.com
10.252.134.51  d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com
10.252.132.139 d7cc18.ddapoc.ucfc2z3b.usdv1.mycloud.com

And my krb5.conf file has:

kdc = d7cc18.ddapoc.ucfc2z3b.usdv1.mycloud.com
kdc = b3e073.ddapoc.ucfc2z3b.usdv1.mycloud.com
kdc = d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com

I tried adding -Dsun.net.spi.nameservice.provider.1=file,dns as a JVM param but got the same result.

asked Oct 30 '22 08:10 by mdarwin


1 Answer

I fixed this by setting up a local dnsmasq instance to supply the forward and reverse DNS lookups.

So now from the command line, host d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com returns 10.252.134.51
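A rough way to confirm from inside the JVM that forward and reverse lookups now agree is sketched below. The class name DnsRoundTripCheck and the helpers matches and stripDot are hypothetical; the host names and addresses are the ones from the /etc/hosts listing in the question, and the check assumes the JVM uses the same resolver that dnsmasq now serves.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic: checks that name -> IP -> name round-trips for each cluster host.
final class DnsRoundTripCheck {

    /** Removes the trailing dot that some DNS tools append to fully qualified names. */
    static String stripDot(String name) {
        return name.endsWith(".") ? name.substring(0, name.length() - 1) : name;
    }

    /** True when the reverse-resolved name is the expected FQDN (case-insensitive). */
    static boolean matches(String expectedFqdn, String resolvedName) {
        return stripDot(expectedFqdn).equalsIgnoreCase(stripDot(resolvedName));
    }

    public static void main(String[] args) {
        // Entries copied from the /etc/hosts listing in the question.
        Map<String, String> hosts = new LinkedHashMap<>();
        hosts.put("b3e073.ddapoc.ucfc2z3b.usdv1.mycloud.com", "10.252.132.160");
        hosts.put("d59407.ddapoc.ucfc2z3b.usdv1.mycloud.com", "10.252.134.51");
        hosts.put("d7cc18.ddapoc.ucfc2z3b.usdv1.mycloud.com", "10.252.132.139");

        for (Map.Entry<String, String> e : hosts.entrySet()) {
            try {
                // Forward lookup: name to address.
                String forward = InetAddress.getByName(e.getKey()).getHostAddress();
                // Reverse lookup: address back to a name, via the JVM's resolver.
                String reverse = InetAddress.getByName(e.getValue()).getCanonicalHostName();
                boolean ok = forward.equals(e.getValue()) && matches(e.getKey(), reverse);
                System.out.println((ok ? "OK       " : "MISMATCH ") + e.getKey()
                        + " -> " + forward + " -> " + reverse);
            } catch (UnknownHostException ex) {
                System.out.println("UNRESOLVED " + e.getKey() + ": " + ex.getMessage());
            }
        }
    }
}
```

A MISMATCH line where the reverse name is of the oc-10-252-... form would point to the same condition seen in the failing TGS-REQ.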

See also here and here.

answered Nov 15 '22 10:11 by mdarwin