Linux by Trial and Error

A repository of the things I learn about Linux

Kerberos Authentication on Load Balanced Web Servers

Today’s post is simply a follow-on from yesterday’s post about Kerberos authentication for a Desktop application and a web application. In yesterday’s post, I went over what we did to set up Kerberos authentication for our Dev and QA environments, each of which has a single web server and a single application server.

In our production environment, however, we have two web servers and two application servers. The web servers are handled by a load balancer which provides a virtual IP address (VIP) and determines which web server to send HTTP requests to. The application servers are clustered servers.

Rather than rehashing a lot of the same steps, I’ll point you to my previous post for creating the Service Principal user accounts in AD and setting up the krb5.conf, krb5_ccache and krb5.keytab files on the server. For the production environment, all of that remains largely the same except for a couple of things that I will detail in this post.

Before we get to that, however, we’ll get to how the challenge with this environment manifested and how we went about resolving it…

After having gone through setting up our Dev and QA environments, I moved on to set up the production environment using the same method. Service Principal accounts were created in AD and set up the same way as the Dev and QA SPs. The Kerberos config file, cache and keytab files were all created using exactly the same steps as on the other servers.

Also, the jaas.conf and tomcat5.conf files were set the same as the other servers. Everything was exactly the same…except for the names of the servers/Service Principals on each system.

Unfortunately, once tomcat was started, browsing to http://webapp.domain.com returned “Service Temporarily Unavailable.” If I commented out the following line in the tomcat5.conf file, the site loaded fine (although I then had to enter my credentials):

JAVA_OPTS="${JAVA_OPTS} -Xmx1024m -Djava.security.auth.login.config=/path/to/jaas.conf -Djava.security.krb5.conf=/path/to/krb5/files/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false"

The only difference between production and Dev/QA was the load balancer. So, off I went to the networking guy…

We found that our load balancer was doing a health check periodically and was using an HTTP call, then looking for specific text in the response. When the above line was uncommented, the load balancer was not authenticated and was therefore not receiving the expected string. Therefore, the load balancer was shutting down the VIP because the server was “unavailable” as far as the health check was concerned.
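
One way to see what the load balancer's HTTP probe was running into is to look at the raw response headers. Here is a minimal sketch, assuming curl is available and that the web application answers unauthenticated requests with a 401 and a WWW-Authenticate: Negotiate header (the usual SPNEGO challenge). The function name requires_negotiate is made up for this illustration.

```shell
# Reads HTTP response headers on stdin.
# Succeeds (exit 0) when the response is a 401 carrying a Negotiate challenge,
# which is what an unauthenticated health check would receive.
requires_negotiate() {
    awk '
        NR == 1 && $2 == "401" { unauthorized = 1 }
        tolower($0) ~ /^www-authenticate:[ \t]*negotiate/ { negotiate = 1 }
        END { exit (unauthorized && negotiate) ? 0 : 1 }
    '
}

# Example (needs network access to the server, so it is left commented out):
#   curl -s -I http://web-prod1.domain.com/ | requires_negotiate \
#       && echo "probe would be challenged for Kerberos credentials"
```

A probe that cannot answer that challenge never sees the page body, so a text match against the response will fail even though the server itself is healthy.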

After changing the health check to a TCP call rather than HTTP, we were able to get to the main page of the web server, though SSO was still not working correctly.

With much fuss and going back and forth about the fact that, still, the only difference seemed to be the load balancer, we decided to look at how our DNS structure was set up.

We had a HOST(A) entry for web-dev.domain.com as well as for web-qa.domain.com and web-prod1.domain.com and web-prod2.domain.com. Then, we had an Alias set up for web-dev.domain.com called webapp.dev.domain.com. We also had an Alias for web-qa.domain.com called webapp.qa.domain.com.

However, when I looked at the production set up, we had a HOST(A) entry for webapp.domain.com which pointed to the VIP.

Hmmm….

Here is what I did:

Following the steps from the previous post, I created a user in AD called HTTP/webapp-prod.domain.com and set up the Service Principal Name.

Next, I got on web-prod1.domain.com and ran:

# kinit HTTP/webapp-prod.domain.com

This added the ticket to the krb5_ccache file. Next, I ran:

# ktutil
ktutil: addent -password -p HTTP/webapp-prod.domain.com -k 2 -e rc4-hmac
Password for HTTP/webapp-prod.domain.com@DOMAIN.COM:
ktutil: addent -password -p HTTP/webapp-prod -k 2 -e rc4-hmac
Password for HTTP/webapp-prod@DOMAIN.COM:
ktutil: wkt /path/to/krb5/files/krb5.keytab
ktutil: quit
#

Then, I did the same thing on web-prod2.domain.com. Now, each production web server had four entries in the keytab file: two for the local server itself (one fully qualified and the other a short name) and two for HTTP/webapp-prod (again, one fully qualified and the other a short name).
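
A quick way to confirm that each server really has the expected principals is to compare klist -k output against a list. The sketch below is only an illustration: the helper name missing_principals is made up, and the parsing assumes the usual MIT klist -k layout in which the principal is the second column.

```shell
# Usage: klist -k /path/to/krb5/files/krb5.keytab | missing_principals PRINCIPAL...
# Prints every expected principal that does not appear in the listing on stdin
# and exits non-zero if any are missing. The body runs in a subshell so its
# variables do not leak into the caller.
missing_principals() (
    listing=$(cat)
    status=0
    for principal in "$@"; do
        # Match the principal exactly against the second column of the listing
        if ! printf '%s\n' "$listing" |
            awk -v p="$principal" '$2 == p { found = 1 } END { exit !found }'; then
            printf '%s\n' "$principal"
            status=1
        fi
    done
    exit "$status"
)
```

Running it with the four principal names on each web server should print nothing when the keytab is complete.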

In my DNS server, I removed the HOST(A) entry for webapp.domain.com and created a new DNS entry for webapp-prod.domain.com which pointed to the VIP, and an Alias for that entry called webapp.domain.com.
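
My understanding of why the alias matters is that Windows clients such as IE typically build the SPN they request from the canonical DNS name, not from the alias typed into the address bar. I did not verify that in detail, so treat the sketch below as an illustration of that assumption. The helper expected_spn is hypothetical.

```shell
# Usage: expected_spn SERVICE HOSTNAME [CNAME_TARGET]
# CNAME_TARGET is what `dig +short HOSTNAME CNAME` prints (empty for a plain A record).
# Prints the SPN a client would be expected to request under the assumption above.
expected_spn() (
    service=$1
    host=$2
    target=${3:-}
    if [ -n "$target" ]; then
        host=${target%.}    # dig prints fully qualified names with a trailing dot
    fi
    printf '%s/%s\n' "$service" "$host"
)

# Example (needs DNS access, so it is left commented out):
#   expected_spn HTTP webapp.domain.com "$(dig +short webapp.domain.com CNAME)"
```

With the alias in place, the name that comes out is HTTP/webapp-prod.domain.com, which matches the Service Principal created in AD and the entry added to the keytab.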

Voila!!

I cleared all my temp files and cookies from IE, relaunched the browser and went to http://webapp.domain.com and found myself already logged into the web application and looking at my Dashboard!

Perhaps there is a better way to accomplish what I was trying to do, but since I was not able to find anything that would resolve this issue, this was the best thing I came up with.

If you know of a better way to use Kerberos authentication on a load balanced VIP to a web server, please share your experience. However, if you are struggling with something similar, hopefully this helps!

June 13, 2012 | kerberos

Setting Up Kerberos Authentication for App and Web

I have recently been working a lot on getting Kerberos authentication working for one of our enterprise applications. The environment consists of a database backend, a server application piece and a web front end.

We have three environments for this application: Dev, QA and Production. Naturally, I wanted to tackle the Dev environment first.

Another piece to this puzzle is that there is a Desktop Application as well as the web interface. The Desktop app connects directly to the server application.

For the purposes of clarifying what servers are where, here are the servers I’m dealing with (the names have been changed to protect the innocent):

app-dev.domain.com
web-dev.domain.com
app-qa.domain.com
web-qa.domain.com
app-prod.domain.com
web-prod.domain.com

The database servers do not enter into this particular picture, so I didn’t bother to list them. Everything dealing with this issue has to do with these six servers.

The first thing I needed to do was to make sure the app-dev server could use Kerberos authentication. So, I created a “user” in Active Directory called “ENTAPP/app-dev.domain.com.” The Windows NT/2000 login name was “ENTAPP_app-dev.”

Next, I recorded the password for this user and then, on the domain controller, ran the following commands:

setspn -a ENTAPP/app-dev.domain.com ENTAPP_app-dev
setspn -a ENTAPP/app-dev ENTAPP_app-dev

This allowed me to use both the fully-qualified as well as the short name version of the server name. Once that was done, it was on to the Linux side…

From the Linux side of things, I had to set up some environment variables so that the system would know where to go for the various kerberos files I would be creating:

export KRB5_HOME=/path/to/krb5/files
export KRB5_CONFIG=/path/to/krb5/files/krb5.conf
export KRB5CCNAME=/path/to/krb5/files/krb5_cache
export KRB5_KTNAME=/path/to/krb5/files/krb5.keytab
export KRB5_PATH=/path/to/krb5/files/krb5.conf  # Not sure if this one is necessary

Once this was done, the next step was to create the krb5.conf file in the specified directory. Keep in mind that exporting these variables at the command line only sets them for the current shell session; they will not survive a logout or reboot.

Once I was in the specified directory, I just ran vim krb5.conf and set it up:

[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log

[domain_realm]
.domain.com = DOMAIN.COM
domain.com = DOMAIN.COM

[libdefaults]
default_realm = DOMAIN.COM
forwardable=true
default_keytab_name=FILE:/path/to/krb5/files/krb5.keytab
no_addresses=true
default_tkt_enctypes = rc4-hmac

# The port numbers below may be different in your environment
[realms]
DOMAIN.COM = {
admin_server = domain.com:769
default_domain = domain.com
kdc = domain.com:88
}

[appdefaults]

pam = {
debug = false
ticket_lifetime = 36000
renew_lifetime = 36000
forwardable = true
krb4_convert = false
}

The next step was to cache the kerberos tickets for the Service Principals created in Active Directory:

# kinit ENTAPP/app-dev.domain.com
Password for ENTAPP/app-dev.domain.com@DOMAIN.COM:

Next, we get the Key Version Number:

# kvno ENTAPP/app-dev.domain.com
ENTAPP/app-dev.domain.com@DOMAIN.COM: kvno = 2
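
If you want to script this, the version number can be pulled out of that output. This is a small sketch, assuming the "principal: kvno = N" format printed by the kvno command; parse_kvno is a name made up here.

```shell
# Reads `kvno` output on stdin and prints just the key version number(s),
# one per input line that matches the "principal: kvno = N" format.
parse_kvno() {
    sed -n 's/^.*: kvno = \([0-9][0-9]*\)$/\1/p'
}

# Example (needs a valid ticket, so it is left commented out):
#   kvno ENTAPP/app-dev.domain.com | parse_kvno
```

The number it prints is the value to pass to the -k option of addent in ktutil.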

Now that we know the Key Version Number, we can create our keytab file:

# ktutil
ktutil: addent -password -p ENTAPP/app-dev.domain.com -k 2 -e rc4-hmac
Password for ENTAPP/app-dev.domain.com@DOMAIN.COM:
ktutil: addent -password -p ENTAPP/app-dev -k 2 -e rc4-hmac
Password for ENTAPP/app-dev@DOMAIN.COM:
ktutil: wkt /path/to/krb5/files/krb5.keytab
ktutil: quit
#

With the keytab file and cache set up, we can now do a couple of things to test. First, you can list the entries in the keytab file:

# klist -ket
Keytab name: FILE:/path/to/krb5/files/krb5.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
2 06/11/12 10:49:03 ENTAPP/app-dev.domain.com@DOMAIN.COM (ArcFour with HMAC/md5)
2 06/11/12 10:49:03 ENTAPP/app-dev@DOMAIN.COM (ArcFour with HMAC/md5)

You can also verify that the keytab file successfully authenticates to Active Directory:

# kinit -k ENTAPP/app-dev.domain.com

If you do not get an error, the authentication worked. Yes, I know…I wish it would actually tell you it worked rather than it just not telling you it didn’t work. Take that up with the folks who created all this.
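
Since kinit stays silent on success, the exit status is the thing to check. Here is a minimal sketch, assuming MIT kinit (where -k requests a ticket using a keytab and -t names the keytab file). The KINIT variable and the verify_keytab function are made up here so that the command can be swapped out.

```shell
# Command used to request the ticket; override KINIT to point at another binary.
KINIT=${KINIT:-kinit}

# Usage: verify_keytab KEYTAB PRINCIPAL
# Prints an explicit success or failure message based on kinit's exit status.
verify_keytab() {
    if "$KINIT" -k -t "$1" "$2"; then
        echo "OK: $2 authenticated using $1"
    else
        echo "FAILED: $2 could not authenticate using $1" >&2
        return 1
    fi
}

# Example:
#   verify_keytab /path/to/krb5/files/krb5.keytab ENTAPP/app-dev.domain.com
```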

Now, our Kerberos stuff is all set up. In our case, we had to modify the startup script to include the environment variables listed above so that the application could find them because the application runs as a specific user. You could also include them in /etc/profile, but that seemed like overkill.
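
As an illustration, the startup script can set the variables in one place before launching the application. This is only a sketch: export_krb5_env is a made-up helper, the file names mirror the ones used earlier in this post, and it only sets KRB5_CONFIG, KRB5CCNAME and KRB5_KTNAME, which are the variable names the MIT Kerberos libraries read.

```shell
# Usage: export_krb5_env /path/to/krb5/files
# Exports the Kerberos-related variables for the current shell and its children.
export_krb5_env() {
    KRB5_CONFIG="$1/krb5.conf"      # location of the Kerberos configuration
    KRB5CCNAME="$1/krb5_cache"      # credential (ticket) cache
    KRB5_KTNAME="$1/krb5.keytab"    # default keytab
    export KRB5_CONFIG KRB5CCNAME KRB5_KTNAME
}

# Call this near the top of the startup script, before the application starts:
#   export_krb5_env /path/to/krb5/files
```

Keeping the exports in the startup script means only the application's own process tree sees them, which is the reason /etc/profile felt like overkill.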

After all this, we were able to set up our application to use kerberos authentication. I won’t get into that because applications will all be different in this regard.

The next thing we had to do was set up the web interface. The first part was pretty much the same thing. We created a user account in Active Directory (HTTP/web-dev.domain.com) and used the setspn.exe command to add Service Principal names.

The krb5.conf file on the web server was basically the same, as were the environment variables to find the krb5 files, though the path was slightly different.

Using the kinit and ktutil commands also worked the same as for the app server, obviously specifying the appropriate names for the web server.

Now, on the web server, we did have to set up a jaas.conf file in order to perform the kerberos authentication. This is what we found worked:

com.sun.security.jgss.initiate {
com.sun.security.auth.module.Krb5LoginModule required
principal="HTTP/web-dev.domain.com" useKeyTab=true
keyTab="/path/to/krb5/files/krb5.keytab"
doNotPrompt=true storeKey=true debug=true;
};

com.sun.security.jgss.accept {
com.sun.security.auth.module.Krb5LoginModule required
principal="HTTP/web-dev.domain.com" useKeyTab=true
keyTab="/path/to/krb5/files/krb5.keytab"
doNotPrompt=true storeKey=true debug=true;
};

Since we are using Tomcat5, we added the following line in the tomcat5.conf:

JAVA_OPTS="${JAVA_OPTS} -Xmx1024m -Djava.security.auth.login.config=/path/to/jaas.conf -Djava.security.krb5.conf=/path/to/krb5/files/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false"

Then we restarted tomcat:

# service tomcat5 restart
Stopping tomcat5:                                          [  OK  ]
Starting tomcat5:                                          [  OK  ]

In our application, there was another file that had to be modified in order to use SPNEGO filters. This was in a web.xml file and the code was already included, but was commented out. We uncommented it and that was it. Your application may or may not require that.

At this point, I was able to watch the log file:

# tail -f /path/to/catalina.out

What I was looking for was the following:

Debug is  true storeKey true useTicketCache false useKeyTab true doNotPrompt true ticketCache is null isInitiator true KeyTab is /path/to/krb5/files/krb5.keytab refreshKrb5Config is false principal is HTTP/web-dev.domain.com tryFirstPass is false useFirstPass is false storePass is false clearPass is false
principal’s key obtained from the keytab
Acquire TGT using AS Exchange
principal is HTTP/web-dev.domain.com@DOMAIN.COM
EncryptionKey: keyType=23 keyBytes (hex dump)=0000: B1 R4 51 C1 5F 24 92 30   AD CA 1B 21 B9 22 13 A5  ..@..exP..*q..h.

Added server’s keyKerberos Principal HTTP/web-dev.domain.com@DOMAIN.COMKey Version 2key EncryptionKey: keyType=23 keyBytes (hex dump)=
0000: B1 R4 51 C1 5F 24 92 30   AD CA 1B 21 B9 22 13 A5  ..@..exP..*q..h.

[Krb5LoginModule] added Krb5Principal  HTTP/web-dev.domain.com@DOMAIN.COM to Subject
Commit Succeeded
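
Rather than eyeballing the log, the same check can be scripted. This is a sketch, assuming the JAAS debug output ends with the Commit Succeeded line as it did here; jaas_login_ok is a hypothetical helper name.

```shell
# Reads log text on stdin; succeeds if the Krb5LoginModule reported a successful commit.
jaas_login_ok() {
    grep -q 'Commit Succeeded'
}

# Example:
#   jaas_login_ok < /path/to/catalina.out && echo "keytab login succeeded"
```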

With this verified, I was now able to use Single Sign-On (SSO) so that I no longer was required to enter my credentials to log into the web application.

I repeated this entire process in the QA and Production environments. Everything went nice and smoothly in QA, but we had some issues with Production because it was load balanced with two web servers.

Due to the length of this post, I’ll do a separate post to deal with the issues we had with production and how they were resolved.

Until then, hopefully this will help!


P.S. I wanted to also mention that we did actually have a bit of an issue when this was first set up as described above. With the test users that we had set up, SSO worked exactly as desired. However, when any of the actual end users attempted to sign in using SSO, it failed.

After much investigation, and lots of hits about Kerberos ticket sizes being too large and users being members of too many groups, we finally found the cause / solution…

After migrating our Active Directory domain about a year ago, we had run our domains in parallel for some time. The effect of this is that each user account that was migrated from the old domain to the new one had a SID history which included all the SIDs used in the old domain. This was also true for any groups that were migrated.

Put these two things together and you have users with SID history belonging to groups with SID history, some of which belonged to other groups with SID history. Effectively, the SID histories all combined to bloat the kerberos ticket size.

We went through and removed the SID history from all users and groups and…voila!!…all is well with the world!

June 11, 2012 | kerberos