To most people, “radius” is something to do with circles. For network folks, RADIUS refers to Remote Authentication Dial-In User Service. It is an authentication service for controlling user access to network services such as our enterprise wireless. We had significant problems with our wireless at NUS this week, so I thought it might be interesting to share a little about what goes on under the hood.
RADIUS is an AAA protocol and service. AAA stands for Authentication, Authorisation, and Accounting. That’s right, RADIUS is a whole lot more than simply authentication. Authentication is only concerned with identifying a user, typically by using a userid and password. Authorisation is the next step: determining exactly what privileges, or what kind of service, the user is entitled to use or access.
In the old days of dial-up modems, RADIUS enabled network operators to say that some users could dial up with text-based serial service only, while others could make use of SLIP/PPP. SLIP/PPP provided IP-level connectivity, what today you’d call the Internet. Or broadband. Oh, it wasn’t that fast, of course, back then. Quotas or other policies could also be enforced through the authorisation mechanism.
Accounting is the last bit in AAA. It is about recording what was used by whom, and how much was used. It enabled network operators to bill users. Those of you who’ve been through the age of dial-up Internet, and in particular travelled around the world with iPass roaming, would have seen the work of RADIUS showing up in your bills. (For those who don’t know, iPass isn’t some i-thing from Apple.)
The problem that has been plaguing wireless in NUS of late stems from a RADIUS authentication failure. The wireless controllers are going to RADIUS with an authentication request, and the RADIUS servers are randomly returning authentication failure responses. The result is that users are randomly denied access to the wireless network.
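For readers who have never looked at RADIUS on the wire, the request and response mentioned above are small packets with a fixed 20-byte header. Here is a minimal sketch of decoding that header, based on the packet layout in RFC 2865 and not on anything taken from our servers. The function name and the sample packet are made up for illustration.

```python
import struct

# Packet codes defined in RFC 2865 (access) and RFC 2866 (accounting).
RADIUS_CODES = {
    1: "Access-Request",
    2: "Access-Accept",
    3: "Access-Reject",
    4: "Accounting-Request",
    5: "Accounting-Response",
    11: "Access-Challenge",
}

def parse_radius_header(packet: bytes) -> dict:
    """Decode the fixed header: code, identifier, length, authenticator."""
    if len(packet) < 20:
        raise ValueError("a RADIUS packet is at least 20 bytes long")
    code, identifier, length, authenticator = struct.unpack("!BBH16s", packet[:20])
    return {
        "code": code,
        "name": RADIUS_CODES.get(code, "Unknown"),
        "identifier": identifier,
        "length": length,
        "authenticator": authenticator,
    }

# Hypothetical reply: an Access-Reject with no attributes.
sample = struct.pack("!BBH16s", 3, 42, 20, bytes(16))
print(parse_radius_header(sample)["name"])
```

An Access-Reject (code 3) coming back for a user with valid credentials is the kind of response that shows up to the user as being denied access.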
We, as in people like me, were aware of the difficulties quite early on. I use wireless all the time. My main work computer is a MacBook Pro. I may have an office, but I don’t use the wired network. (The wired network is there for my Windows desktop, which I only use because some enterprise apps simply require Windows to work.) We could quite clearly see the immediate cause of the problem, but getting the matter escalated to the right people, and the root cause remedied, is a whole different challenge altogether.
To cut a long story short, the authentication problems were outside of SoC, and they weren’t something that we could fix. We focused, instead, on developing a workaround. Instead of authenticating users against NUSNET accounts, we planned on authenticating against SoC accounts.
We are already running a cluster of RADIUS servers. In fact, all our wireless controllers are configured to use our RADIUS cluster. Authentication of NUSNET accounts is simply proxied to NUSNET authentication servers. In principle, switching to SoC accounts isn’t too much work.
There are, however, some challenges for SoC to run its own RADIUS for wireless authentication. The reason is this. Enterprise wireless like ours makes use of 802.1X PEAP with MS-CHAPv2 as the inner protocol for user authentication. The nature of MS-CHAPv2 requires that the plaintext password (or NT password hash) is known to the RADIUS server.
SoC accounts are stored in an LDAP server. That in itself is not a problem, because RADIUS servers can communicate with an LDAP backend for account information. The challenge, however, is that our passwords are stored hashed in a Unix format. MS-CHAPv2 cannot make use of these hashes. At best, MS-CHAPv2 can make do with the NT password hash, but because password hashes are one-way, we cannot produce an NT password hash from any other hash format. We need the plaintext password to compute it.
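To make the NT password hash less mysterious: it is simply the MD4 digest of the password encoded as UTF-16LE, with no salt. The sketch below computes it from a plaintext password. Because many current Python builds no longer expose MD4 through hashlib, it carries its own small MD4 written from RFC 1320. The function names are mine, and this is an illustration, not the code we run.

```python
import struct

MASK = 0xFFFFFFFF

def _rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    x &= MASK
    return ((x << n) | (x >> (32 - n))) & MASK

def _f(x, y, z):
    return (x & y) | (~x & z)

def _g(x, y, z):
    return (x & y) | (x & z) | (y & z)

def _h(x, y, z):
    return x ^ y ^ z

# (round function, additive constant, message word order, shift amounts)
_ROUNDS = (
    (_f, 0x00000000, list(range(16)), (3, 7, 11, 19)),
    (_g, 0x5A827999, [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15], (3, 5, 9, 13)),
    (_h, 0x6ED9EBA1, [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15], (3, 9, 11, 15)),
)

def md4(data: bytes) -> bytes:
    """Pure-Python MD4 following RFC 1320."""
    # Pad with 0x80, then zeros up to 56 mod 64, then the bit length.
    msg = data + b"\x80"
    msg += b"\x00" * ((56 - len(msg)) % 64)
    msg += struct.pack("<Q", (len(data) * 8) & 0xFFFFFFFFFFFFFFFF)
    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    for offset in range(0, len(msg), 64):
        x = struct.unpack("<16I", msg[offset:offset + 64])
        a, b, c, d = state
        for func, const, order, shifts in _ROUNDS:
            for i, k in enumerate(order):
                new = _rotl(a + func(b, c, d) + x[k] + const, shifts[i % 4])
                # Rotate the registers so the next step updates the right one.
                a, b, c, d = d, new, b, c
        state = [(s + v) & MASK for s, v in zip(state, (a, b, c, d))]
    return struct.pack("<4I", *state)

def nt_hash(password: str) -> bytes:
    """NT password hash: MD4 over the UTF-16LE encoding of the password."""
    return md4(password.encode("utf-16-le"))

print(nt_hash("password").hex())
```

Note that the input is the plaintext password. A salted Unix-style hash of the same password gives nothing that could be fed into this function, which is why we had to collect passwords afresh.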
The only solution is for us to get new passwords. One option is to have users “change” their passwords, and during the process generate two hashes: one in Unix style for our other regular LDAP authentications, and another in NT hash format for RADIUS MS-CHAPv2 purposes. The other is to have users create new passwords solely for RADIUS MS-CHAPv2 authentication. There are quite a few issues to be considered, but I’ll just cut to the conclusion that we decided to go with a new, separate password to be used solely for wireless authentication.
We also decided to provision a new wireless network SSID. Since the accounts used for authentication have changed, reusing the same SSID would be awkward: the SSID inside SoC would expect one account, while the exact same SSID outside SoC would expect another. At best, users would have to keep re-entering their account credentials whenever they move back and forth between different areas.
We learnt a few things, unexpectedly, during this exercise. Many of them are unrelated to the issue of authentication, but there’s one interesting bit that is. With 802.1X PEAP, the RADIUS server needs to present a TLS (i.e. SSL) certificate. We originally used a wildcard certificate. Android, iOS and Mac OS X clients had no trouble authenticating in this new network. We learnt, however, that Windows clients could not connect. It turns out that in Windows, wildcard certificates are not allowed for EAP authentication. Windows will allow wildcard certificates in general, just not for EAP authentication purposes. Weird rule. It’s not documented. The issue only gets mentioned in support forums. Now, for those of you reading this because your FreeRADIUS log says:
eap_peap: SSL says: TLS Alert read:fatal:access denied
That error basically means the client refuses to talk to you. If you used a wildcard certificate, and the client is Windows, well, now you know: go get a non-wildcard certificate. Oh, and your certificate needs to have the “Server Authentication (1.3.6.1.5.5.7.3.1)” OID under the Extended Key Usage certificate extension, though this is usually not a problem.
Our backup network was ready to go.
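If you want to sanity-check a certificate before deploying it, the two conditions above are easy to express in code. The sketch below assumes you have already pulled the DNS names and the Extended Key Usage OIDs out of the certificate with whatever X.509 tool you prefer; that extraction step is not shown. The function name, its inputs, and the example hostnames are hypothetical.

```python
SERVER_AUTH_OID = "1.3.6.1.5.5.7.3.1"  # Extended Key Usage: Server Authentication

def eap_cert_problems(dns_names, eku_oids):
    """Return a list of reasons a certificate may be rejected for EAP."""
    problems = []
    for name in dns_names:
        # Windows clients refused our wildcard certificate for EAP.
        if "*" in name:
            problems.append("wildcard name: " + name)
    if SERVER_AUTH_OID not in eku_oids:
        problems.append("missing Server Authentication EKU")
    return problems

# Hypothetical certificate contents, for illustration only.
print(eap_cert_problems(["*.example.edu"], [SERVER_AUTH_OID]))
print(eap_cert_problems(["radius.example.edu"], [SERVER_AUTH_OID]))
```

An empty list only means these two particular checks passed. It says nothing about expiry, the trust chain, or anything else a client might object to.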
Just for the interest of people who want to know more about enterprise wireless authentication, here’s a bit more information so you understand how it’s different from something like WPA2 PSK you’d use in your home wireless router.
The first thing, of course, is that you authenticate with an account, not a pre-shared secret. The wireless access point or wireless controller merely passes the authentication exchange through from the user’s device to the RADIUS server. Thus, you are authenticating with the RADIUS server. The wireless network doesn’t get to see your password.
An important benefit is that the wireless encryption keys are unique to each client device’s session with the wireless network. The RADIUS server and the client device each generate the per-session encryption keys internally. These are never transmitted over the air. The RADIUS server sends the wireless controller a copy of the session key, encrypted with their RADIUS shared secret. As mentioned, each client device has a different key, so even if one key is discovered or compromised, it cannot be used to decrypt other clients’ traffic. Furthermore, session keys are typically regenerated every 60 minutes.
That’s how enterprise networks are better. They do have their own practical security issues though. One of them has to do with rogue networks, and how users can identify them and distinguish the real one from the fakes. In theory, the problem is a non-issue; but in practice, users don’t know how to tell them apart. This problem is sort of worse than users who simply blindly accept broken SSL certificates when they visit HTTPS websites.
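For the curious, here is roughly how a session key can be protected on its way from the RADIUS server to the wireless controller. This is a sketch of the salted MD5 scheme that RFC 2548 describes for the MS-MPPE-Send-Key and MS-MPPE-Recv-Key attributes, which I am assuming is what carries the key in a PEAP setup like ours. The function names and the example secret are made up.

```python
import hashlib
import os

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_mppe_key(key, secret, request_authenticator, salt=None):
    """Protect a session key with the RADIUS shared secret (RFC 2548 style)."""
    if salt is None:
        raw = os.urandom(2)
        salt = bytes([raw[0] | 0x80, raw[1]])  # top bit of the salt must be set
    # Plaintext is a length byte, the key, then zero padding to 16-byte blocks.
    plain = bytes([len(key)]) + key
    plain += b"\x00" * (-len(plain) % 16)
    out = b""
    prev = request_authenticator + salt
    for i in range(0, len(plain), 16):
        mask = hashlib.md5(secret + prev).digest()
        block = _xor(plain[i:i + 16], mask)
        out += block
        prev = block  # each block's mask depends on the previous ciphertext
    return salt + out

def decrypt_mppe_key(blob, secret, request_authenticator):
    """Reverse encrypt_mppe_key, as the wireless controller would."""
    salt, cipher = blob[:2], blob[2:]
    prev = request_authenticator + salt
    plain = b""
    for i in range(0, len(cipher), 16):
        block = cipher[i:i + 16]
        plain += _xor(block, hashlib.md5(secret + prev).digest())
        prev = block
    return plain[1:1 + plain[0]]

# Hypothetical values, for illustration only.
secret = b"example-shared-secret"
authenticator = os.urandom(16)  # taken from the Access-Request in real life
session_key = os.urandom(32)
blob = encrypt_mppe_key(session_key, secret, authenticator)
assert decrypt_mppe_key(blob, secret, authenticator) == session_key
```

The protection is only as good as the shared secret between the controller and the RADIUS server, which is one more reason to choose a long random one.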
Given the expectation of always-on Internet access, our quick fix seems a great interim solution to restore normalcy to our wireless services while someone else sorts out the root cause of the original wireless problem. It’s fortunate for us that we already had all the infrastructure pieces on hand to push out the solution.