I've got freeradius 2.1.12 running under Ubuntu 14.04, authenticating against a Windows 2012 Active Directory controller. While working fine most of the time, it may happen after a few days of running that suddenly valid users fail to authenticate. The log may then show things like
Login incorrect (mschap: External script says ): [john.doe] (from client myclient port 0 via TLS tunnel)
Login incorrect: [john.doe] (from client myclient port 1 cli 20-E4-A2-67-8D-4A)
Note that a "really" wrong login would have some explanation like (mschap: External script says Logon failure (0xc000006d))
in it.
Restarting the freeradius service helps, but why does this happen in the first place?
EDIT: Upon closer inspection of common events, it appears that the failure begins to occur (if at all) in the morning right with the first login attempt after a daily SIGHUP:
08:04:33 2016 : Info: HUP - loading modules
08:04:33 2016 : Info: Module: Reloaded module "attr_filter.access_reject"
08:04:33 2016 : Info: Module: Reloaded module "attr_filter.accounting_response"
08:04:33 2016 : Info: Module: Reloaded module "pap"
08:04:34 2016 : Info: Module: Reloaded module "radutmp"
08:04:34 2016 : Info: Module: Reloaded module "suffix"
08:04:34 2016 : Info: Module: Reloaded module "files"
08:04:34 2016 : Info: Module: Reloaded module "detail"
08:04:34 2016 : Info: Module: Reloaded module "mschap"
08:04:34 2016 : Info: Loaded virtual server <default>
08:04:34 2016 : Info: Loaded virtual server inner-tunnel
10:39:00 2016 : Error: [ldap] ldap_search() failed: LDAP connection lost.
10:39:00 2016 : Info: [ldap] Attempting reconnect
10:39:01 2016 : Auth: Login incorrect (mschap: External script says ): [someusername] (from client astaro port 0 via TLS tunnel)
10:39:01 2016 : Auth: Login incorrect: [someusername] (from client astaro port 1 cli 20-E4-B8-78-8E-32)
However, the ldap error appears to be "normal" for the daily HUP routine and it may just as well happen that the next log entry after "Attempting reconnect" is a "Login OK". Also, I don't see a hint that the reconnection attempt might have failed (I would have expected at least a warning in that case). I do not see anything special at the AD server.
I'm still unsure about the underlying reason, but since a service restart has so far always helped, I edited /etc/logrotate.d/freeradius
and changed the postrotate
action from reload
to restart
.
To verify the positive effect faster, I will also change weekly
to daily
for a while.
User contributions licensed under CC BY-SA 3.0