I was setting up a couple of stacks of Dell N3048 switches recently and ran into a strange bug when getting authentication working with RADIUS. Both stacks of switches were running version 22.214.171.124 (latest as of writing).
The general setup of the two stacks:
- Both stacks connected to each other
- About 10 VLANs set up. Each stack is a member of 4 VLANs unique to that stack, and both stacks are members of 2 shared VLANs. OSPF is set up between them so that they can reach all networks.
- Windows Network Policy Server used as the RADIUS server, with an IP of 192.168.30.240. It is connected to VLAN100, and only one of the stacks has an IP in this range.
First, I confirm that both stacks can ping the RADIUS server, and both succeed. I then enable RADIUS auth on both stacks:
aaa authentication login "networkList" local radius
radius-server key "mykey"
radius-server host auth 192.168.30.240
Once done, I telnet to one of the stacks and try to authenticate with my RADIUS login. It appears to time out after I enter my username: there is a long delay before it asks for the password. I switch back to my existing session, check the log, and see that the requests to the RADIUS server are timing out (confirmed by the RADIUS server statistics). Strange, since the stack should be able to reach it. Looking at the logs a bit closer, I see that my OSPF session dropped and reestablished at the same time I attempted to authenticate.
I then try pinging the RADIUS server again, but it is no longer responding. Checking the log again, I see the following:
Apr 22 10:56:41 level14-stack-1 General[procLOG]: ping_debug.c(627) 213477 %% [VRF-ID:0] Cannot allocate entry - duplicate name and index
Apr 22 10:56:40 level14-stack-1 General[procLOG]: ping_debug.c(627) 213476 %% [VRF-ID:0] Cannot allocate entry - duplicate name and index
Apr 22 10:56:39 level14-stack-1 General[procLOG]: ping_debug.c(627) 213475 %% [VRF-ID:0] Cannot allocate entry - duplicate name and index
I don’t have any VRFs set up, so this is the default VRF.
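If the stack logs are forwarded to a syslog host, a short script can show how often this error occurs and for which VRF. The following is a minimal Python sketch, assuming the messages keep exactly the layout of the three lines quoted above; the regular expression, the field names, and the helper functions `parse_log` and `count_by_vrf_and_message` are my own for illustration and are not part of any Dell tooling.

```python
import re
from collections import Counter

# Pattern derived only from the three log lines quoted in this post.
# Other firmware versions or syslog relays may format lines differently.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<component>\S+): "
    r"(?P<source>\S+) "
    r"(?P<seq>\d+) %% "
    r"\[VRF-ID:(?P<vrf>\d+)\] "
    r"(?P<message>.+)$"
)


def parse_log(text):
    """Return a list of dicts, one per line that matches LOG_LINE."""
    entries = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def count_by_vrf_and_message(entries):
    """Count how often each (VRF ID, message) pair appears."""
    return Counter((e["vrf"], e["message"]) for e in entries)


SAMPLE = """\
Apr 22 10:56:41 level14-stack-1 General[procLOG]: ping_debug.c(627) 213477 %% [VRF-ID:0] Cannot allocate entry - duplicate name and index
Apr 22 10:56:40 level14-stack-1 General[procLOG]: ping_debug.c(627) 213476 %% [VRF-ID:0] Cannot allocate entry - duplicate name and index
Apr 22 10:56:39 level14-stack-1 General[procLOG]: ping_debug.c(627) 213475 %% [VRF-ID:0] Cannot allocate entry - duplicate name and index
"""

if __name__ == "__main__":
    for (vrf, message), count in count_by_vrf_and_message(parse_log(SAMPLE)).items():
        print(f"VRF {vrf}: {count} x {message}")
```

Lines that do not match the pattern are skipped silently, so the counts are only a lower bound if the log format varies.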
I try pinging other hosts but get no response at all. Checking the logs, I see the same error logged again. I decide to reboot the current stack master so that the backup unit takes over, then verify that ping works again, which it does. I try to authenticate with RADIUS again and run into the same problem: I can no longer ping any hosts from the stack (pinging TO the stack still works fine). I reload the master again so that the stack is back in a working state and do some more troubleshooting.
My next step is to run 100 ping requests to the RADIUS server. While that is running, I try to authenticate again with my RADIUS login and, surprise, it works. I try again after the ping has completed and run into the bug once more.
I called Dell and opened a support case with the above information. I got the following response a few days later:
Our senior technician has confirmed the issue you reported exists on FW version 126.96.36.199 and was not present on FW 188.8.131.52. Engineering will work on a fix for the next firmware release. At the moment there are limited workarounds:
- Downgrade to FW 184.108.40.206 where the issue is not present
- Keep using FW 220.127.116.11 with local authentication
So if you are having problems with RADIUS on the N3000 series (I would presume it also applies to the N2000 series switches) on 18.104.22.168, it is a bug and most likely not your configuration.