UniFi switch and FortiGate firewall - can't ping past switch for some users

timeshifter

Got a weird intermittent network issue.

Background: small network with 11 PCs, one Windows Server. FortiGate 60E firewall. Ubiquiti 24 port switch.

Problem: it affects only some users, and not always the same ones. Since I've been working on it, the two users currently having the issue are ones who turn their PCs off at night or let them sleep. But it's not only those two; a different user had the issue today.

So the "bad" PC will have a local IP address. I can ping anything on the local network - the server, a printer, another PC, etc. Cannot ping the gateway or anything past that. Today I had to get one bad PC out of 11 back online. Here's what I tried:

Switching user to different port on switch
Boot PE disk and try connection, behaviour was the same, had same IP address as live Windows
Boot PE disk and manually configure to an available local IP address, no help
Boot back to Windows, no improvement
Install USB WiFi adapter and connect PC that way and all was fine (not a permanent fix of course)

Yesterday I did resets of all network gear, etc. including the switch.

Additional background: this customer just got a new cloud based LOB app that prefers a site to site VPN. Up until a few weeks ago there were no issues. To connect to their app each user had to run Cisco AnyConnect. We changed over to having the firewall handle that with a site to site configuration, as there will be new barcode scanners that require that (and I figured it would be cleaner - ha!).

Old configuration: FortiGate LAN1 (port 1) had a subnet of 192.168.111.0 and no site to site functionality. FortiGate LAN2 (port 2) had subnet 192.168.223.0 and was configured with all the site to site stuff. To the best of my knowledge they were otherwise identical.

To switch over to the new LAN2 I simply (mostly) just moved the switch's connection to the firewall from LAN1 to LAN2, and I changed all the devices and settings to the new subnet. I'm somewhat confident this is when the problems started, but I think they were more subtle at first.

I'm working on looking at the firewall console, but for some effed up reason I can't find the password, working on a recovery at the moment. Once I have that it might be all fixed in 5 minutes.

But maybe it's a switch issue.

I don't know, my head is spinning and I've typed enough.

Any thoughts are appreciated!
 
Definitely sounds like a firewall issue. You’ll have to look at the rules and make sure all denies are logged. Then check your log and voila!
 
Is the windows server a DC? Ignoring ping for a moment, can you get to the shared drives? Shared printers? Also, it's possible to shut off response to pings on the firewall, so that symptom could just be a red herring. Look at the firewall first, of course.
 
That you did WinPE with a manually configured IP, different from the prior one, leads me to believe that the problem might be tied to the MAC address. Are there MAC rules?
 
Agree on it being tied to MAC address, as I was able to get yesterday's user online by adding a wireless adapter to his PC and all was good. (Except he was now wireless)

(was typing this as you responded Mark) ...

This morning I got a call. A different user can't get online. Same issue. He can ping all the local things on network but not the gateway (FortiGate device) and can't ping out to the Internet.

Fortinet tech support helped. We tried to ping back to that PC from the FortiGate console. The first attempt failed. The second attempt worked. The PC was online now, I was able to remote in, and it all looks good for him.

Fortinet tech support believes I have a switch problem, and I agree. Two days ago that's what I was leaning toward but wasn't sure. Yesterday I was pursuing the firewall angle.

So far I've only tried restarting the switch. I don't have a comparable switch to swap out and am going to try some more troubleshooting with the switch in place. But I'm not sure what to try exactly.
 
Any vlans going on?
I'm wondering if the MAC table entry for a device got a little wonky. You can clear that out by "forgetting" a device in the web UI of the UniFi controller, which clears that device/client's entry in the database. You can also just clear the MAC table via SSH, but I'd do the "forget" first, as it clears a bit more of the device/client's data in the database. Then run the MAC clear via SSH; may as well do "all". Just make sure that when you "forget" the device/client, it's connected via the ETH port that's giving the problems, not the WLAN NIC.

Sometimes changing gateway addresses can throw a curve ball into things on the network, mac table/arp table related.
 
Two days ago: Fortinet support suggested the problem was with our switch. We found that while the workstation could not ping the gateway, the gateway could ping the workstation. When the gateway pinged the workstation then the problem would resolve immediately.

(edit: two days ago I also tried removing the switch from the controller and adding it back. It seemed to work, mostly. For one of the PCs I had to change which switch port it was plugged in to. Low confidence.)

(edit: yesterday I noticed one PC was not online but didn't have time to fully troubleshoot. When I got home I noticed everything was down. They had had a power outage; all PCs were off, but power was back on. When I went in and discovered that, I decided to go ahead and replace the switch.)

Yesterday I replaced the switch. Had the same problem on most of the PCs as I brought them back online. Out of 11 PCs I had to do the "ping from gateway to PC" trick to bring them online. Fortinet support updated the firmware from 6.2.3 to 6.2.5. Everything was working fine after that update.

Today one of the PCs had the issue again. "Ping from gateway to PC" brought it back.
 
What type of switch did you use?
 
Crappy

:D

Actually used three switches. Two 8 port cheap switches, AirLink I think was the brand. One small decent 6 port PoE switch, don't know brand.

Yes, I would have preferred to use a single decent quality switch, but had to go with what I had available at 10:00 PM on a Sunday night.
 
Must be a new brand. LOL!!!

But that's what I would have done. Use an unmanaged switch to keep things simple.
 
I think this mystery may have been solved. Today (Sunday, they're closed) I noticed one PC was offline. Not knowing why (powered off, or experiencing the glitch), I came to the office. It was powered off, so I turned it on. It was experiencing the same issue: it could ping local things but not the gateway or anything further out.

Ran arp -a on the PC. The MAC for the gateway piqued my interest. I ran arp -a on the PC next to it. It had a different MAC. Turns out that the MAC on the "bad" PC's ARP table was the MAC address of a printer.

I decided to see what would happen if I removed the printer. Before I did anything I did an ipconfig /release and then ipconfig /renew and checked. No change. Still had wrong MAC, etc.

I unplugged the power to the printer. Walked back to the PC and did an arp -a. MAC is CORRECT! Everything on this PC now works.

Do you think this is the end of this issue? (the printer will not be returning to their office)
How do you explain the behavior?
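
For anyone chasing the same symptom, the arp -a comparison described above could be scripted roughly like this. It is only a sketch: it assumes the Windows `arp -a` column layout, the MAC addresses are made up, and the gateway being at 192.168.223.1 is my assumption (the thread only gives the subnet and the printer's 192.168.223.2). `parse_arp` and `ips_sharing_mac` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Matches rows like "  192.168.223.1   aa-bb-cc-00-00-01   dynamic"
ARP_LINE = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"([0-9A-Fa-f]{2}(?:[-:][0-9A-Fa-f]{2}){5})\s+(\w+)"
)

def parse_arp(output):
    """Turn pasted `arp -a` text into {ip: mac}, MACs as lowercase aa:bb:..."""
    table = {}
    for line in output.splitlines():
        m = ARP_LINE.match(line)
        if m:
            ip, mac, _type = m.groups()
            table[ip] = mac.lower().replace("-", ":")
    return table

def ips_sharing_mac(table):
    """Return {mac: [ips]} for any MAC claimed by more than one IP."""
    by_mac = defaultdict(list)
    for ip, mac in table.items():
        if mac != "ff:ff:ff:ff:ff:ff":  # broadcast entries always share a MAC
            by_mac[mac].append(ip)
    return {mac: sorted(ips) for mac, ips in by_mac.items() if len(ips) > 1}

GATEWAY = "192.168.223.1"  # assumed gateway address, not stated in the thread

# Made-up sample output from a working PC and a "bad" PC
GOOD_PC = """
Interface: 192.168.223.14 --- 0x5
  Internet Address      Physical Address      Type
  192.168.223.1         aa-bb-cc-00-00-01     dynamic
  192.168.223.2         aa-bb-cc-00-00-02     dynamic
  192.168.223.255       ff-ff-ff-ff-ff-ff     static
"""

BAD_PC = """
Interface: 192.168.223.15 --- 0x5
  Internet Address      Physical Address      Type
  192.168.223.1         aa-bb-cc-00-00-02     dynamic
  192.168.223.2         aa-bb-cc-00-00-02     dynamic
  192.168.223.255       ff-ff-ff-ff-ff-ff     static
"""

good, bad = parse_arp(GOOD_PC), parse_arp(BAD_PC)
if good[GATEWAY] != bad[GATEWAY]:
    print("Gateway MAC differs:", good[GATEWAY], "vs", bad[GATEWAY])
print("IPs sharing a MAC on the bad PC:", ips_sharing_mac(bad))
```

Paste in the real arp -a output from a good and a bad PC. If the gateway's IP shares a MAC with some other device's IP on the bad PC, that other device is the one to unplug first.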
 
That's a weird one. Make and model of the printer? Does it have its own built-in print server? Were there any built-in services running?
 
HP Envy 5055

Don't know the answers to your other questions. It's a cheap little printer. I did have problems with it when I switched the network over. I couldn't ping it. I think my Mac could discover it but could never ping it. I also noticed that it must have been the first device to connect to the new network, as its IP was 192.168.223.2.
 

Say no more, haha. Cheap little consumer printer - has no business being in an office environment. "Fast & easy wifi setup" is one of its "features", so probably some variant of wifi-direct is always running. I would withhold comparing the price of that thing to the bill for your time! :)
 
Not even worth looking at given the model.
You mean further troubleshooting to get the printer working properly? If yes, I agree. I immediately began fantasies of going all Office Space on that $&@! PC Load Letter - WTF!

And I’m curious, just because the printer was caught holding the bag doesn’t necessarily prove that it’s his fault.

A friend of mine believes it was acting as a DHCP server.

I brought the printer home and may set it up in a lab and see if I can reproduce the issue.

Also curious why it didn’t seem to do that BEFORE I switched the network over.
 
That's why I was asking about print server and other services. That symptom, wrong gateway, happens when there's something alien leaking and/or directing traffic. It's rare but I've seen it before.
 
Maybe it’s the hated WiFi direct feature on many HP printers.
That feature does create its own WiFi LAN, but it's usually a 172.-based private subnet. So it's possible it may have broadcast its LAN IP as a gateway, which may be how it got picked up. It would be interesting to know whether each "problem" event coincided with a print job just before it. Unfortunately I never use that feature, so I have little experience with it.
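
If someone wants to test that theory, one rough way is to take the gateway or DHCP server address a "bad" PC reports and check whether it even belongs to the office LAN. A minimal sketch, assuming the LAN is 192.168.223.0/24 (the thread gives the subnet but not the mask); `describe` is a hypothetical helper and the sample addresses are only illustrations.

```python
import ipaddress

# Office LAN from the thread; the /24 mask is an assumption
OFFICE_LAN = ipaddress.ip_network("192.168.223.0/24")

def describe(addr):
    """Say whether an address is on the office LAN, some other private net, or public."""
    ip = ipaddress.ip_address(addr)
    if ip in OFFICE_LAN:
        return "on office LAN"
    if ip.is_private:
        return "private, but not on office LAN"
    return "public"

# Illustrative addresses only: the printer's LAN IP from the thread,
# a made-up 172.x address, and an address on the old LAN1 subnet
for addr in ("192.168.223.2", "172.16.0.1", "192.168.111.1"):
    print(addr, "->", describe(addr))
```

A gateway or DHCP server address that comes back as "private, but not on office LAN" would point at something handing out its own network, while one that is on the LAN but not the FortiGate would fit the wrong-MAC symptom instead.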
 