HCHTech
With my business acquisition last year, I inherited kind of an unusual client. They're housed in a university-owned "research campus", one of many companies in a gated cluster of about a dozen buildings that remind me of 1960s school buildings. Lots of concrete and tile, hot-water heat, 8-foot-wide stairs between floors, and all the buildings connected by underground tunnels. It's about a three-block walk from where you park and get your visitor's pass to my client's building, so you try to bring everything you might need on the first trip. But I digress.
Whoever wired the building thought that one [unlabeled, I might add] Cat5 drop per room was sufficient, so they have a ton of little 5-port switches scattered around the building to get more equipment into rooms with only a single network drop, plus lots of long cords running around the perimeters of the rooms. Rewiring has been "on next year's budget - for sure" for years. My client has a dozen offices on the 3rd floor, 6 on the 2nd floor and 1 on the first floor. Almost all the wiring is surface-mount: lots of metal Wiremold that looks older than I am. Kind of a disaster.
To make things worse, they were purchased by a California company (we're in Pennsylvania) about 3 years ago, so most of their IT is handled remotely by the CA company. We function primarily as boots-on-the-ground help when onsite work is required. It's one of those clients I wouldn't miss much if they went away. It's always like trying to work with one hand tied behind your back.
To add to the fun, they have lousy power - lots of surges & brownouts. They have big diesel generators as well. We sell a lot of UPSes.
So, they had some kind of power event last Friday evening, and we get a call that they can't get to the server or the internet from any workstation on Monday morning. I talk the victim through rebooting the server, firewall & switch - no help, so I'm on my way.
I get there and everything appears to be running, but just like they said, no one could browse the server shares, and no workstation I checked had internet access. The server couldn't bring up a web page, and while there appeared to be an internet connection, DNS wasn't working right and any site I could ping showed about 60% packet loss.
We don't have access to the firewall settings unless California deems it necessary, and at 9am, they are still sleeping. But it's a new FortiGate, installed in January after the last surge fried their SonicWall. I configure a laptop and connect it directly to their WAN feed. It's a little wonky, but after a call to the ISP, they reset their modem remotely and it starts working fine. Another hindrance: we don't have access to the ISP modem. It's in a locked wiring room somewhere in the bowels of the building, since the ISP provides internet to the whole campus - all we get is a network wire coming down from the ceiling with their name on it. It's DSL - I think, anyway - haha.
So next, I power-cycle the firewall, reconnect the WAN feed, configure a laptop with the same IP as the server, and connect it to the LAN port of the firewall. All good there. So the problem is the server itself, or maybe the switch. Unfortunately, I missed my opportunity here to unplug everything from the switch and leave just the server & the firewall connected for a test. Had I done that, my day would have been a lot shorter.

No, I decide to reset the networking on the server and do a reboot. During the reboot, I take the opportunity to go into the RAID configuration and check the arrays (a RAID1 for the OS and a 4-disk RAID10 for the data drive). All drives are reporting OK and both arrays show optimal status. I reboot from there and the server takes a llllloooooonnnnnnnggggg time to boot up, probably 45 minutes or so. It's a four-year-old HP with middling horsepower. Yikes.

Finally, it comes back up and I go through the logs looking for a cause of the long boot. Nothing apparent other than lots of DNS errors. Still not taking the hint, I spend a measurable amount of time going through DNS Manager; it looks like it's configured correctly. The server has dual NICs, so I disable the current one, enable the secondary & get it configured - no improvement in symptoms.
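Incidentally, the DNS check I was doing by hand (trying lookups and watching them fail) is easy to script when you want to re-test after each change. A minimal sketch using only Python's standard library - the hostnames here are just examples, not the client's actual names:

```python
import socket

def can_resolve(hostname):
    """Return True if the local resolver can resolve hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False

# Quick sanity check: localhost should always resolve (hosts file),
# while a name under the reserved .invalid TLD should always fail.
print("localhost:", can_resolve("localhost"))
print("bogus:", can_resolve("no-such-host.invalid"))
```

Run it once per tweak instead of flushing caches and retyping nslookup commands.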
Finally, I decide to swap out the switch with a spare I had brought. I get the new switch in place on top of the existing one and plug in just the firewall and server. Aha - the server can suddenly browse the internet, and 100% of the pings I try go through. So finally, I do what I should have done at the start of the darned day: I unplug everybody from the existing switch, plug in just the firewall and server, and it continues to work. It isn't a managed switch, but there are only about 28 connections, so I set up a continuous ping on the server and start plugging in the wires one at a time, waiting a few seconds between each. #23 is the one - as soon as I plug it into the switch, pings start dropping. I leave it out and plug in the rest of the wires - all good to go. So...what's on the other end of line 23? No wall jacks are labeled, so I have no idea.
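For anyone doing the same plug-one-cable-at-a-time bisect: it boils down to watching a continuous ping for a loss spike. Eyeballing the ping window works, but here's a rough sketch of scripting the rolling-loss bookkeeping instead. The 8.8.8.8 target in the commented loop is just a placeholder, and the exit-code check is only a rough proxy for "got a reply":

```python
import subprocess
import sys
from collections import deque

def ping_once(host):
    """One ping; True if the command reports success.
    Count flag differs by OS: -n on Windows, -c elsewhere."""
    count_flag = "-n" if sys.platform.startswith("win") else "-c"
    return subprocess.run(
        ["ping", count_flag, "1", host],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    ).returncode == 0

def loss_percent(window):
    """Packet loss over a window of True/False ping results."""
    if not window:
        return 0.0
    return 100.0 * window.count(False) / len(window)

# Keep only the last 20 results, so a loss spike right after you plug
# a cable in points straight at that cable.
recent = deque(maxlen=20)
# while True:
#     recent.append(ping_once("8.8.8.8"))
#     print(f"loss over last {len(recent)}: {loss_percent(recent):.0f}%")
```

With an alert threshold on the loss figure, you could even walk away from the rack while plugging ports.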
By this time, all but two employees have found better uses for their time than waiting for me to finish and have left the building. Another five or six are away on a client visit (with locked offices anyway) and won't be back for a couple of days. I start looking through offices to begin a count, but I can see this is not a job I can finish today.
I decide to leave the line unplugged and tell the receptionist to call me when whoever-it-is reports that they don't have internet. I'll go back onsite tomorrow or the next day, whenever the person or persons identify themselves, replace a NIC or a stupid little switch, and be done.
Frustrating and not my best work.