Dell PE 6950 | Black Screen | Random Intervals

jp425

Hey everyone. About to pull my hair out on this one and need some serious help.

Have a Dell PE 6950 (quad Opteron 8360s, 32GB DDR2 ECC RAM, 10TB in RAID 0, redundant PSUs). The system randomly goes to a black screen during OS install. The fans continue to run, but I obviously lose the screen, and the remote console through DRAC also cuts out. I am still able to remotely reboot from within DRAC, so something is still working.

I have tried numerous operating systems (Server 2008 R2, Server 2012 R2, CentOS, Ubuntu, ESXi, Proxmox) just in an attempt to get anything to install. All have the same black screen issue. DRAC shows no errors in the logs and no thermal warnings. Ran Memtest on it today for 6 hours with no issues. Immediately following that I was able to get about 95% through the Ubuntu install before it went to a black screen. Since then I get to the point where Ubuntu asks me my keyboard localization before it goes black (if I'm lucky).

So far I have tried:

Reseating and re-pasting the CPUs
Memtest
Dell diagnostics, which pass everything
Multiple OSes
Both inside the rack and outside the rack (trying to rule out additional ambient heat)
Headed and headless installs
Video card installed (I know Dells hate additional cards but I was desperate to rule out an issue with the GPU)
Checked logs (logs show no crash or anything, just me powering the system down and powering it back up)
BIOS reflashed to the most up-to-date version
Swap of individual PSUs

At this point I'm at a total loss on where to go from here. Diagnostically I'm out of ideas, especially since it's so random, at least in its timing. Unfortunately, and yes I know it's a 10-year-old server at this point, we don't have it in our budget to replace until sometime early next year (gotta love shoestring tech company budgets). So before I raise the white flag, take it out back and go Office Space on it... HELP
 
10TB in RAID 0? That's just asking for trouble!

You need to rule out the RAID array, even if it means popping in a single small drive and installing an OS on that. It shouldn't take long and you'll have eliminated the most insane part of your configuration right there. Using an SSD for this will also make you look like a hero.

While you're getting your hands dirty you could also swap out the RAM. Yes, I know you said Memtest thinks it's OK but why would you trust that? If Memtest says it's bad then it's bad, if Memtest says it's good then it might still be bad. Life's like that.

And the inevitable questions: Why are you installing a new OS on this dinosaur - is it being repurposed or did it break? If so, why and/or how? And what's changed since it last worked? And how does it run from a live Linux CD?
 
Yep. Unplug everything possible and run with the most minimal hardware you can. If you're trying to install from USB, then try installing from CD/DVD instead.
Immediately following that I was able to get about 95% through the ubuntu install before it went to a black screen. Since then I get to the point where ubuntu asks me my keyboard localization before it goes black (if I'm lucky).
The problem is not consistent? That tells me you have failing hardware: RAM, motherboard, HDD...
 

Some of the stuff you guys have mentioned I've tried already and honestly just forgot about it.

The 10TB RAID 0 was honestly just what I got from spamming Enter in the RAID config after I blew away the array and tried a single drive.

As far as swapping RAM goes, I've thought about that, but unfortunately we don't have any DDR2 ECC RAM of the proper speed in the shop.

As for the ancient beast, it was a server we got at the end of a server upgrade job. We've been desperately needing more space on our file server, so originally it was supposed to run FreeNAS and act as a temporary file solution until the owner says a proper solution is in the budget. As far as I'm aware the server was running properly at the client's location but was shut down and decommissioned a while ago.

I agree it sounds hardware related, but none of the symptoms point to one thing specifically. My honest guess is RAM or a bad CPU. In its current configuration it has the minimum RAM it can run with, so I may pull the two extra processors, which would free up some RAM to play with. That will help rule out RAM or CPU issues.
 
Also, almost all of the OSes you tried to install produce install logs. You'll have to pull a drive or boot from a live CD to view them. They might have a clue about what is failing.
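If you do get at those logs from a live CD, something like the following could help skim them for hardware complaints. This is only a rough sketch: the pattern list is an assumption based on strings Linux kernels commonly print for machine checks, ECC events, and disk errors, and the function names `scan_lines` and `scan_file` are made up for illustration. An empty result does not prove the hardware is healthy.

```python
import re

# Assumed patterns: strings Linux kernels commonly log for hardware faults.
# Extend or trim this list to match whatever your logs actually contain.
HARDWARE_PATTERNS = re.compile(
    r"machine check|mce:|hardware error|edac|i/o error",
    re.IGNORECASE,
)


def scan_lines(lines):
    """Return (line_number, text) pairs for lines that match a pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if HARDWARE_PATTERNS.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path):
    """Scan one log file, tolerating any undecodable bytes."""
    with open(path, errors="replace") as handle:
        return scan_lines(handle)
```

Point `scan_file` at wherever the installed system's logs end up once the drive is mounted, for example `scan_file("/mnt/target/var/log/syslog")`; that mount point and path are hypothetical and will differ per distro.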
 
So you have not tried to install a Windows Server OS on this?
I've tried installing both 08R2 and 12R2 on it; neither would actually install.

Actually managed to get Ubuntu Server to install the other week and have been poring over logs since (if I stay in the terminal the system is 100% stable, as long as I don't run anything heavier than nano). The logs are fairly clean, apart from what looks like a possible issue with video drivers, but nothing that would explain the crashes, though I did update everything just to play it safe. Tried a CPU benchmark on it, and as soon as it got to full load it crashed. At this point I'm fairly certain it's going to be a CPU issue, so I'm going to pull the two extra processors and see where I can go from there. Unfortunately this week we're running a special, so chances of me getting to it outside of personal time are slim to none.
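Since the box only seems to fall over under load, one way to narrow it down might be to ramp the load up one worker at a time and force a log line to disk before each step, so the last line that survives a crash shows how many workers were running. The sketch below is an assumption-heavy illustration, not a proper burn-in tool: `stepped_load`, `log_step`, and the log file name are hypothetical, and the busy loop only exercises integer math.

```python
import os
import subprocess
import sys
import time

# Busy loop run in each worker process; spins until its deadline passes.
BURN_SNIPPET = (
    "import sys, time\n"
    "end = time.monotonic() + float(sys.argv[1])\n"
    "x = 0\n"
    "while time.monotonic() < end:\n"
    "    x = (x * 31 + 7) % 1000003\n"
)


def log_step(log_file, message):
    """Write a timestamped line and force it to disk before continuing."""
    log_file.write(f"{time.strftime('%H:%M:%S')} {message}\n")
    log_file.flush()
    os.fsync(log_file.fileno())


def stepped_load(max_workers, seconds_per_step, log_path):
    """Run 1, 2, ... max_workers busy processes, logging around each step."""
    completed = []
    with open(log_path, "a") as log_file:
        for workers in range(1, max_workers + 1):
            log_step(log_file, f"starting step with {workers} worker(s)")
            procs = [
                subprocess.Popen(
                    [sys.executable, "-c", BURN_SNIPPET, str(seconds_per_step)]
                )
                for _ in range(workers)
            ]
            for proc in procs:
                proc.wait()
            log_step(log_file, f"finished step with {workers} worker(s)")
            completed.append(workers)
    return completed
```

A run such as `stepped_load(os.cpu_count() or 1, 60, "load_steps.log")` would step through every core at a minute per step. If the crash always lands at roughly the same worker count, that hints at load or power delivery rather than one specific core, though it cannot tell a bad CPU from a bad VRM or PSU on its own.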
 