And people wonder why I hate Crowdstrike...

I’m aware of that. You missed my point. To get those systems booted, all they had to do was load the previous kernel. (This is assuming you retain copies of your previous kernel when you patch; not everyone does.) It's an easier fix than booting into safe mode on Windows.
I'm sorry, but that's not true, because there was no previous kernel to boot. And even if there was, it would have loaded the same software that triggered the bug, because the previous kernel had a previous version of the same software loading the same busted definition files!

Fixing the Red Hat systems involved booting into single user mode, and removing the impacted file. Which is the EXACT SAME PROCESS we're seeing on the Windows side of the fence.

It doesn't matter what your kernel is, they fail the same way. This specific failure mode is always present, and it ALWAYS SUCKS when it's triggered.
 
This was quite a whopper. Not for me really...we do not sell/support/etc. CrowdStrike. But it could have been any other product. Years ago we used to resell a lot of ESET....so much volume...we made it to "Gold Partner" level for a while. This is back in the NOD32 days, I think we started at version 2.50. Anyways, they had several instances where...on-prem Exchange servers were tanked by their Exchange module, from some update. The infostore would just dismount and fall over. Yup, back in the days of on-prem Exchange servers.

Been around for some other AV programs that tanked Windows bootups, but yeah I'll say...haven't seen anything "this big" in a while.

My colleague at the office, his wife works for a larger hospital chain...their billing department or something. She works from home. And...her department has over a hundred others that work from home. The hospital IT had to call each end user...and walk them through the bootup/recovery process, enter the BitLocker key, etc. Imagine that...having to call that many end users and...going through that process. I'm wondering how many end users each tech could get done per day....but you can count on one hand...
 
CrowdStrike have released an automated resolution!

Part of the CrowdStrike service must load very early in the boot process and communicate with their servers before the BSOD occurs. Using this, they are able to send instructions for the faulty files in %WINDIR%\System32\drivers\CrowdStrike\ to be quarantined on the next reboot. Basically, if a device has a wired internet connection, it will fix itself.

Great news, but oddly it's not being pushed to customers automatically. You have to raise a ticket and opt in. Also, they don't appear to have released much, if any, public information on this. There will 100% be IT teams still fixing systems manually who don't know about it.
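
For anyone wondering what the fix actually amounts to, here's a minimal sketch in Python. The C-00000291*.sys file pattern comes from CrowdStrike's public remediation guidance; the quarantine folder and the script itself are my own illustration, not their actual tool:

```python
# Rough sketch of what remediation boils down to: get the faulty
# channel file(s) out of the sensor's driver directory so they can't
# be loaded on the next boot. Pattern is from CrowdStrike's public
# guidance; everything else here is a hypothetical illustration.
import os
import shutil
from pathlib import Path

driver_dir = Path(os.environ.get("WINDIR", r"C:\Windows")) / "System32" / "drivers" / "CrowdStrike"
quarantine = driver_dir / "quarantine"  # hypothetical holding area

def quarantine_faulty_channel_files() -> list[Path]:
    """Move files matching the known-bad pattern out of the driver dir."""
    quarantine.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in driver_dir.glob("C-00000291*.sys"):
        target = quarantine / f.name
        shutil.move(str(f), str(target))
        moved.append(target)
    return moved

if __name__ == "__main__":
    for f in quarantine_faulty_channel_files():
        print(f"quarantined: {f}")
```

Conceptually it's the same as the safe-mode instructions everyone was following by hand; the automated version just does it from the early-boot agent instead of a human at the console.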
 
@SAFCasper Yeah, and it doesn't work because the NIC driver hasn't loaded yet. They told us that line, we tried it on some workstations... no dice.

The servers had all been remediated by the time that message went out.

@ThatPlace928 I've got a customer retaining legal counsel to sue their cyber insurance provider because they DEMANDED the deployment of CrowdStrike, over our security team's objections, to remediate a crypto event. And then this happened... which knocked that medical clinic down AGAIN, for the 2nd time in a month.

There are literally bodies on the floor; this is beyond civil. If our nation had a brain cell, they'd be holding the entire board of CrowdStrike, especially that idiot CEO of theirs, to criminal charges. We won't do that of course, because rich people never face justice in the US. But there will at least be a class action.

There's at least one place looking to get a class action started: https://www.classlawgroup.com/consumer-protection/crowdstrike-outage-class-action-lawsuit

I imagine they won't be alone.
 
@Metanis Highly accurate information once you get past all the historical fluff.

Software development and testing just doesn't work the way he remembers it anymore, and for a ton of good reasons. But the breakdown of what CrowdStrike did is there.

I'll add one more thing... Windows used to have the ability to bypass system driver faults; XP did this. Microsoft had to disable that because users would mash the button to get their systems online regardless of what the actual fault was, and this enabled malware persistence on a catastrophic scale.

I do think it should accept a local admin password for bypass... but the current system is vastly preferable to the previous one. Which, oddly enough... would have been one of the last ones this guy worked on.

@ThatPlace928 I'm not holding my breath. That body is incapable of understanding technology. Right or left, these people are too bloody old! We have literal bodies on the floor, this isn't haul them before Congress time. This is line them up in front of the firing squad time.
 
This is line them up in front of the firing squad time.

Accidents, even tragic ones, will happen. Always have, and always will.

We haven't been executing people for them in a long time. Your hyperbole about "bodies on the floor" and "line them up in front of a firing squad" is just plain ugly, and needs to stop.
 
My colleague at the office, his wife works for a larger hospital chain...their billing department or something. She works from home. And...her department has over a hundred others that work from home. The hospital IT had to call each end user...and walk them through the bootup/recovery process, enter the BitLocker key, etc. Imagine that...having to call that many end users and...going through that process. I'm wondering how many end users each tech could get done per day....but you can count on one hand...

Talk about earning your money doing that lol
 
Good video on the issue in general.
That former Windows developer is suggesting that CrowdStrike's "content updates" (apparently data only) contain p-code (a kind of interpreted bytecode) that is executed by their driver. So effectively their content updates actually result in new code being executed in kernel mode. Because it isn't officially a driver update, it isn't carefully controlled by their customers.

I've read elsewhere that this content update file contained all zeros. Wouldn't that be one of the first automated tests when the new driver version was released?
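
If the p-code theory is right, the "data only" framing is misleading: the content files are effectively programs. A toy sketch of the idea, with opcodes I invented purely for illustration; the point is just that an all-zeros blob hands opcode 0x00 to the dispatch logic, and in kernel mode a bad dispatch is a page fault, i.e. a BSOD:

```python
# Toy "p-code" interpreter with invented opcodes, to illustrate the
# claim: if content updates carry bytecode the driver executes, a
# "data only" update is really a code update. In user mode a bad blob
# raises an exception; in kernel mode it's a page fault -> BSOD.
import struct

HANDLERS = {
    0x01: lambda operand: None,  # pretend: match a byte signature
    0x02: lambda operand: None,  # pretend: conditional jump
    0x03: lambda operand: None,  # pretend: report a detection
}

def run_channel_file(blob: bytes) -> None:
    pc = 0
    while pc + 3 <= len(blob):
        op = blob[pc]
        (length,) = struct.unpack_from("<H", blob, pc + 1)
        handler = HANDLERS.get(op)
        if handler is None:
            # A naive interpreter indexes a handler table unchecked;
            # opcode 0x00 from an all-zeros file is exactly that case.
            raise ValueError(f"bad opcode 0x{op:02x} at offset {pc}")
        handler(blob[pc + 3 : pc + 3 + length])
        pc += 3 + length

try:
    run_channel_file(bytes(4096))  # the all-zeros "channel file"
except ValueError as e:
    print(f"rejected: {e}")
```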
 
That is why I make the claim they have faulty or non-existent unit testing. Which in turn leads to faulty or non-existent Q/A testing... which in turn leads to a terribly designed kernel driver, flagged as boot-critical in any OS implementing it, that lacks the capability to fail gracefully when fed corrupted content... the first test of which is... you guessed it... all zeros.
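
To put a fine point on it, the "first test" is about five lines. validate_channel_file() here is a hypothetical stand-in for whatever parser the driver actually uses, and the magic number is invented:

```python
# The trivial unit test being described. validate_channel_file() is a
# hypothetical stand-in for the driver's real parser; CSCF is invented.
import unittest

MAGIC = b"CSCF"

def validate_channel_file(blob: bytes) -> bool:
    """Reject anything that can't possibly be a well-formed content file."""
    return len(blob) >= 8 and blob.startswith(MAGIC) and any(blob[len(MAGIC):])

class ChannelFileSmokeTest(unittest.TestCase):
    def test_all_zeros_rejected(self):
        self.assertFalse(validate_channel_file(bytes(4096)))

    def test_empty_rejected(self):
        self.assertFalse(validate_channel_file(b""))

    def test_truncated_rejected(self):
        self.assertFalse(validate_channel_file(MAGIC))

if __name__ == "__main__":
    unittest.main()
```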

CrowdStrike just showed us, they are not to be trusted, they are not to be used, and they've failed to learn a single infrastructural security lesson learned by the industry in the last twenty plus years. That's pretty categorically damning for a "security" company.
 
Wouldn't that be one of the first automated tests when the new driver version was released?
Essentially. They didn't want to constantly be backlogged by the validation process for the driver, so they sidestepped the issue and got caught with their pants down.
 
I can't find the details, but I'm remembering a lawsuit joined by most of the AV vendors at the time against Microsoft. This was shortly after Microsoft made it known that Windows 10 was going to ship with free AV software and everyone got bent out of shape.

If I recall correctly, this situation was created in response to that lawsuit, to give AV vendors the hook they needed to monitor the kernel. But goodness, I've slept a few times since then; it was a decade ago... that would have been 2013-2014.
 
I knew I wasn't nuts!

The above video blames the EU for forcing Microsoft NOT to implement the planned security improvements, currently available in Linux kernels and mandated on macOS, that would have prevented the CrowdStrike platform from taking out Windows machines everywhere.

But if you read the actual ruling, what Dave got wrong is a nuance, but a critical one.

The EU ruling didn't state that Microsoft couldn't implement these improvements. What it stated was that all anti-malware products must be subject to the same API limits, including Microsoft's own.

So Apple implemented the fix for this ages ago, and that's why CrowdStrike never took them out. They made mistakes, but NO THIRD-PARTY SOFTWARE can run in the macOS kernel directly, which means it's vastly less likely to crash on anything, ever.

Microsoft HAS A KERNEL THAT CAN DO THIS TOO! But they CHOSE NOT TO USE IT, because they apparently REFUSED TO LIMIT THEIR OWN AV SOFTWARE to the API properly deployed in the Windows kernel!

MS had the technology to prevent the current outage, in 2009!
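
The principle behind that fix is just isolation: don't let third-party scanning logic share an address space with the thing that has to stay up. As a very loose user-mode analogy (parse_content.py is a hypothetical worker script, not anyone's real API):

```python
# Very loose user-mode analogy for the isolation argument: run the
# untrusted parser in its own process, so when it crashes on a bad
# content file the host keeps running. parse_content.py is hypothetical.
import subprocess
import sys

def scan_isolated(content_path: str) -> bool:
    """Return True only if the sandboxed parser exits cleanly."""
    result = subprocess.run(
        [sys.executable, "parse_content.py", content_path],
        capture_output=True,
        timeout=10,  # a wedged parser can't hang the host either
    )
    if result.returncode != 0:
        print(f"parser died (exit {result.returncode}); host unaffected")
        return False
    return True
```

That's roughly the relationship a user-space security API gives the kernel: the vendor's code can still misbehave, but the blast radius is one process, not the whole machine.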
 
At the end of the day, in the red corner we have all the software and tech, and in the blue corner we have Frank. For as long as humans are involved, we will have issues. I keep hearing that CrowdStrike should all be shot..............................HUH. Again I say, it's software with humans doing the checking before rollouts. IT JUST DOESN'T MATTER what name is on the product.

A few weeks ago here in New Zealand, power went out for the northern part of the country. The cause: some work being done on the pylons that carry the 220 kV lines. The workers had to remove the nuts from the mounting bolts that bolt these pylons to the ground. A worker removed ALL the nuts from 3 of the 4 legs, and the pylon quite rightly fell over. The contractors doing the job: "We have procedures and policies in place as to how to do the job, it's never happened before." IT JUST DOESN'T MATTER, as long as Frank is involved.
 
Again I say, it's software with humans doing the checking before rollouts. IT JUST DOESN'T MATTER what name is on the product.

Amen, amen, amen!

My mother, God rest her soul, gave me a coffee mug not long after I got my degree in computer science that I still have and use to this day. On the side it has the words, "To err is human, to really screw things up requires a computer."

All that happened here is a not-unheard-of technological magnification of a human error, at light speed. Why people are acting like it's anything other than that, I don't understand.

All of us, I think, believe it was unacceptable human error, as this kind of thing should never have been able to make it out "to the wild"; several human someones dropped the ball at several different steps. But it all boils down to human stupidity. And no one had a better observation on that than Einstein:

Only two things are infinite, the universe and human stupidity, and I'm not sure about the former.
~ Albert Einstein
 