Quite interesting.

I'm an aggressive null checker. For internal functions it's mostly a debug assert, since a null there is basically a static control-flow error rather than a runtime condition. Roughly like the sketch below.
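
A minimal sketch of what I mean, in C. The names are made up; this is just the pattern, not anyone's actual code:

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const unsigned char *data;
    size_t len;
} record_t;

/* Internal helper: a NULL here is a caller bug, so a debug assert is enough.
 * In release builds (NDEBUG) the assert compiles away and we rely on the
 * caller honoring the contract. */
static size_t record_payload_len(const record_t *rec)
{
    assert(rec != NULL);   /* firing here is a static control-flow error */
    return rec->len;
}

/* Public entry point: validate inputs and fail gracefully instead of crashing. */
int record_process(const record_t *rec)
{
    if (rec == NULL || rec->data == NULL)
        return -1;         /* reject bad input at the boundary */
    return (int)record_payload_len(rec);
}

int main(void)
{
    record_t rec = { (const unsigned char *)"abc", 3 };
    return record_process(&rec) == 3 ? 0 : 1;
}
```

The boundary check fails gracefully; the internal assert documents the contract and disappears in release builds.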

Address 0x9c should be in a protected region, so the worst that could happen was a crash? If I'm not mistaken, the process dump showed a protected-read fault.
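
To make that address concrete: a faulting address like 0x9c is what you typically get when a field of a NULL struct pointer is dereferenced, because the read lands at NULL plus the field's offset, inside the never-mapped first page. A hedged sketch with a completely made-up struct layout (not the real driver's):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout for illustration only -- not the actual driver's struct. */
typedef struct {
    char header[0x9c];     /* padding so 'count' lands at offset 0x9c */
    unsigned int count;
} content_entry_t;

int main(int argc, char **argv)
{
    content_entry_t *entry = NULL;   /* the missing/invalid entry */
    (void)argv;

    /* count sits 0x9c bytes past the struct base, so with a NULL base the
     * load below targets address 0x9c -- inside the first page, which the
     * OS never maps. */
    printf("offset of count: 0x%zx\n", offsetof(content_entry_t, count));

    if (argc > 1) {
        /* Actually take the fault: user mode gets an access violation; in a
         * kernel driver the same read is a bugcheck, i.e. the BSOD. */
        return (int)entry->count;
    }
    return 0;
}
```

Run it with any argument to actually trigger the fault; without arguments it just prints the offset.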

The top-voted comment on HN suggests (just as I would): "This should not have passed a competent CI pipeline for a system in the critical path."

But besides banging on my keyboard, software bugs are going to happen. Adding more complexity is unlikely to fix the issue for a long time. Maybe we shouldn't depend on things where a single mistake in a single line of code (at least the one that was found) causes the world to nearly melt down. We've had memory-safe languages for a long time, and guess what: programmer errors still cause application crashes. Also, many (if not all) "safe" languages just defer the unsafe stuff to the compiler/interpreter. The majority of the C# standard library uses pointers and P/Invoke.

Let's just keep making toolchains larger and supply chains deeper, I guess, until the problem is resolved... because that works. I'm done.


Discussion

This shouldn't even have passed a manual smoke test or system integration test. It literally meant you couldn't get the machine to start without a Blue Screen of Death.

There are about five different tests that would have all failed, as soon as you started. They apparently don't test.

Also, amazes me that the sys admins just rolled this out everywhere without even checking locally on one machine.

I wonder if anyone, anywhere, caught this before rollout.

It's just another Boeing incident lol

I know. Good grief.

Yeah that's crazy!

I'd guess sysadmins had nothing to do with it, and the update was pushed live over the air to all CrowdStrike customers because "it's a security update."

When everything is "X as a Service" these problems will occur more often, because end users don't have the choice to wait for updates to be validated in the wild before making those changes on their machines.

Oh, that's true.

Gosh, they're doing automatic roll-outs without a proper pipeline. That is scary.

On the other hand, customers have been rewarding this sort of reckless behavior. We can see that on here.

Yeah, fair share of people here advocating for aggressively fast updates "even if it breaks shit". There's always a balance.

That's one reason I trust Nostrudel. They offer a cutting-edge version at next.nostrudel.ninja, and even there they don't activate new versions until you press the button. A couple of times it broke something, so I just went back to using the normal one.

Beta test environment, basically.

We used to do this at work. A public staging environment.

I think this sort of parallel release satisfies a lot of the curiosity users feel (they want to "see" the new thing and try it out), while mitigating risk.

Even if you have a desktop or mobile app, you can create a simulated/test environment to show off a new feature and get feedback from users. There are Android emulators and so on.

IMO: frequent updates for security, backed by a heavy testing regime and an approval/sign-off pipeline.

For everything else, make updates fully optional.

The problem is not that there are fast increments; it's that the increments are often barely tested, pushed out without a proper release process, and bug fixes are rolled in with new features. So you're basically forced to install every update immediately, because what they delivered last time is broken and the new one contains the fix. But the new one is also broken.

Software Version Mafia 😂

And everything goes straight to full rollout in production. Boom. Then everyone installs it and it immediately crashes. That's not even an alpha version; it's just a prototype.

And then it's like,

Ummm... my bad, reinstall the previous one.

yeah, "move fast and break things" should not be the motto of QA lol. QA should be pig-headed salty bastards who are sticklers for an extensive test routine being run and passed, and i mean, like, a giant long checklist of features that have to work, run through as a standard regime

and not even touch it without the unit tests all 100%, fix the damn tests, and no damn disabling the damn tests damn you

yes, the salty bastard rubber stamp bureaucrat style is what we want in QA

And no unit tests that are just "assert True; return;" 😅
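
To make the joke concrete, here's a sketch in C (made-up function) of the "test" that counts toward the suite but verifies nothing, next to one that actually pins behavior down:

```c
#include <assert.h>
#include <string.h>

/* Function under test (illustrative). */
static size_t trimmed_len(const char *s)
{
    size_t len = strlen(s);
    while (len > 0 && s[len - 1] == ' ')
        len--;
    return len;
}

/* The "assert True; return;" test: always passes, checks nothing. */
static void test_trimmed_len_vacuous(void)
{
    assert(1);
}

/* A test that actually exercises the behavior. */
static void test_trimmed_len_real(void)
{
    assert(trimmed_len("abc   ") == 3);
    assert(trimmed_len("   ") == 0);
    assert(trimmed_len("") == 0);
}

int main(void)
{
    test_trimmed_len_vacuous();
    test_trimmed_len_real();
    return 0;
}
```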

We call those people "beta testers," and their enthusiasm should be channeled to help produce a stable product for the average user.

EXACTLY!!

I'm a natural for the beta tester role, as are many of the power users, and we tend to be mentally prepared for the software to mess our stuff up or act wonky.

Roll out to us _first_, and then roll out to everyone else a couple of days or a week later.

And make the updates opt-in. Sure, it's better for security to stay fully up-to-date, but if someone wants to never update their desktop, it's on him if he gets hacked.

In commercial systems, the sysadmins should be responsible for validating and determining the necessity of updates.