If a single person can make the system fail then the system has already failed.
If only we had terms for environments that were meant for testing, staging, and early release before moving over to our servers that are critical…
I know, it’s crazy, a whole new system that only I could have come up with (or at least I could sell it to CrowdStrike that way, it seems).
It’s never a single person who caused a failure.
Note: Dmitry Kudryavtsev is the article author and he argues that the real blame should go to the Crowdstrike CEO and other higher-ups.
Microsoft also started blaming the EU. It’s such a shitshow it’s ridiculous.
Sure, it’s the dev who is to blame, and not the clueless managers who evaluate devs based on the number of commits/reviews per day, or the CEOs who think such managers are on top of their game.
It’s a systemic, multi-layered problem.
The simplest, lowest-effort thing that could have prevented issues at this scale is not installing updates automatically, but waiting four days and triggering the install afterwards if no issues have surfaced.
Automatically forwarding updates also means forwarding risk. The larger the impact area, the more worthwhile safeguards become.
Testing/staging or partial, successive rollouts could also have mitigated a large number of issues, but that requires more investment.
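To make the delay idea concrete, here is a minimal sketch of such a soak-period policy. It assumes a hypothetical update record exposing a release timestamp and a known-issues flag; the names and fields are illustrative and are not CrowdStrike's (or any vendor's) actual API.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical update record; real vendors expose this differently (if at all).
class Update:
    def __init__(self, version: str, released_at: datetime, has_known_issues: bool):
        self.version = version
        self.released_at = released_at
        self.has_known_issues = has_known_issues

SOAK_PERIOD = timedelta(days=4)  # wait this long before installing

def should_install(update: Update, now: datetime | None = None) -> bool:
    """Install only updates that have soaked for SOAK_PERIOD with no reported issues."""
    now = now or datetime.now(timezone.utc)
    old_enough = now - update.released_at >= SOAK_PERIOD
    return old_enough and not update.has_known_issues

# Example: a two-day-old update is skipped; a five-day-old clean update is installed.
fresh = Update("7.16.1", datetime.now(timezone.utc) - timedelta(days=2), False)
soaked = Update("7.15.9", datetime.now(timezone.utc) - timedelta(days=5), False)
print(should_install(fresh))   # False
print(should_install(soaked))  # True
```

The same check generalizes to staged rollouts: apply it per machine group, with shorter soak periods for canary fleets and longer ones for critical servers.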
The update that crashed things was an anti-malware definitions update. CrowdStrike offers no way to delay or stage these updates (they are downloaded automatically as soon as they are available), and there’s a good reason for not wanting to delay definition updates: it leaves you vulnerable to known malware for longer.
And there’s a better reason for wanting to delay definition updates: this outage.