Why We Keep Making the Same Software Mistakes

Talking to Robert N. Charette can be pretty depressing. Charette, who has been writing about software failures for this magazine for the past 20 years, is a renowned risk analyst and systems expert who over the course of a 50-year career has seen more than his share of delusional thinking among IT professionals, government officials, and corporate executives, before, during, and after massive software failures.
In 2005's "Why Software Fails," in IEEE Spectrum, a seminal article documenting the causes behind large-scale software failures, Charette noted, "The biggest tragedy is that software failure is for the most part predictable and avoidable. Unfortunately, most organizations don't see preventing failure as an urgent matter, even though that view risks harming the organization and maybe even destroying it. Understanding why this attitude persists is not just an academic exercise; it has tremendous implications for business and society."
Two decades and several trillion wasted dollars later, he finds that people are making the same mistakes. They claim their project is unique, so past lessons don't apply. They underestimate complexity. Managers come out of the gate with unrealistic budgets and timelines. Testing is inadequate or skipped entirely. Vendor promises that are too good to be true are taken at face value. Newer development approaches like DevOps or AI copilots are implemented without proper training or the organizational change necessary to make the most of them.
What's worse, the huge impacts of these missteps on end users aren't fully accounted for. When the Canadian government's Phoenix paycheck system initially failed, for instance, the developers glossed over the protracted financial and emotional distress inflicted on tens of thousands of employees receiving erroneous paychecks; problems persist today, nine years later. Perhaps that's because, as Charette told me recently, IT project managers don't have professional licensing requirements and are rarely, if ever, held legally liable for software debacles.
While medical devices may seem a far cry from giant IT projects, they have a few things in common. As Special Projects Editor Stephen Cass uncovered in this month's The Data, the U.S. Food and Drug Administration recalls on average 20 medical devices per month due to software issues.
"Software is as significant as electricity. We would never put up with electricity going out every other day, but we sure as hell have no problem having AWS go down." -Robert N. Charette
Like IT projects, medical devices face fundamental challenges posed by software complexity, which means that testing, though rigorous and regulated in the medical domain, can't possibly cover every scenario or every line of code. The major difference between failed medical devices and failed IT projects is that a huge amount of liability attaches to the former.
"When you're building software for medical devices, there are a lot more standards that have to be met and a lot more concern about the consequences of failure," Charette observes. "Because when those things don't work, there's tort law available, which means manufacturers are on the hook. It's much harder to bring a case and win when you're talking about an electronic payroll system."
Whether a software failure is hyperlocal, as when a medical device fails inside your body, or spread across an entire region, like when an airline's ticketing system crashes, organizations need to dig into the root causes and apply those lessons to the next device or IT project if they hope to stop history from repeating itself.
"Software is as significant as electricity," Charette says. "We would never put up with electricity going out every other day, but we sure as hell have no problem accepting AWS going down or telcos or banks going out." He lets out a heavy sigh worthy of A.A. Milne's Eeyore. "People just kind of shrug their shoulders."