How software systems fail - Part 2c - Processes
Part 2c looks at how six characteristics of how complex systems fail can improve our understanding of how quality is lost at the process level of software systems.
Key insight
Failures are bound to happen in complex software systems. Trying to avoid, ignore, or delay them, or to blame others for them, will only make them worse. Software engineering teams need to be familiar with process failures so they can identify, contain, and recover from them calmly and efficiently. Instead of viewing failures as something to be avoided at all costs, teams should see them as valuable learning opportunities to gain a deeper understanding of their systems' behaviour.
Three key takeaways
1. As builders, maintainers, and operators of complex software systems, we need to become intimately familiar with failure to help diagnose when it occurs, limit its impact, and restore systems to acceptable performance standards.
2. Root cause analysis, along with hindsight, confirmation, and availability biases, can limit investigations into failures and needs to be accounted for when conducting postmortems.
3. Improving future processes is not about blocking "human error" but about understanding why people did …