Why We Struggle With Failure
What failure really is, why we avoid it, and how to design habits and systems that help teams learn faster.
We struggle with failure because it threatens our identity and sense of control, so we instinctively avoid it. Quality engineering is about designing systems and cultures where small failures become learning opportunities, before they become catastrophic system failures. So what counts as a failure, and why is it so hard to learn from one?
What is failure?
In engineering teams, “failure” is often used loosely to mean bugs, incidents, mistakes, surprises, rework, or simply an error. I find it simplest to define failure as a gap between what we expected and what happened. That gap can come from a mistake (human action), an incident (system outcome), or an experiment result (learning outcome).
When people fail, it typically feels negative. It’s never enjoyable, and over time, it can erode our confidence and self-belief. When failure feels like a threat, we switch from learning mode to self-protection mode. So we try to avoid, deny, distort, or ignore failures to protect ourselves, often using cognitive shortcuts.
Avoid, deny, distort and ignore failure
Self protection
From a young age, we are often implicitly taught that failure is bad. Think of a child who is punished for making a mistake or getting something wrong. That correction is sometimes necessary for safety, but the message received is often that failure itself is bad.
As adults, we often avoid, deny, distort, and ignore our failures to protect our self-esteem and sense of control. We like being held in high regard, and failure can feel like it threatens that.
Cognitive shortcuts
We also take mental shortcuts. We often blame the situation when it’s our fault, but blame other people’s character traits when it’s their fault. We do this because we know our own circumstances, but we only see other people’s behaviour. The second half of this pattern, explaining other people’s failures by their character rather than their situation, is a bias most of us are susceptible to: the fundamental attribution error.
Another issue is time pressure. Smaller issues don’t seem worth the effort to investigate, and admitting them can feel like a hit to our self-image. It’s usually easier to fix the issue and move on. But that takes the learning away from everyone else and makes it more likely that others will hit the same failure.
Big failures are often made up of smaller issues
Big failures are rarely one big mistake. More often, they are made up of many small issues that line up in just the wrong way, combined with normal variability and a system that didn’t make risks visible. The chain of smaller issues is usually only obvious in hindsight. In the moment, they often look like weak signals that are easy to ignore.
We don’t avoid failure because we’re lazy. We avoid it because it hurts, and our brains are trying to keep us safe.
Why do we fail?
Two main types of work challenges
In socio-technical systems, failure often shows up through two broad challenges: technical and interpersonal. A technical challenge is applying or learning a new tool or skill. An interpersonal challenge is learning to work with other people. But in practice, these blend together.
When we face technical challenges, mistakes, misunderstandings, and forgetting are inevitable. With interpersonal challenges, there is always the possibility of conflict. And because the social and technical aspects of software systems interact, each kind of challenge makes the other harder.
What we experience day to day is often uncertainty: skill uncertainty (people don’t yet know how to do something), coordination uncertainty (people can’t agree on what good looks like), and system uncertainty (the system behaves in ways we don’t fully understand). All of these can lead to failure.
Why learn from failure
Learning from failure matters most when the process (what to do and how to do it) and the outcome are still in flux, as is often the case in software engineering. That same flux is what makes failure much more likely.
That gap between expected and actual results is information. It tells us something about our assumptions, our coordination, or the system itself. It’s an opportunity to learn what happened and possibly why. But this is where one of two things can happen.
Embracing failure
Embracing failure usually occurs when leadership fosters a learning culture within teams. They set up systems and structures that allow people to gather information about what happened, share that information so others can learn from it, and feed it back into their ways of working. This reinforces learning over time.
Embracing failure can look like blameless reviews that focus on contributing factors; small, safe-to-fail experiments; sharing near misses and “we nearly shipped…” stories; and making it easy to report small issues.
Avoiding failure
Avoiding failure usually occurs when leadership believes failure is bad. They might ask someone outside the team to investigate, produce a report, and correct the mistake, or they might ask the people involved what went wrong and tell them not to do it again.
Avoiding failure can look like “who did this?” lines of questioning, investigations done to people rather than with them, fixes shipped quietly with no learning shared, or metrics that reward looking good over getting better. The difference between embracing and avoiding is rarely intent. More often it comes down to incentives, time pressure, and whether people feel safe being honest.
What do we need to learn from failure?
There are two areas to focus on: the individual and the organisation. At an individual level, it’s about leaning into failure and working with it. At an organisational level, it’s about setting up structures and procedures that help people learn from failure.
The individual
As individuals, it’s about creating a pause between stimulus and response. It’s about noticing and responding deliberately, not just reacting to the situation.
Emotional regulation
When you’re in a tense meeting or receiving critical feedback, notice the spike in your reactions, pause, and ask a clarifying question aimed at understanding the other person’s perspective (not defending your stance). This helps you stay curious and constructive rather than shutting down or lashing out.
Try saying: “Can I stop for a second so I respond properly?” or “Can you say a bit more about what you’re seeing?”
Why it helps: Keeps the conversation productive under pressure.
Sensemaking
When you realise you’ve missed something, separate “I missed a detail” from “I’m incompetent”, and focus on what information or assumption was missing. This keeps you in learning mode rather than spiralling or becoming defensive.
Try saying: “I missed X because I was focused on Y. Next time I’ll check Z earlier.”
Why it helps: Turns a mistake into a clear, reusable insight without self-flagellation.
Help-seeking
When you’re stuck for longer than feels comfortable, ask earlier and ask specifically. Share what you’ve tried, where you’re blocked, and what you need from someone else. This reduces rework and uncovers risks sooner.
Try saying: “I’m stuck on X. I’ve tried Y. I’m unsure about Z. Can you help me sanity check?”
Why it helps: Speeds up learning and prevents late surprises.
Emotional regulation, sensemaking, and help-seeking are the skills that help you stay present. Behaviours are how you make learning visible and give others permission to do the same. Behaviours look like naming assumptions out loud, sharing what you tried, asking earlier, and admitting uncertainty without shame.
The organisation
There are things we can do as individuals, but there are also environmental factors that organisations (via leadership) can design for.
Safety
Make it safe to report failures without fear of punishment. This looks like blameless post-mortems with clear facilitation and follow-up, leaders modelling admissions of mistakes, and clear routes for raising concerns early. All of this is grounded in psychological safety, so people can speak up.
Time
Make time to reflect when failures happen, rather than just moving on. This looks like lightweight retros after incidents, protected time to discuss what happened, and prioritising fixes that reduce repeat failures.
Memory
Make learning reusable by capturing what happened, why it happened, and what will change as a result. This looks like decision records, learnings captured in one place, and actions that are tracked and revisited. Without follow-through, post-mortems become theatre.
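One way to picture this is to treat each learning as a small structured record. The sketch below is only an illustration in Python, not a prescribed tool: the names `LearningRecord`, `Action`, and `overdue_actions` are hypothetical, and the example data is made up. It captures the three things above (what happened, why it happened, and what will change) and shows how unfinished actions could be surfaced for a regular review.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Action:
    """A follow-up agreed after a failure (hypothetical structure)."""
    description: str
    owner: str
    due: date
    done: bool = False


@dataclass
class LearningRecord:
    """One captured learning: what happened, why, and what will change."""
    what_happened: str
    why_it_happened: str
    what_will_change: str
    actions: list[Action] = field(default_factory=list)


def overdue_actions(records: list[LearningRecord], today: date) -> list[Action]:
    """Return unfinished actions whose due date is on or before `today`."""
    return [
        action
        for record in records
        for action in record.actions
        if not action.done and action.due <= today
    ]


# Made-up example data, purely for illustration.
record = LearningRecord(
    what_happened="A config change reached production without review",
    why_it_happened="The review step was optional under time pressure",
    what_will_change="Config changes go through the same review as code",
    actions=[
        Action("Add review step to config pipeline", "platform team", date(2024, 1, 15)),
        Action("Share write-up at team meeting", "incident lead", date(2024, 1, 10), done=True),
    ],
)

for action in overdue_actions([record], today=date(2024, 2, 1)):
    print(f"{action.description} (owner: {action.owner})")
```

The format matters less than the habit: whatever holds the records, someone revisits the open actions on a schedule so that follow-through actually happens.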
A common worry from leaders is that “learning from failure” becomes “anything goes”. The guardrails are good practice and clear standards. The point is earlier visibility, faster learning, and fewer repeat failures because the system makes it safe, gives time, and retains what was learned.
Close
Failure is inevitable in the complex socio-technical systems we work in. Learning from those failures will never be easy. But creating norms around talking about failures, and supporting people when they do, helps improve the system in which those failures occur, leading to better quality outcomes.
Successful organisations are not the ones with the best people, but the ones that can identify, learn from and correct their mistakes the fastest.
Further reading
Scale of Failure
A practical way to talk about failure without blame, so teams can learn faster and build quality in.
https://qualityeng.substack.com/p/scale-of-failure
Learning from failure: spikes, POCs, prototypes and MVPs
Clear definitions, the learning each one enables, and how to use them to create intentional, safe-to-fail learning.
https://qualityeng.substack.com/p/spikes-poc-prototypes-mvp
Why is psychological safety important to software engineering teams?
What psychological safety means, why it matters, and how it helps teams surface issues early and speak up.
https://qualityeng.substack.com/p/why-is-psychological-safety-important


