Learning from failure: spikes, POCs, prototypes and MVPs
What each one is for, what learning it gives you, and how they help teams learn from failure
In my last post, I shared a Scale of Failure to help us work with failure more deliberately. The aim was to classify the main types we’re likely to see in engineering teams, and learn from them rather than just react to them.
Toward the bottom of the scale are informative failures: failures we cause deliberately so that we can learn. That got me thinking about the approaches we use in engineering teams to create those learning loops on purpose, and what each one is actually for.
The four main types most teams lean on are Spikes, Proofs of Concept (POC), Prototypes, and Minimum Viable Products (MVPs).
But in my experience, teams often use these terms interchangeably, and that can cause real problems. For example, stakeholders might be expecting an MVP they can test with early adopters, when what the team is actually doing is a spike to understand the technology.
So what do these terms mean, and how do they help us work with failure and uncertainty?
A quick way to think about these approaches is:
Spike: learn what we need to learn to proceed
Proof of Concept (POC): prove technical feasibility in our context
Prototype: test whether users can and will use it as intended
Minimum Viable Product (MVP): validate whether it creates value in the real world
Each one makes a different kind of failure cheap.
Spike: What do we need to learn to reduce a specific uncertainty?
Framing
What: A time-boxed investigation to reduce uncertainty about a technology, language, approach, integration, constraint, etc.
Why: To learn enough to make planning and decisions more predictable.
How it works
How: A dev pair or trio builds a disposable, sandboxed example, or goes straight to a Request for Comments (RFC) document.
Output: Knowledge, not a working feature. For example, a decision log, a short write-up, a recommendation, constraints, or a clear “do not proceed”.
When: Whenever a team has uncertainty about a tool or an approach.
Audience: Engineering teams.
Boundaries
What it deliberately does not answer: Will it work under real integration and operating conditions? Will it behave in production? Will users want it?
So what
Advantage: Makes learning cheap when the uncertainty is around capability gaps and process inadequacy in the team (see Scale of Failure to learn more).
Trade-off:
Doesn’t do much to reduce uncertainty around task challenge or system complexity.
Easy to overgeneralise from a sandbox and miss issues that only appear under real integration and operating conditions.
Notes: Ideally, you have a hypothesis and a question you’re trying to answer. In that sense, it’s a form of exploratory testing.
Common misconception: “We tried it in a sandbox, so we’re safe.”
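To make this concrete, here is a rough sketch of what a spike sandbox might look like in Python. Everything specific in it is made up for illustration (the vendor export file, the two-second budget); the point is that the code is disposable and the real output is the answer you write down.

```python
# Hypothetical spike: "Can we parse the vendor's CSV export with the standard
# library alone, and does a full sample stay under our 2-second budget?"
# Disposable by design: no tests, no error handling beyond what the question needs.

import csv
import time
from pathlib import Path

SAMPLE_EXPORT = Path("vendor_export_sample.csv")  # hypothetical sample file
TIME_BUDGET_SECONDS = 2.0                         # hypothetical constraint


def run_spike() -> None:
    start = time.perf_counter()
    with SAMPLE_EXPORT.open(newline="") as handle:
        rows = list(csv.DictReader(handle))
    elapsed = time.perf_counter() - start

    # The output of a spike is knowledge: a sentence you can paste into the write-up.
    verdict = "within" if elapsed <= TIME_BUDGET_SECONDS else "over"
    print(f"Parsed {len(rows)} rows in {elapsed:.2f}s ({verdict} the {TIME_BUDGET_SECONDS}s budget).")


if __name__ == "__main__":
    run_spike()
```

Whether or not you keep the script, the write-up and the recommendation are the artefacts that matter.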
POC: Can we build this, given our constraints?
Framing
What: An investigation into technical feasibility that takes into account your context (architecture, security, performance, device/browser support, data access, deployment constraints, and so on).
Why: To help teams understand whether a concept the business is requesting is technically feasible to build.
How it works
How: Usually built outside the team’s main codebase to avoid contamination and keep it disposable (or at least clearly separated).
Output: A disposable working model plus the key learning. For example, a runnable demo, an experimental branch, integration notes, feasibility constraints, and a clear technical recommendation.
When: When the team has technical uncertainty about whether something is achievable in practice.
Audience: Primarily engineering teams, with stakeholder visibility when it affects scope, timelines, or viability.
Boundaries
What it deliberately does not answer: Is it usable? Does it create value? Is it safe, maintainable, or sustainable long term?
So what
Advantage: Makes failure cheap when the uncertainty is around process inadequacy and task challenge (see Scale of Failure to learn more).
Trade-off: Often does little to reduce risks associated with system complexity, such as production operating risks, cross-team dependencies, long-term maintainability, or scaling.
Common misconception: “It runs, so it’s basically ready.”
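By way of contrast, a POC sketch might look like the following: still small and disposable, but pointed at your context rather than a clean sandbox. The endpoint, token, and timeout are hypothetical stand-ins for whatever constraints your environment actually imposes; this is a sketch of the idea, not a definitive implementation.

```python
# Hypothetical POC: "Can we call the internal reporting API from inside our network,
# authenticate with the existing service token, and get a usable response quickly enough?"
# Built outside the main codebase; the learning, not the code, is what we keep.

import json
import os
import time
import urllib.request

REPORTING_API = "https://reporting.internal.example/api/v1/summary"  # hypothetical internal endpoint
SERVICE_TOKEN = os.environ.get("REPORTING_TOKEN", "")                # hypothetical existing credential


def run_poc() -> None:
    request = urllib.request.Request(
        REPORTING_API,
        headers={"Authorization": f"Bearer {SERVICE_TOKEN}"},
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    elapsed = time.perf_counter() - start

    # Feasibility notes for the recommendation: did it work in *our* context, and how did it behave?
    print(f"Call succeeded, {len(payload)} top-level keys, round trip {elapsed:.2f}s")


if __name__ == "__main__":
    run_poc()
```

The difference from the spike is the question: not “does this tool do what we think?” but “does it do it under our architecture, security, and access constraints?”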
Prototype: How will it look, flow, and feel for a user?
Framing
What: An investigation into usability and experience, used to test design assumptions before code makes them expensive.
Why: To help teams answer the question “How will it look and flow?”
How it works
How: Prototypes come in many forms, from paper prototypes and storyboards to slideshows and interactive simulations. They can be UI prototypes, but they can also be prototypes of flows, journeys, or service interactions.
Output: Disposable mock-ups and simulations of the experience. For example a clickable mock, storyboard, annotated flow, or scripted walkthrough.
When: When design and engineering teams want to test user flows, layouts, and assumptions with stakeholders and end users before committing to building.
Audience: Business stakeholders and end users (and the delivery team, because it reduces misalignment).
Boundaries
What it deliberately does not answer: Is it technically feasible? Will it survive real operating conditions? Can we support it sustainably?
So what
Advantage: Makes learning cheap when the uncertainty is around user behaviour, design assumptions, and whether the experience makes sense to real people. This is also a great place for exploratory testing with users.
Trade-off: Technical uncertainties remain, especially feasibility and complexity (process inadequacy, task challenge, system complexity).
Common misconception: “You can click things, so building it is easy.”
MVP: Should we build this, based on real evidence?
Framing
What: A minimal but real product or capability used to validate whether something has core user value in the real world.
Why: To help teams answer the question “Should we build it?”
How it works
How: A functional minimal slice with instrumentation that allows learning with real users in a real usage environment.
Output: A real feature or experience with measurement in place. Often built within the team’s codebase, following the team’s standards, and designed to be evolvable (even if it later gets deleted). For example, an instrumented feature in production, or at least in a real usage environment, with clear success metrics.
When: When teams want evidence about user needs and market demand, and want to learn from real usage and feedback.
Audience: Early adopters.
Boundaries
What it deliberately does not answer: It doesn’t automatically solve long-term sustainability. If it “works”, you still need to decide what foundations you’ll invest in next.
So what
Advantage: Primarily a business hypothesis testing approach, but it can also reduce uncertainty across capability gaps, process inadequacy, task challenge, and system complexity because it touches real operating conditions. It can also enable exploratory testing of design and user experience with real users.
Trade-off:
If standards are skipped for speed, you can accidentally create a “POC in production”, which turns short-term learning into long-term technical debt.
Minimal scope should not mean minimal quality standards. “Viable” means safe enough to run, learn from, and, if required, decommission.
Common misconception: “Small means low quality standards.”
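To show what “a minimal slice with instrumentation” can look like in code, here is a hedged sketch. The feature, flag name, and clients are all hypothetical; in a real MVP you would use the flagging and analytics tooling your team already runs in production.

```python
# Hypothetical MVP slice: a minimal "export to CSV" feature shipped behind a flag to
# early adopters, instrumented so the team can learn whether it creates value.
# The flag and analytics classes below are stand-ins for whatever your team already runs.

from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FeatureFlags:
    """Stand-in for the team's real feature-flag client."""
    enabled_for: set[str] = field(default_factory=set)

    def is_enabled(self, flag: str, user_id: str) -> bool:
        return user_id in self.enabled_for


@dataclass
class Analytics:
    """Stand-in for the team's real event pipeline."""
    events: list[tuple[str, str, dict]] = field(default_factory=list)

    def track(self, user_id: str, event: str, properties: dict) -> None:
        self.events.append((user_id, event, properties))


def export_report_csv(user_id: str, flags: FeatureFlags, analytics: Analytics) -> str | None:
    """The minimal slice: real behaviour, the team's normal standards, plus measurement."""
    if not flags.is_enabled("report-csv-export", user_id):  # hypothetical flag name
        return None
    csv_content = "date,orders,revenue\n2024-01-01,12,340.00\n"  # simplified real output
    # The instrumentation is the point: it is what lets the MVP answer "should we build this?"
    analytics.track(user_id, "report_csv_exported", {"rows": 1})
    return csv_content


if __name__ == "__main__":
    flags = FeatureFlags(enabled_for={"early-adopter-42"})
    analytics = Analytics()
    export_report_csv("early-adopter-42", flags, analytics)
    print(analytics.events)  # the evidence used to decide what to build next
```

The stubbed clients only keep the example self-contained; the design point is that the feature ships to real users behind a flag, meets the team’s usual standards, and emits the events you will use to decide whether to invest further.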
An interesting outcome of this is that the earlier the artefact, the cheaper the learning, but the narrower the risks it can reveal. The closer you get to real users and real operating conditions, the more expensive the learning, but the more of the failure surface you can actually see.
Where teams get into trouble is when spikes, POCs, and prototypes get used interchangeably. That creates an expectation gap around what the outcome will be. Someone outside day-to-day engineering work can easily hear “we’ve built a prototype/POC” and assume it’s basically an MVP, or that it’s ready to deploy.
If you want to maximise learning and reduce risk, it helps to be deliberate about which question you are trying to answer.
A common sequence (when you’re uncertain across multiple dimensions) is:
Use a Spike to learn what you need to learn to proceed
Use a POC to prove technical feasibility in your context
Use a Prototype to refine the experience and flows
Use an MVP to validate value with real users
It’s not always linear, and you won’t always need all four. The point is to use the smallest artefact that answers the question you’re stuck on, name it clearly, and make sure everyone understands what learning you’re aiming for.
Because naming the work properly doesn’t just reduce technical risk. It reduces the expectation gap between what people think they’re getting and what the team is actually trying to learn. And it stops us accidentally promising progress when what we’re really doing is learning.
How do we learn from failure?
So next time you hear a team say “we’re doing a spike/POC/prototype/MVP”, do a quick sense check:
What learning (uncertainty reduction) are we trying to get from this?
Why is this the right approach to get that learning?
How will we capture and share what we learn?
What decision or next step will this unlock once we have it?
And if the learning doesn’t match the approach, gently nudge it back to the thing that does.
Over time, that small habit helps teams use the term that matches the learning outcome. It reduces expectation gaps and helps us create more informative failures instead of failing and hoping we learn.


