Every engineering leader in an enterprise right now is having some version of the same conversation. A business partner, a product owner, or a VP leans across the table and says: “You’re using AI now, right? So shouldn’t this take half the time?”
It’s a fair question. And it deserves a better answer than most teams are giving.
The honest answer is: AI does change the velocity equation — but not uniformly, not predictably, and not in the way most business stakeholders imagine. Applying a blanket discount to your estimates because “we have AI now” is a fast path to missed commitments, eroded trust, and engineers who stop flagging risk because they know the number will get cut anyway.
Here’s how I think about evolving estimation in a team that’s genuinely integrating AI into the development practice.
Why Estimation Was Already Broken
Before AI entered the conversation, enterprise estimation had a fundamental problem: it was measuring the wrong thing.
Story points, t-shirt sizes, and planning poker were designed to estimate effort — how hard is this relative to other things we’ve built? Over time, most teams drifted into using them as time proxies. Stakeholders learned to convert points into days. Velocity became a commitment. Sprint capacity became a contract.
The result was a system where engineers padded estimates to protect themselves, business partners haircut those estimates because they expected padding, and everyone pretended the resulting number meant something it didn’t.
AI doesn’t fix this problem. It lands on top of it. If your estimation practice was already shaky, adding an AI multiplier to a broken baseline just makes the noise louder.
The first step to evolving your estimates isn’t introducing an AI factor. It’s being honest about the signal-to-noise ratio in your current estimates.
What AI Actually Changes
When I look at where AI tooling genuinely moves the needle in enterprise development, it’s specific and bounded — not universal.
Where AI compresses time meaningfully:
- Boilerplate and scaffolding. Unit test shells, CRUD endpoints, data transfer objects, migration scripts — work that is structurally predictable and well-precedented. An experienced developer with a good AI assistant can generate and validate this work in a fraction of the previous time.
- First-draft code for well-understood patterns. If your team has built this type of component before and the requirements are clear, AI can get you to a working first draft faster. The emphasis is on first draft — review and integration time doesn’t disappear.
- Documentation and knowledge transfer. Inline documentation, API contracts, onboarding guides. Work that was often skipped or deferred because it was low-reward for developers.
- Debugging with clear symptoms. Given a stack trace, a well-scoped error, and a contained codebase, AI can narrow down root causes significantly faster than manual investigation.
Where AI does not reliably compress time:
- Ambiguous or poorly scoped requirements. AI can generate code quickly against a bad spec. You still end up with the wrong thing, just faster. Garbage in, garbage out applies to AI-assisted development as much as anything else.
- Deeply integrated systems work. When a change requires understanding years of accumulated architecture decisions, undocumented dependencies, and tribal knowledge, AI is a useful assistant but not an accelerant. The hard part was never the typing.
- Stakeholder alignment and change management. No amount of AI changes how long it takes to get three business units to agree on a data model.
- Security and compliance review. In regulated industries — financial services, insurance, healthcare — the review gates don’t compress because the code was written faster.
- Novel problems. By definition, AI tooling performs best on patterns that are well-represented in its training. Genuinely new architecture, new integrations, new domains — the AI assistance is marginal.
The Velocity Conversation You Should Be Having
When a business partner asks “why isn’t AI making this faster,” the right response isn’t defensive and it isn’t a blanket promise. It’s a breakdown.
Walk them through the work type composition of the estimate:
- What percentage of this build is well-understood, precedented work where AI genuinely accelerates?
- What percentage is novel, ambiguous, or integration-heavy where AI is less effective?
- What are the fixed-cost elements — review gates, testing cycles, deployment windows, stakeholder sign-off — that don’t compress regardless of how fast the code is written?
In my experience, when you do this breakdown honestly with a business partner, two things happen. First, they start to understand why the AI savings aren’t showing up uniformly. Second, they start asking better questions about how to reshape the work so more of it lands in the “AI-accelerable” bucket — clearer specs, better modular design, reduced integration complexity.
That’s a much more productive conversation than negotiating a percentage discount.
Building an AI-Adjusted Estimation Model
Rather than applying a gut-feel reduction, build a simple model that makes the AI factor explicit and trackable.
Step 1: Categorise work by AI leverage
Tag each story or task at estimation time with a leverage category:
| Category | Description | Indicative AI impact |
|---|---|---|
| High leverage | Boilerplate, tests, well-defined CRUD, documentation | 30–50% effort reduction |
| Medium leverage | Feature work with clear requirements, known patterns | 15–25% effort reduction |
| Low leverage | Architecture decisions, ambiguous requirements, deep integration | 0–10% effort reduction |
| Fixed cost | Review, testing, deployment, stakeholder sign-off | No reduction |
The percentages are illustrative — calibrate them against your team’s actual data after a few sprints.
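To make the model concrete, here is a minimal sketch of how the leverage categories might translate into an adjusted estimate. The category names, the midpoint reduction factors, and the example task mix are all illustrative assumptions drawn from the table above, not calibrated values; substitute your team's own data.

```python
# Midpoints of the indicative reduction ranges from the table above.
# These are placeholder assumptions -- calibrate against your own actuals.
REDUCTION = {
    "high": 0.40,    # boilerplate, tests, well-defined CRUD, documentation
    "medium": 0.20,  # feature work with clear requirements, known patterns
    "low": 0.05,     # architecture, ambiguous requirements, deep integration
    "fixed": 0.00,   # review, testing, deployment, stakeholder sign-off
}

def adjusted_estimate(tasks):
    """tasks: list of (baseline_days, category) tuples."""
    return sum(days * (1 - REDUCTION[cat]) for days, cat in tasks)

# Hypothetical 20-day build: note how little of it is uniformly compressible.
tasks = [
    (5, "high"),    # scaffolding and test shells
    (6, "medium"),  # feature work on known patterns
    (5, "low"),     # integration with legacy systems
    (4, "fixed"),   # review gates and deployment windows
]
print(round(adjusted_estimate(tasks), 2))  # 16.55
```

Even with generous factors, the blended saving here is roughly 17%, not the 50% a blanket discount assumes, which is exactly the point the breakdown makes visible.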
Step 2: Track actuals against the model
The only way these numbers mean anything is if you close the loop. Track actual time against estimated time, broken down by leverage category. After six to eight sprints, you’ll have real data on what AI is actually doing for your team — not a vendor claim or an industry benchmark, but your team’s number.
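One way to sketch that feedback loop: compute the observed reduction per category from baseline-versus-actual records, then nudge the modelled factor partway toward the observation rather than replacing it outright. The record shape and the 0.3 blending rate are assumptions for illustration, not a prescribed method.

```python
def observed_reduction(records):
    """records: list of (baseline_days, actual_days) pairs for one category."""
    baseline = sum(b for b, _ in records)
    actual = sum(a for _, a in records)
    return 1 - actual / baseline if baseline else 0.0

def recalibrate(factor, observed, rate=0.3):
    """Move the modelled factor partway toward the observed reduction,
    so a single noisy sprint doesn't whipsaw the model."""
    return factor + rate * (observed - factor)

# Hypothetical: 'high leverage' work was modelled at a 40% reduction,
# but three sprints of actuals show roughly 30%.
high_leverage = [(5, 3.6), (3, 2.0), (4, 2.8)]
obs = observed_reduction(high_leverage)
print(round(recalibrate(0.40, obs), 2))  # 0.37
```

The design choice worth noting is the partial update: it keeps the model responsive to evidence while acknowledging that any single sprint is a small, noisy sample.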
Step 3: Share the model, not just the number
When presenting estimates to business partners, show the breakdown. Here’s the total estimate, here’s the work that’s high-leverage for AI, here’s the estimated reduction, here’s the fixed-cost floor. Make the assumptions visible.
This does two things: it gives business partners insight into where they can help create conditions for greater AI leverage (clearer requirements, better scoping), and it protects engineering teams when actual velocity doesn’t hit the theoretical maximum.
The Trust Problem Is More Important Than the Number
In the rush to demonstrate AI value, a lot of engineering teams are making commitments they can’t keep. They’ve taken their old baseline, applied an aggressive AI discount at the request of a leadership team eager to show ROI, and are now delivering against an estimate that was never realistic.
The short-term consequence is a missed sprint. The long-term consequence is that engineers stop trusting the process, stop raising flags, and start gaming the system to hit the number — which is exactly the dynamic that made estimation unreliable in the first place.
The teams that will build lasting credibility with their business partners are the ones that do three things well:
- Start with an honest baseline. Don’t apply an AI factor to a padded estimate. Recalibrate your baseline first.
- Make the model transparent. Show your work. Let business partners see how AI leverage is being factored in and why certain work doesn’t compress.
- Report on actuals. Close the loop every sprint. If your AI factor was too aggressive, say so and adjust. If it was too conservative, say so and adjust. A model that updates on evidence is more valuable than a model that looks confident.
What to Tell Your Business Partners
When the question comes — and it will — here’s the framing I’d suggest:
“AI is making our engineers meaningfully faster on certain categories of work. We’re tracking that rigorously and building it into our estimates transparently. Where we can increase AI leverage — through clearer requirements, better modular design, and reduced integration complexity — we will see real savings. Where we can’t, we’re not going to promise savings that won’t materialise.”
That’s a harder conversation in the short term. It’s a much better partnership in the long term.
The teams that figure this out now — before they’ve made promises they can’t keep — will be in a fundamentally stronger position as AI tooling matures. The teams that chase the headline number will be explaining missed commitments for the next two years.