Attributing AI spend to projects
Connecting every Claude session, Copilot seat and production token back to the project that drove it — so the bill stops being a black box.
Your Anthropic invoice arrives on the first of the month. It’s higher than last month — again. Your CFO wants to know which projects drove the increase. You can tell her the total, you can guess the worst offenders, but you can’t show her the receipt. That’s the gap this play closes.
The job
AI spend attribution is the discipline of connecting every dollar of AI cost back to the project, team and outcome it served. Not “we spent £400k on Anthropic last quarter” but “that £400k broke down across these eleven projects, and here’s what it produced.”
It sounds obvious. It’s brutally hard in practice, because the data lives in three or four places, none of which know about each other. The provider knows the API key. Your gateway knows the request. Your engineer knows the project. None of them connect.
Why it matters
When you can’t attribute spend, every conversation about AI ends in handwaving. The CFO can’t capitalise it because she doesn’t know which projects are in the development phase. The CTO can’t justify the bill because he can’t tie it to a shipped feature. The engineer who’s burning £4k a month on Opus when Sonnet would do gets no signal, because nobody can see it’s them.
Attribution is the precondition for every other AI conversation. CapEx classification, R&D tax narratives, model-task fit, frustration detection — none of them work until you can answer the simplest question: whose work was this for?
What good looks like
A working AI spend attribution model has three properties.
It covers both faces of AI spend. Developer AI — Claude Code, Cursor, Copilot, Windsurf — is interactive and attributable to a person. Production AI — chatbots, document extractors, video pipelines — is autonomous and attributable to a service. Both feed into the same project ledger. If you only do one, you’re solving half the problem.
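To make the shared ledger concrete, here is a minimal sketch of what one row might look like. The field names are illustrative assumptions, not Flowstate’s actual schema; the point is that developer and production spend differ only in what they attribute to, not where they land.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Literal, Optional

@dataclass
class SpendRecord:
    """One row in the shared project ledger (illustrative shape)."""
    day: date
    project_id: str                    # e.g. "payments-v2"
    cost_gbp: Decimal
    face: Literal["developer", "production"]
    person: Optional[str] = None       # set for developer AI (Claude Code, Cursor, ...)
    service: Optional[str] = None      # set for production AI (chatbot, extractor, ...)
    model: Optional[str] = None        # which model served the tokens, if known
    source: str = "provider_billing"   # which capture path produced the row
    confidence: float = 1.0            # how sure the attribution is, 0..1
```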
It captures from where the data actually lives. Provider billing APIs give you the cost side. A lightweight CLI on developer machines gives you the project context the API can’t see. Inline enforcement — when you need it — gives you the request-by-request view for the services where policy actually has to bite. None of these is sufficient on its own. All three are reconcilable.
It’s resilient to side channels. The moment attribution depends on developers self-reporting their project, it falls apart. Good attribution reads project context from things they’re already doing — ticket IDs in branch names, file paths, repo names, prompt content — and tells them when it’s not confident.
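A minimal sketch of that kind of side-channel matching, assuming hypothetical ticket patterns and project maps:

```python
import re

# Illustrative: infer the project from signals the developer already emits.
# The regex, maps and confidence values are assumptions, not a fixed spec.
TICKET_RE = re.compile(r"\b([A-Z]{2,10})-\d+\b")          # e.g. PAY-1432

PROJECT_BY_TICKET_PREFIX = {"PAY": "payments-v2", "ONB": "onboarding-revamp"}
PROJECT_BY_REPO = {"payments-service": "payments-v2"}

def infer_project(branch: str, repo: str) -> tuple[str | None, float]:
    """Return (project_id, confidence) without asking the developer anything."""
    m = TICKET_RE.search(branch)
    if m and m.group(1) in PROJECT_BY_TICKET_PREFIX:
        return PROJECT_BY_TICKET_PREFIX[m.group(1)], 0.9  # ticket ID: strongest signal
    if repo in PROJECT_BY_REPO:
        return PROJECT_BY_REPO[repo], 0.6                 # repo name: weaker signal
    return None, 0.0  # not confident: say so, don't guess silently
```

The confidence score is what makes this honest: a 0.6 match gets surfaced for correction rather than silently booked.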
How to do it
Start with the easiest path: provider billing integrations. They give you the bill, broken down by API key, with no agent installation. That’s enough to tell you which keys are expensive — but not which engineers, which projects, or what work was actually done.
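If you’re wiring this up yourself, the fetch is roughly the sketch below. The endpoint, parameters and response fields here are placeholders, since every provider names these differently; check your provider’s billing or usage API docs for the real ones.

```python
from collections import defaultdict

import requests  # third-party: pip install requests

# Placeholder endpoint and response shape: substitute your provider's
# actual billing/usage API. Nothing below is a real provider URL.
BILLING_URL = "https://api.example-provider.com/v1/usage"

def cost_by_api_key(admin_key: str, start: str, end: str) -> dict[str, float]:
    """Pull a date range of spend and bucket it by API key."""
    resp = requests.get(
        BILLING_URL,
        headers={"Authorization": f"Bearer {admin_key}"},
        params={"start_date": start, "end_date": end, "group_by": "api_key"},
        timeout=30,
    )
    resp.raise_for_status()
    totals: dict[str, float] = defaultdict(float)
    for row in resp.json()["data"]:        # assumed response shape
        totals[row["api_key_id"]] += row["cost"]
    return dict(totals)
```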
Layer the CLI next. It runs on developer machines, detects whichever AI tools they’re using, and sends features back — work category, project match, model used. It doesn’t see source code or prompt content. The CLI is the layer that turns “API key 17 spent £4k” into “Sarah on the Payments team spent £4k, and 86% of it was on the Payments v2 project.”
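A sketch of the per-session payload such a CLI might emit. The field names are assumptions; what matters is what’s present (project context, work category) and what’s deliberately absent (source code and prompt text):

```python
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class SessionFeatures:
    """Features only: no source code, no prompt content (illustrative shape)."""
    engineer: str           # e.g. "sarah@example.com"
    tool: str               # "claude-code", "cursor", "copilot", ...
    model: str              # the model the tool reported using
    work_category: str      # "feature", "bugfix", "refactor", ...
    project_id: str | None  # e.g. from a side-channel matcher like infer_project
    confidence: float       # how sure the project match is
    tokens_in: int
    tokens_out: int

def emit(event: SessionFeatures) -> str:
    # A real CLI would POST this to your collector; here we just serialise it.
    return json.dumps({"ts": int(time.time()), **asdict(event)})
```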
Save inline enforcement for last. It’s the most operationally invasive of the three options because it sits on the request path. Most teams don’t need it for everything — usage APIs and CLI telemetry get you to ~80% attribution. Reach for inline enforcement on the services and teams where policy actually has to stop something happening, not just observe it.
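When you do reach for it, the hook itself is small; the operational cost is in sitting on the request path at all. A minimal sketch, with an assumed policy table:

```python
# Illustrative inline policy check, called before a request is forwarded
# to the provider. The policy table and its keys are assumptions.
BLOCKED_MODELS_BY_SERVICE = {"marketing-chatbot": {"opus"}}

class PolicyViolation(Exception):
    pass

def enforce(service: str, model_family: str) -> None:
    """Inline enforcement is the only layer that can refuse a request;
    billing APIs and CLI telemetry can only report it after the fact."""
    if model_family in BLOCKED_MODELS_BY_SERVICE.get(service, set()):
        raise PolicyViolation(f"{service} may not call {model_family} models")
```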
The reconciliation matters. The same dollar should not be counted three times. Flowstate handles this by treating provider billing as ground truth and using the other paths to add context, not duplicate cost. Build it the same way if you’re rolling your own.
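One way to encode that rule, under the assumption that billing gives you cost per API key and telemetry gives you observed token shares per key: pro-rate each billed amount across projects, so telemetry distributes cost rather than adding to it.

```python
from decimal import Decimal

def reconcile(
    billed: dict[str, Decimal],           # api_key -> billed cost (ground truth)
    observed: dict[str, dict[str, int]],  # api_key -> {project_id: tokens seen}
) -> dict[str, Decimal]:
    """Distribute each billed amount exactly once; never duplicate it."""
    ledger: dict[str, Decimal] = {}
    for key, cost in billed.items():
        seen = observed.get(key, {})
        total = sum(seen.values())
        if total == 0:
            # No context for this key: keep the cost visible, not invented.
            ledger["unattributed"] = ledger.get("unattributed", Decimal(0)) + cost
            continue
        for project, tokens in seen.items():
            share = cost * tokens / total   # pro-rate by observed share
            ledger[project] = ledger.get(project, Decimal(0)) + share
    return ledger
```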
What this unlocks
Once attribution works, the questions get sharp.
- “Which engineer’s AI bill is outpacing their salary?”
- “Did the Payments v2 project burn £38k of Cursor and produce a shippable increment?”
- “Which AI feature is the cheapest to run per customer action — and which is the most expensive?”
- “What share of last quarter’s AI spend qualifies for R&D tax relief?”
None of these are answerable without project-level attribution. All of them are normal Tuesday-afternoon questions when you have it.
The pitfall to avoid
Don’t try to perfect attribution before you ship anything. Get the bill broken down by team and project at 80% confidence first. Surface the disagreements as first-class objects in the UI rather than hiding them. Engineers will correct misattribution if you make it easy and they trust the result.
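One trivial way to express that rule (the threshold and queue names are illustrative):

```python
CONFIDENCE_THRESHOLD = 0.8  # ship at 80%, improve from there

def route(confidence: float) -> str:
    """Trusted rows go straight to the ledger; the rest become visible
    disagreements for an engineer to confirm or correct."""
    return "ledger" if confidence >= CONFIDENCE_THRESHOLD else "review_queue"
```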
Perfect attribution is a five-year project. 80% attribution, visible and improving, is a six-week project. Ship the second one.
Put Workforce Engineering into practice
Flowstate is the Workforce Engineering platform. Connect your systems and start planning with confidence.