The Flowstate thesis

The most important budget in your business now runs on tokens.

We help you see it clearly.

01 · The moment

AI is the fastest-adopted technology in business history.

$300B global AI spend by 2027
18 months since work that used to be impossible became routine
All-in: the companies leaning in hardest will pull away

We love this. We are pro-AI, pro-experimentation, pro-employees choosing the tools that make them excellent.

The job ahead is not slowing any of it down. It’s making it legible: to the engineer choosing a model, the CTO sponsoring a rollout, the CFO signing the bill, and the board approving the strategy.

This is the gap Flowstate exists to close.

02 · What the data shows

Eight frontier models. Five hundred tasks. Four runs each.

In April 2026, researchers at Stanford, Michigan, MIT, Google DeepMind and All Hands published the first systematic study of agent token consumption. Their findings are a useful map of how this new workload behaves.

Bai et al., “How Do AI Agents Spend Your Money?”, arXiv:2604.22750
Finding 01 of 05

Agentic tasks are a different shape from chat.

Avg. agentic coding task 4.17M tokens per task
Avg. code chat 3,390 tokens per task
Input-to-output ratio 153 : 1
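
To make the difference in shape concrete, the figures above can be worked through as plain arithmetic. This is a minimal sketch: the variable names are illustrative, and it assumes the 153 : 1 input-to-output ratio describes the average agentic task, which the summary above does not state explicitly.

```python
# Figures quoted in Finding 01; the names are illustrative.
AGENTIC_TOKENS_PER_TASK = 4_170_000
CHAT_TOKENS_PER_TASK = 3_390
INPUT_PER_OUTPUT = 153  # assumption: this ratio applies to the agentic task

# How many times more tokens an agentic task uses than a code chat.
scale_factor = AGENTIC_TOKENS_PER_TASK / CHAT_TOKENS_PER_TASK

# Split the agentic total into input and output tokens at 153 : 1.
output_tokens = AGENTIC_TOKENS_PER_TASK / (INPUT_PER_OUTPUT + 1)
input_tokens = AGENTIC_TOKENS_PER_TASK - output_tokens

print(f"agentic vs chat: about {scale_factor:,.0f}x")
print(f"input tokens:  about {input_tokens:,.0f}")
print(f"output tokens: about {output_tokens:,.0f}")
```

Under that assumption, almost all of the token volume on an agentic task is input rather than output.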
Finding 02 of 05

Token usage is naturally variable.

30×
Variance across instances of the same task
Variance across repeat runs

Your AI bill is not a single number. It is a distribution.
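
One way to read a bill as a distribution is to report a few summary points instead of a single average. The sketch below is illustrative only: the function name, the choice of statistics and the sample costs are all assumptions, not figures from the paper or from Flowstate.

```python
import statistics


def summarize_spend(run_costs):
    """Summarise per-run costs as a distribution rather than one number."""
    # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
    cuts = statistics.quantiles(run_costs, n=20, method="inclusive")
    return {
        "mean": statistics.mean(run_costs),
        "median": statistics.median(run_costs),
        "p95": cuts[-1],
        "max_over_min": max(run_costs) / min(run_costs),
    }


# Hypothetical dollar costs for ten runs of the same task (made-up numbers).
sample_costs = [0.45, 0.55, 0.60, 0.75, 0.90, 1.10, 1.80, 2.50, 6.00, 9.00]

for name, value in summarize_spend(sample_costs).items():
    print(f"{name}: {value:.2f}")
```

In a skewed sample like this one, the mean sits well above the median, which is why a single average can mislead a budget.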

Finding 03 of 05

Accuracy peaks at moderate spend.

Beyond the second-cheapest cost quartile, additional tokens deliver diminishing returns.

Top two cost quartiles: 50% of all spend, 0% of additional accuracy
Finding 04 of 05

Different models suit different tasks.

On 230 problems every model solved correctly, the heaviest models used 1.5 million more tokens than the leanest.

Heaviest model 2.1M tokens
Mid-tier model 1.3M tokens
Leanest model 0.6M tokens

Matching the model to the work is where the gains are.

Finding 05 of 05

Models can’t predict their own spend reliably.

0.39
Best correlation between predicted and actual spend across eight frontier models

This isn’t a flaw. It’s a property of the workload.

It’s also a planning problem with a measurement solution.

Every one of these findings is a place where good information helps.

03 · What our customers see

The paper measured it on a benchmark. We see the same patterns on customer devices.

Story 01 of 03

“We thought our staff were creating new PowerPoints. They were iterating on the same deck for hours. Each ‘change this word on slide three’ billed as a full document regeneration.”

Once they saw it, the fix was easy and the relationship with the tool got better.

Story 02 of 03

“Everyone was reaching for the biggest model by default. Once they had visibility, they let people keep using whatever they liked, and added a gentle nudge for the lighter model where it was clearly enough.”

Bills dropped. Outcomes didn’t.

Story 03 of 03

“What are these five people doing? They’re always running out of tokens.”

They turned out to be the most productive people in the company. The answer was to raise the budget, not the eyebrow.

In every case, the answer wasn’t to use less AI. It was to use it with eyes open.

04 · Two truths

Two things are true at once.

Truth 01

The vendor layer is the engine.

AI providers are doing exactly what good product companies do: making it easy to use more of what they offer. That’s not a critique, it’s how great products grow. We benefit from it. So do customers.

Truth 02

The customer layer is the dashboard.

The CFO needs to reconcile spend back to projects and cost classes. The CTO needs to understand which workloads are running, on which models, with which outcomes. Employees need to be trusted with the tools they choose. Boards need to see the trajectory.

Cars work better when both exist.

05 · The category

We call this Workforce Engineering.

AI agents are a new kind of workforce. They accrue cost like contractors, deliver value like employees, and are capitalised like software.

Cost like
Contractors

Variable rate, by output. Never fixed.

Value like
Employees

Persistent productivity. Compounding capability.

Capitalised like
Software

ASC 350-40 eligible. Audit-grade evidence required.

Same arc as every previous cost category
Yesterday: Electricity meters (19th century)
Then: Telecoms expense management (1980s)
Recently: Cloud cost management (2010s)
Now: Workforce Engineering (2026 →)

The discipline is too large to leave on spreadsheets.

06 · Where we sit

On every surface AI is used.

Mac, Windows, Linux, browser, terminal, IDE, production workload. Every prompt, every token, every model choice attributed to a person, a project and a cost class, in real time.
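
As a rough illustration of what attributing usage to a person, a project and a cost class could look like as data, here is a minimal sketch. It is not Flowstate's schema: the `UsageEvent` type, its field names, the cost-class labels and the sample events are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One model call with its attribution. All field names are hypothetical."""
    person: str
    project: str
    cost_class: str  # e.g. "capex" or "opex"; labels are illustrative
    model: str
    input_tokens: int
    output_tokens: int


def tokens_by_project_and_class(events):
    """Roll events up into total tokens per (project, cost_class)."""
    totals = defaultdict(int)
    for e in events:
        totals[(e.project, e.cost_class)] += e.input_tokens + e.output_tokens
    return dict(totals)


# Made-up events for illustration.
events = [
    UsageEvent("ana", "billing-revamp", "capex", "model-large", 120_000, 900),
    UsageEvent("ana", "billing-revamp", "capex", "model-small", 8_000, 400),
    UsageEvent("raj", "support-triage", "opex", "model-small", 15_000, 1_200),
]

for key, total in tokens_by_project_and_class(events).items():
    print(key, total)
```

Keeping the attribution on each event, rather than adding it later, means the same records can be grouped by person, by model or by cost class without re-collecting anything.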

Three things that compound
Data

Attribution gets better with every customer added. New customers are live by their second week, with the benefit of every prior month of training.

Speed

We deploy to production several times a day. The customer-facing surface keeps pace with the model market.

Coverage

Provider-agnostic by design. Every model, every tool, one view. We complement what providers offer; we don’t replace it.

07 · Five things we believe

Where this is going.

01

AI spend will be capitalised within five years.

The accounting profession is already moving. Continuous CapEx classification will become routine, the same way continuous deployment did for code.

02

The buyer-side layer will become standard.

Every cost category this large has eventually grown its own customer-side visibility tools. AI will be no different. Vendors and visibility tools coexist comfortably in every other category. They will here too.

03

The workforce is becoming hybrid permanently.

Contractor-to-agent swaps are already happening. The companies that build a single ledger across humans and agents first will run smaller, cheaper and faster.

04

Engineering finance is the next category.

Sales has Salesforce. Marketing has HubSpot. Finance has NetSuite. Engineering, the largest cost base in most technology companies, has spreadsheets. That changes now.

05

Returns are shifting from labour to capital.

For two centuries, productivity gains accrued primarily to labour. AI flips that. The companies that measure deployment accurately will be the ones that compound the gains.

08 · What you should see

One ledger. Four views.

If you are a CFO

AI spend by project, team and cost class in real time. Audit-grade reconciliation back to the vendor invoice.

If you are a CTO

Which workloads are running, on which models, with what outcome. Live.

If you sit on a board

Your portfolio’s AI trajectory, and where it is creating durable value.

If you are an employee

The freedom to use the tools that make you excellent, knowing the company has your back.

We built it because the future of business runs on AI, and the people deploying it deserve to see it clearly.
