TDABC for AI: treat tokens, GPUs and agents as activities
AI cost looks new, but its shape is old. A GPU you pay for whether or not it is busy is the same shape as a machine or a team with idle time. A token consumed is a cost driver. An AI-assisted process is an activity that consumes a mix of resources. Time-Driven Activity-Based Costing, developed by Kaplan and Anderson, was built for exactly this: cost a resource at its practical-capacity rate, drive cost with a measurable quantity, and write the process as a short time equation. Applied to AI, it turns a blended cloud bill into a unit cost you can defend.
Start with the GPU as a capacity resource. Its true cost is not the rental sticker alone; it is the full cost of supplying it: the amortised hardware plus power, cooling and the operations around it. Divide that full cost by the practical capacity the GPU can realistically deliver, around 80 to 85 percent of theoretical rather than 100. The result is a practical-capacity cost rate per GPU-second. Industry data puts average enterprise GPU utilisation in the single digits, which means most of what is paid is the cost of unused capacity. TDABC makes that line visible instead of hiding it inside an inflated blended rate.
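The arithmetic can be sketched in a few lines of Python. This is an illustration only: the class name, field names, the 30-day month and every figure in the example are assumptions, not numbers from a real bill.

```python
from dataclasses import dataclass

SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month, GPU supplied around the clock


@dataclass
class GpuCapacity:
    """One GPU treated as a TDABC capacity resource (all fields hypothetical)."""
    monthly_full_cost: float   # amortised hardware + power + cooling + operations
    practical_fraction: float  # share of theoretical capacity that is realistic, e.g. 0.80
    used_seconds: float        # GPU-seconds actually consumed by work in the month

    def practical_seconds(self) -> float:
        # Practical capacity: what the GPU can realistically deliver in the month.
        return SECONDS_PER_MONTH * self.practical_fraction

    def rate_per_second(self) -> float:
        # Practical-capacity cost rate: full cost divided by practical capacity.
        return self.monthly_full_cost / self.practical_seconds()

    def cost_of_used_capacity(self) -> float:
        return self.rate_per_second() * self.used_seconds

    def cost_of_unused_capacity(self) -> float:
        # Whatever was supplied but not consumed is reported on its own line.
        return self.monthly_full_cost - self.cost_of_used_capacity()


# Illustrative figures only: a GPU using 10 percent of its practical capacity.
gpu = GpuCapacity(monthly_full_cost=2592.0, practical_fraction=0.80, used_seconds=207_360)
print(f"rate per GPU-second: {gpu.rate_per_second():.5f}")
print(f"used capacity:       {gpu.cost_of_used_capacity():.2f}")
print(f"unused capacity:     {gpu.cost_of_unused_capacity():.2f}")
```

Keeping used and unused capacity as separate figures is the point: the rate stays stable when utilisation moves, and the idle cost is reported rather than absorbed.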
In activity-based costing, a cost driver is the measurable quantity that causes cost. For AI, the token is the natural driver, with the price per token as its rate, and the call or request as a secondary driver. Output tokens cost more than input tokens because they are generated one at a time, so the rate has two parts. The FinOps community now calls the token the atomic unit of AI; in cost-accounting terms it is simply the activity cost driver, and naming it that way lets you treat AI like any other activity.
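A two-part token rate with an optional per-call component might look like the sketch below. The function name, the per-million pricing convention and all rates are hypothetical placeholders, not any vendor's price list.

```python
def token_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_million: float,
    output_rate_per_million: float,
    calls: int = 0,
    rate_per_call: float = 0.0,
) -> float:
    """Cost of one activity driven by tokens (primary) and calls (secondary).

    All rates are hypothetical; substitute the rates you actually pay.
    """
    token_part = (
        input_tokens * input_rate_per_million
        + output_tokens * output_rate_per_million
    ) / 1_000_000
    call_part = calls * rate_per_call
    return token_part + call_part


# Illustrative only: output tokens priced at four times the input rate.
cost = token_cost(
    input_tokens=2_000,
    output_tokens=500,
    input_rate_per_million=1.0,
    output_rate_per_million=4.0,
)
print(f"cost of one request: {cost:.4f}")
```

In this example the 500 output tokens cost as much as the 2,000 input tokens, which is why a single blended token rate would misstate the cost of output-heavy work.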
A TDABC time equation expresses how much of each resource a transaction consumes, adjusting for what makes transactions differ. For an AI-assisted process the equation mixes units: so many input and output tokens, so many GPU-seconds, so many minutes of human review, plus an allowance for retries when the model gets it wrong. That single line is the cost of one outcome. Roll outcomes up and you can attribute AI cost to a process, a product, a customer or a use case, exactly as ABC has attributed overhead for more than thirty years.
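One way to write such an equation as code is sketched below. The class names, field names and rates are illustrative assumptions. In particular, the sketch assumes the retry allowance is the expected number of extra attempts per outcome and that it applies to the machine-side cost only, with human review paid once. If you buy inference purely by the token, set the GPU term to zero, and the reverse if you self-host.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Cost rates per driver unit (all values hypothetical)."""
    input_per_token: float
    output_per_token: float
    gpu_per_second: float     # practical-capacity rate per GPU-second
    review_per_minute: float  # loaded cost of a human reviewer per minute


@dataclass
class Outcome:
    """Resource consumption of one AI outcome."""
    input_tokens: int
    output_tokens: int
    gpu_seconds: float
    review_minutes: float
    retry_allowance: float  # assumption: expected extra attempts per outcome, e.g. 0.25


def unit_cost(o: Outcome, r: Rates) -> float:
    # Machine-side consumption for a single attempt.
    machine = (
        o.input_tokens * r.input_per_token
        + o.output_tokens * r.output_per_token
        + o.gpu_seconds * r.gpu_per_second
    )
    # Human review, assumed to happen once per outcome.
    human = o.review_minutes * r.review_per_minute
    # Retries scale the machine-side cost only (an assumption of this sketch).
    return machine * (1 + o.retry_allowance) + human


rates = Rates(input_per_token=0.000001, output_per_token=0.000004,
              gpu_per_second=0.00125, review_per_minute=0.50)
outcome = Outcome(input_tokens=2_000, output_tokens=500, gpu_seconds=4.0,
                  review_minutes=2.0, retry_allowance=0.25)
print(f"unit cost of one outcome: {unit_cost(outcome, rates):.5f}")
```

With these illustrative figures the two minutes of human review dominate the unit cost, a reminder that the equation has to carry every resource the outcome consumes, not just the model bill.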
Once each AI outcome carries a real cost, set it against the value it creates and rank from most to least profitable to serve with AI. The familiar pattern returns: a profitable core, a flat middle, and a tail where the AI quietly gives margin back. That ranked picture is the whale curve, now drawn for AI. It is where the decisions start: which use cases to scale, which to reprice, and which to stop.
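A minimal sketch of that ranking follows: sort use cases by profit and accumulate. The use-case names and figures are invented for illustration, and "value" is assumed to be already expressed in the same currency as cost.

```python
from itertools import accumulate


def whale_curve(use_cases):
    """Rank use cases from most to least profitable and accumulate profit.

    use_cases: iterable of (name, value, cost) tuples.
    Returns a list of (name, profit, cumulative_profit) tuples.
    """
    ranked = sorted(
        ((name, value - cost) for name, value, cost in use_cases),
        key=lambda item: item[1],
        reverse=True,
    )
    running = accumulate(profit for _, profit in ranked)
    return [(name, profit, total) for (name, profit), total in zip(ranked, running)]


# Hypothetical portfolio: a profitable core, a flat middle, a loss-making tail.
portfolio = [
    ("summaries", 20, 45),
    ("triage", 100, 40),
    ("drafting", 50, 50),
]
for name, profit, cumulative in whale_curve(portfolio):
    print(name, profit, cumulative)
# prints:
# triage 60 60
# drafting 0 60
# summaries -25 35
```

The cumulative column peaks at 60 and ends at 35: in this made-up portfolio the tail gives back 25 of the margin the core earned, which is the pattern the curve is meant to expose.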
Figure (illustrative): The AI time equation, visualised. One AI outcome decomposed into its resource consumption: input and output tokens, GPU-seconds at the practical-capacity rate, human-review minutes, and a retry allowance. The sum is the unit cost.
FinOps tells you where the AI cost landed. TDABC tells you why it occurred, and how much capacity you paid for without using.
Common questions
- Can you apply activity-based costing to AI?
- Yes, and it fits unusually well. AI cost has the same structure ABC was built for: a capacity resource (the GPU) you pay for whether busy or idle, a measurable cost driver (the token or call), and processes that consume a mix of resources. Time-Driven Activity-Based Costing in particular handles AI cleanly, because it costs capacity at a practical-capacity rate and expresses each process as a time equation.
- What is the cost driver for AI?
- The token is the primary cost driver, with the price per token as its rate; the call or request is a secondary driver. Output tokens carry a higher rate than input tokens because they are generated sequentially. Treating the token as a cost driver is what lets you cost AI with the same activity-based logic used for any other resource.
- How does TDABC handle idle GPU cost?
- Directly. TDABC costs a resource at its practical-capacity rate (the full cost of the resource divided by the capacity it can realistically deliver) and reports the unused portion as the cost of unused capacity. With average enterprise GPU utilisation in single digits, that unused-capacity line is large and, under TDABC, visible to management rather than buried in the rate.
- How is this different from AI FinOps?
- FinOps gives you visibility: it meters and tags AI spend so you can see where it landed. TDABC adds the costing logic underneath: why the cost occurred, which activities and drivers caused it, and how much capacity sat idle. Tag-based showback answers where; activity-based costing answers why and how much was wasted. The two are complementary, and the second is what turns spend tracking into profitability.
Put a defensible unit cost on your AI.
We build the model alongside your finance team, so they own it and can update it afterwards.