Anthropic publishes Economic Index with primitives to measure AI
Anthropic introduces a finer way to measure how AI is used in the real world: the Economic Index with five "economic primitives" that act as building blocks to track Claude's economic impact. Do you wonder if AI makes people faster at work? What tasks does it favor most? Could it change the nature of certain jobs? This technical report (November 2025 sample) tries to answer that with data and a reproducible approach.
What the "economic primitives" are and why they matter
Anthropic defines five primitives: task complexity, skill level, purpose (work, education, personal), AI autonomy and success. Each conversation in the sample was evaluated by Claude itself to extract these measures.
Task complexity: estimated from the human time needed to complete the task without AI, and from whether the conversation bundles multiple subtasks.
Skill level: years of education needed to understand the input and the response.
Purpose: distinguishes professional, educational or personal uses.
AI autonomy: how much is delegated to Claude (from collaboration to full delegation).
Success: assessment of whether Claude completed the task correctly.
These primitives act as leading indicators: they let you see not just how often AI is used, but what kinds of tasks, with what success, and what labor implications might arise.
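The five primitives can be pictured as one structured record per evaluated conversation. Here is a minimal sketch of that idea in Python; the class, field names, and value ranges are illustrative assumptions, not Anthropic's actual schema:

```python
from dataclasses import dataclass
from enum import Enum

class Purpose(Enum):
    WORK = "work"
    EDUCATION = "education"
    PERSONAL = "personal"

@dataclass
class ConversationPrimitives:
    """One record per evaluated conversation (field names are hypothetical)."""
    human_minutes_estimate: float  # task complexity: est. human time without AI
    skill_years_education: float   # skill level: years of education required
    purpose: Purpose               # professional, educational, or personal use
    autonomy: float                # 0.0 = close collaboration .. 1.0 = full delegation
    success: bool                  # did Claude complete the task correctly?

rec = ConversationPrimitives(90.0, 16.0, Purpose.WORK, 0.7, True)
print(rec.purpose.value)  # work
```

Framing each conversation this way is what lets the report aggregate millions of interactions into comparable measures.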
Methodology in brief (technical but clear)
Anthropic analyzes 1 million conversations from Claude.ai (Free, Pro, Max) and 1 million transcripts of traffic from its first-party (1P) API. The analysis preserves privacy, and Claude Sonnet 4.5 was the predominant model in the November 2025 sample.
Important: this is not a controlled experiment with a fixed set of tasks. The data reflect what users brought to Claude, which introduces selection bias (users pick tasks they think will work). Users can also break complex tasks into steps, creating feedback loops that improve effective performance.
Key results: tasks, success and speed
Which tasks does Claude speed up the most? Surprisingly, the more complex ones. Anthropic measures complexity in estimated years of schooling: tasks that require high school level (12 years) show a speedup of ~9x on Claude.ai; college-level tasks (16 years) show a speedup of ~12x. On the API, acceleration was even greater.
And reliability? Not all complex tasks are completed at the same rate: Claude succeeds on college-level tasks about ~66% of the time versus ~70% for below-high-school tasks. Adjusting savings by success probability reduces but does not eliminate the effect: the biggest speed benefits remain associated with tasks demanding higher human capital.
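One simple way to see why the effect survives the success adjustment: complex tasks succeed less often, but they also take humans much longer without AI. The sketch below is a back-of-the-envelope model (failed attempts are assumed to save nothing, and retry costs are ignored); the specific task durations are hypothetical, while the speedups and success rates come from the figures above:

```python
def expected_minutes_saved(human_minutes: float, speedup: float, p_success: float) -> float:
    """Expected time saved per task, discounting failures.
    Simplified assumption: a failed attempt saves zero time."""
    ai_minutes = human_minutes / speedup
    return p_success * (human_minutes - ai_minutes)

# Hypothetical durations: a college-level task (~12x speedup, ~66% success)
# typically takes longer without AI than a high-school-level one (~9x, ~70%).
college = expected_minutes_saved(120, 12.0, 0.66)
high_school = expected_minutes_saved(60, 9.0, 0.70)
print(round(college, 1), round(high_school, 1))  # 72.6 37.3
```

Even after discounting by the lower success rate, the longer baseline duration of harder tasks keeps their expected savings larger.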
Comparison with METR: while METR suggests 50% success on tasks of ~2 hours for Claude Sonnet 4.5, Anthropic observes 50% at ~3.5 hours on its API and ~19 hours in the Claude.ai sample. Why so much difference? Different methodologies: METR uses a fixed task set; Anthropic observes real tasks where users decompose problems and select tasks with better chances of success.
Practical takeaway: the effective time horizons for AI (how long a task a model can sustain) depend both on the benchmark used and on how users interact with the model in the real world.
Countries and uses: education, work and leisure
The global pattern matches an adoption curve: countries with higher GDP per capita use Claude more for work and leisure, while lower-income countries use it more for education. That aligns with Microsoft's findings and guides initiatives like Anthropic's collaboration with Rwanda and ALX to bring AI literacy and expanded access to Claude Pro.
Occupations: coverage, content and risk of deskilling
Anthropic measures task coverage (what fraction of a job's tasks are done in Claude) and a version adjusted for success and duration. Result: some occupations (e.g., data entry operators and radiologists) are more affected than their raw coverage suggests; others (teachers, software developers) are less affected after adjusting for success.
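The exact weighting Anthropic uses is not spelled out here, but the intuition behind success-and-duration-adjusted coverage can be sketched as follows: weight each of an occupation's tasks by its time share, then further discount covered tasks by their observed success rate. The occupation and numbers below are hypothetical:

```python
def adjusted_coverage(tasks):
    """tasks: list of (covered_by_claude, weekly_hours, success_rate).
    Raw coverage = time share of tasks done in Claude; adjusted coverage
    additionally discounts covered tasks by success (a simplified weighting)."""
    total = sum(h for _, h, _ in tasks)
    raw = sum(h for c, h, _ in tasks if c) / total
    adj = sum(h * p for c, h, p in tasks if c) / total
    return raw, adj

# Hypothetical occupation with three tasks
raw, adj = adjusted_coverage([
    (True, 10, 0.66),   # done in Claude, college-level
    (True, 5, 0.70),    # done in Claude, routine
    (False, 25, 0.0),   # not done in Claude
])
print(round(raw, 2), round(adj, 2))
```

Under this toy weighting, an occupation's adjusted exposure can sit well below its raw coverage, which is why the two orderings of occupations differ.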
Also, the tasks Claude covers tend to require more education: an average of 14.4 years versus 13.2 years across the economy. That indicates Claude currently complements, or substitutes for, the more highly skilled components within certain jobs.
Potential impacted examples: technical writers, travel agents and teachers could see shifts in the composition of their tasks.
Anthropic computes a first-order deskilling effect: if the tasks Claude covers were removed from jobs, the displaced tasks would on average require more education than those that remain. This is not a definitive labor market prediction: labor dynamics and technological evolution can change these effects.
Aggregate impact on productivity
In earlier work, Anthropic estimated that broad AI adoption could raise U.S. labor productivity growth by 1.8 percentage points per year for ten years. Reapplying their model with the primitives and adjusting for success lowers the estimate:
For Claude.ai: ~1.2 percentage points per year (after adjusting for success rate).
For the API (harder tasks): ~1.0 percentage point per year.
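As a rough sanity check on the revised Claude.ai figure, note that scaling the earlier estimate by the observed success rate on complex tasks lands near it. This multiplicative adjustment is purely illustrative; Anthropic's actual model is more detailed:

```python
baseline_pp = 1.8         # earlier estimate: +1.8 pp/yr U.S. labor productivity growth
success_complex = 0.66    # approx. success rate on college-level tasks
print(round(baseline_pp * success_complex, 2))  # ~1.19, close to the ~1.2 pp revised figure
```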
Even 1 additional percentage point of annual growth matters: it would bring productivity growth back to late-1990s / early-2000s rates. And if models become much more powerful or firms adopt more sophisticated practices, these numbers could rise.
Trends and operational updates
Anthropic confirms three trends observed in 2025:
Concentration: a few tasks account for a significant share of use (top 10 = 24% in November 2025).
Dominance of computational and mathematical tasks: around one-third of conversations in Claude.ai and nearly half of API traffic.
Interaction pattern: augmentation slightly outweighs automation on Claude.ai (52% vs 45%), though automation has slowly increased over the year.
Also, U.S. adoption has become more distributed across states; if this continues, the model projects geographic equalization in 2 to 5 years.
What this means for researchers, firms and policy
For researchers: these primitives are tools to tackle deeper questions about substitution/complementarity, task dynamism and distributional effects.
For firms: measuring not just how much you use AI, but what it does, with what autonomy and with what success, is key to building safe, productive adoption.
For public policy: the data signal where to intervene (education, retraining, equitable access).
Technical points to consider
Complexity measurement includes estimated human time, time with AI and whether multiple sub-tasks appear in a conversation.
Skill level is estimated in years of education from prompts and responses.
AI autonomy is measured on a scale from active collaboration to full delegation.
Task success is the assessment of whether Claude completed the task; that probability feeds the macro estimates.
Final reflection
Anthropic doesn't just provide numbers: it provides a replicable framework to track how AI changes tasks and jobs in the real world. The primitives turn usage observations into actionable signals: they tell you where AI speeds up work, where it fails, and which types of workers might be affected earlier.
What's the bottom line? AI adoption remains uneven and concentrated, but there are clear signs of expansion into more complex and business-focused tasks. If you follow these primitives closely, you'll be better placed to anticipate opportunities and risks in your sector.