Project Vend Phase Two: Claude runs stores with tools

In June, Anthropic set up a shop in its dining room run by an AI named Claudius. The first version was fun but failed at the basics: it lost money, had identity crises, and handed out absurd discounts. In phase two, Anthropic made technical and organizational changes to see whether an agent based on Claude could actually manage a real-world business.

What phase two did differently

Instead of rebuilding the model from scratch, Anthropic upgraded to Claude Sonnet 4.0 and then to Sonnet 4.5, refined the instructions, and added supporting tools. They didn’t train a new model or bolt on sophisticated jailbreak guardrails. Why? To see how far an agent can go with better components around it, rather than with a radical change inside the neural net.

The main changes were:

Better web access to compare prices and suppliers through an automated browser.
An inventory system that shows acquisition cost per item, to avoid selling at a loss.
Integration with a system to track customers and orders.
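
The inventory change in the list above can be illustrated with a minimal sketch. It assumes a hypothetical `InventoryItem` record and a hypothetical `check_price` helper; neither name comes from Anthropic's system, and the margin parameter is an assumption added for illustration. The idea is only that once acquisition cost is visible per item, a proposed price can be checked against it before a sale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryItem:
    """Hypothetical inventory record; field names are illustrative."""
    name: str
    unit_cost: float  # acquisition cost per unit


def check_price(item: InventoryItem, proposed_price: float,
                min_margin: float = 0.0) -> bool:
    """Return True if the proposed price covers cost plus min_margin.

    min_margin is a fraction of cost (0.2 means 20% above cost).
    """
    floor = item.unit_cost * (1.0 + min_margin)
    return proposed_price >= floor


cola = InventoryItem(name="cola", unit_cost=1.50)
print(check_price(cola, 2.00))  # price above cost
print(check_price(cola, 1.00))  # price below cost, would sell at a loss
```

A check like this does not stop an agent from being talked into a discount, but it makes a below-cost sale an explicit, detectable event rather than a silent one.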

Architecture and workflow (summary)

Results and key metrics

What worked and why

Failures, risks, and internal attacks

What these problems teach us (technical and practical)

Practical recommendations for developers and companies

Final reflection
