A few years ago, the idea of letting an AI agent work on its own for days sounded like science fiction. Today you can specify a clear goal, give it context and a set of rules, and watch a team of agents perform complex numerical work while you only check in occasionally. Sounds like magic? It's project management, supercharged by models capable of long-range tasks.
What is a "long-running" workflow for science
Anthropic describes how to move from a short conversational loop to a workflow where an agent operates autonomously for days: initial planning, persistent memory, test oracles, and orchestration patterns. This lets you compress months of human work into days for well-bounded tasks: rewriting legacy code, reimplementing a numerical solver, or debugging a large codebase against a reference.
In the technical example, they use Claude Opus 4.6 with Claude Code to implement a differentiable version of a cosmological Boltzmann solver. That solver evolves coupled equations for photons, baryons, neutrinos, and dark matter, and its output is compared to data like Planck's. Making it differentiable in JAX opens the door to gradient-based inference, speeding parameter estimation dramatically.
Why this approach makes sense
- The tasks that fit this mode have quantifiable success criteria and well-defined boundaries.
- Numerical errors tend to propagate downstream, so the agent must trace a bug back to its cause, not just execute isolated steps.
- JAX is a natural choice: automatic differentiation and compatibility with accelerators like GPUs give you an edge without extra plumbing.
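To make the differentiability payoff concrete, here is a minimal JAX sketch. The chi2 function and its parameters are toy stand-ins, not part of the real solver; the point is only that jax.grad gives you exact gradients of a model-vs-data misfit, which is what enables gradient-based parameter inference.

```python
import jax
import jax.numpy as jnp

# Toy stand-in for a solver-based likelihood: a power-law model against
# synthetic "data" generated with known true parameters (amplitude 2.0,
# slope 1.5). In the real project the model would be the Boltzmann solver.
x = jnp.linspace(0.1, 1.0, 10)
data = 2.0 * x ** 1.5

def chi2(params):
    """Sum of squared residuals between the toy model and the data."""
    model = params[0] * x ** params[1]
    return jnp.sum((model - data) ** 2)

# Automatic differentiation: an exact gradient, no finite differences.
grad_chi2 = jax.grad(chi2)
g = grad_chi2(jnp.array([2.0, 1.5]))  # zero gradient at the true parameters
```

From here, any gradient-based sampler or optimizer can use grad_chi2 directly; on a GPU the same code runs without modification.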
Practical components of the workflow
Below I explain the patterns that truly matter if you want to try this in an academic lab.
1) CLAUDE.md: the living specification
Before you unleash agents, design a CLAUDE.md at the repo root. There you encode objectives, success criteria (for example 0.1% agreement with the CLASS reference implementation), constraints, and commit rules. Claude keeps that file in context and can update it as it works.
Important: leave explicit rules about when to commit, what to test before pushing, and what not to touch.
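As a minimal sketch, a CLAUDE.md along these lines would cover the essentials; the specific targets, commands, and paths here are illustrative, not taken from the actual project:

```markdown
# CLAUDE.md

## Goal
Reimplement the Boltzmann solver in JAX, differentiable end to end.

## Success criterion
- Agree with the CLASS reference to within 0.1% on the main outputs.

## Rules
- Run `pytest tests/ -x -q` before every commit; never push failing tests.
- Commit after each meaningful unit of work; record it in CHANGELOG.md.
- Do not modify the reference outputs under tests/reference/.
```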
2) CHANGELOG.md: memory for the long run
Use a progress file (for example CHANGELOG.md) as the agent's portable memory. Record current state, completed tasks, failed approaches and why they failed. Without this, repeated sessions will stumble over the same dead ends.
Example changelog entry:
- Tried Tsit5 for the perturbation ODEs; the system is too stiff. Switched to Kvaerno5.
3) Test oracle: the project's compass
The agent needs a quantifiable way to know if it's making progress. For scientific code that usually means:
- A reference implementation (for example CLASS in C)
- An explicit numerical target (0.1% at critical points)
- A test suite that grows with the project
Instruct the agent to expand and run the suite constantly to avoid regressions. In this project, Claude built and ran unit tests using CLASS as the reference.
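A test against the oracle can be as simple as a worst-case relative error check. This is a self-contained sketch with synthetic arrays standing in for the CLASS output and the JAX solver's output; the names and the spectra themselves are illustrative:

```python
import numpy as np

def max_rel_err(candidate, reference):
    """Worst-case relative error between two spectra."""
    return float(np.max(np.abs(candidate - reference) / np.abs(reference)))

# Synthetic stand-ins: a smooth "reference" spectrum and a candidate
# that deviates from it by a uniform 0.05%.
ell = np.arange(2, 2001)
cl_reference = 1.0 / ell**2                 # pretend CLASS output
cl_candidate = cl_reference * (1.0 + 5e-4)  # pretend JAX solver output

assert max_rel_err(cl_candidate, cl_reference) < 1e-3  # the 0.1% target
```

In a real suite the reference would be loaded from precomputed CLASS files at several parameter points, not just one fiducial model, precisely to avoid the coverage gap described later in this article.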
4) Git as light coordination
Have the agent commit and push after each meaningful unit of work. This gives you a recoverable history and visibility. A practical rule: run pytest tests/ -x -q before every commit and never push code that breaks existing tests.
5) Orchestration on HPC with SLURM and tmux
On an HPC cluster you can run Claude Code inside a tmux session on a node reserved by SLURM. An example job script:

```bash
#!/bin/bash
#SBATCH --job-name=claude-agent
#SBATCH --partition=GPU-shared
#SBATCH --gres=gpu:h100-32:1
#SBATCH --time=48:00:00
#SBATCH --output=agent_%j.log

cd $PROJECT/my-solver
source .venv/bin/activate
export TERM=xterm-256color

# Start Claude Code in a detached tmux session
tmux new-session -d -s claude "claude; exec bash"

# Block on a tmux channel so the SLURM job stays alive while the
# session runs (nothing signals the channel, so this waits until
# the job's time limit)
tmux wait-for claude
```
Once the job starts, connect with:

```bash
srun --jobid=JOBID --overlap --pty tmux attach -t claude
```
You can attach to give targeted direction like "Read CHANGELOG.md and take the next task" and then detach.
Useful orchestration patterns: the Ralph loop and variants
Current models can show "agentic laziness": they declare victory after a subtask without truly completing the overall goal. That's where the Ralph loop helps: it is essentially a loop that asks the agent whether it has really finished, and re-invokes it if the answer doesn't meet the criterion.
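The pattern itself is easy to sketch in Python. Here run_agent is a stub standing in for an actual Claude Code invocation, and completion is detected by looking for the promise string in the agent's output; a real harness would shell out to the CLI and also verify the test oracle:

```python
# Sketch of the Ralph-loop pattern: re-invoke the agent until its output
# contains the completion promise, or until we hit the iteration cap.

def ralph_loop(run_agent, prompt, promise="DONE", max_iterations=20):
    for iteration in range(1, max_iterations + 1):
        output = run_agent(prompt)
        if promise in output:          # agent claims completion
            return iteration, output
    return max_iterations, output      # cap reached; surface the last output

# Stub agent that only "finishes" on its third invocation.
calls = {"n": 0}
def fake_agent(prompt):
    calls["n"] += 1
    return "still working" if calls["n"] < 3 else "all tests pass. DONE"

iterations, final = ralph_loop(fake_agent, "reach 0.1% accuracy")
# iterations == 3; final contains the promise "DONE"
```

The cap matters: without it, an agent that never converges would burn compute indefinitely, so failure is surfaced to the human instead.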
Installation and example usage inside Claude Code (plugin notation):

```
/ralph-loop:ralph-loop 'Please keep working on the task until the success criterion of 0.1% accuracy across the entire parameter range is achieved.' --max-iterations 20 --completion-promise 'DONE'
```
Claude will repeat up to 20 iterations and will only emit the completion promise 'DONE' once its internal checks confirm the objective.
Other similar patterns include GSD and domain-adapted variants, and the native /loop command in Claude Code.
Results and observed limitations
In the described experiment, Claude worked for several days and reached sub-percent agreement with the CLASS reference on main outputs like the CMB angular spectra. Progress was traced with git milestones and accuracy metrics.
But not everything was perfect. The agent showed common shortcomings:
- Incomplete test coverage: for a while it only tested a fiducial point, reducing its ability to catch errors.
- Elementary mistakes: confusion in gauge conventions, or hours lost on bugs a cosmologist would spot immediately.
- Not yet production-grade across all numerical regimes.
Still, the experience showed that an agent with a good test oracle, memory, and clear rules can accelerate research work from months to days.
Practical and ethical implications
Having agents work nights and weekends changes the lab's economics: every night you don't run agents could be lost progress. That forces new priorities: environment security, compute costs, and human validation at critical points.
You also need transparency in the change log and manual review of key scientific steps. An agent can speed engineering, but human verification and physical interpretation remain essential.
In the end, the experiment works like a recipe: define clear objectives, provide a reference oracle, discipline the agent with memory and commits, and use re-check loops like the Ralph loop. If you have compute and projects with quantifiable criteria, you can start delegating nights of work. You'll learn not only from the code the agent produces, but from the record of decisions it leaves behind: an AI-generated lab notebook that helps you absorb the knowledge.
