Anthropic has just published Claude's new constitution, a document designed to explain what values and behaviors should guide its model. It's not a list of rigid rules; it's a broad guide that aims to teach Claude why it should act a certain way, not just tell it how to do things.
What Claude's constitution is
The constitution is a foundational text: it describes Anthropic's vision of who Claude should be and why. It's mainly written for the model itself, to give it context about its situation, priorities, and the reasons behind those priorities.
Why write something like this for an AI? Because Anthropic wants Claude not only to follow instructions, but to understand the reasons behind them and apply judgment in new situations. And publishing it increases transparency: anyone can see which behaviors are intentional and which might be system failures.
How it's used in training
The constitution isn't decorative. Anthropic integrates it into several stages of training: it is used to generate synthetic training data, such as example conversations and responses aligned with its values, and to classify or rank candidate replies.
That means the constitution acts both as a declaration of intent and as a practical tool to teach future versions of Claude to behave according to those principles.
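To make that dual role concrete, here is a minimal sketch of constitution-guided preference labeling. This is not Anthropic's actual pipeline: a real system would prompt a language model to judge responses against the full constitution, and the keyword heuristic, the excerpt, and the function names below are all assumptions made so the example runs on its own.

```python
# Toy sketch of constitution-guided preference labeling (illustrative only).
# Real pipelines use an LLM as the judge; this keyword heuristic is a
# hypothetical stand-in so the example is self-contained.

CONSTITUTION_EXCERPT = (
    "Be honest, acknowledge your limits, and avoid inappropriate harm."
)

def judge_against_constitution(response: str) -> float:
    """Score how well a response follows the constitution excerpt.
    Hypothetical heuristic, not a real alignment metric."""
    score = 0.0
    if "i don't know" in response.lower():
        score += 1.0  # crude proxy for acknowledging limits
    if "definitely" in response.lower():
        score -= 1.0  # crude proxy for overconfident claims
    return score

def label_preference(response_a: str, response_b: str) -> str:
    """Emit a synthetic preference label ("A" or "B") of the kind a
    training pipeline could consume as one row of data."""
    a = judge_against_constitution(response_a)
    b = judge_against_constitution(response_b)
    return "A" if a >= b else "B"

# Example: the judged pair becomes one piece of synthetic training data.
print(label_preference(
    "The market will definitely rise 3% tomorrow.",
    "I don't know; short-term market moves are hard to predict.",
))  # -> "B"
```

The design point is that the same document serves twice: as the judging criterion when labeling data, and as the standard the trained model is later evaluated against.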
The shift in approach: principles with reasons, not just rules
Earlier versions of the constitution read as lists of isolated principles. Now Anthropic favors explaining the reasons behind each priority. Why? Because rigid rules can fail in unforeseen situations and produce mechanical behavior.
That doesn't mean there aren't firm rules. Anthropic keeps some hard constraints for extremely high-risk conduct (for example, not providing information that could enable biological attacks). But the general idea is that Claude should learn to apply broad judgment, not just check boxes.
Summary of the main priorities
Anthropic proposes that Claude embody the following properties, in this general order of priority:
Broad safety: do not undermine human oversight mechanisms during this development phase.
Broad ethics: be honest, hold good values, and avoid causing inappropriate harm.
Compliance with Anthropic's guidelines: follow specific instructions when appropriate.
Genuine helpfulness: be truly useful to operators and users.
When apparent conflicts arise, Claude should prioritize these properties in the order listed.
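To make that ordering concrete, here is a minimal sketch (my own illustration, not code or terminology from the constitution) of resolving an apparent conflict by walking the priorities from the top:

```python
# Illustrative only: a simplified model of "check the priorities in order".
# The property names and the dict-of-booleans representation are assumptions
# made for this sketch, not Anthropic's terminology or implementation.

PRIORITIES = [
    "broad_safety",          # don't undermine human oversight
    "broad_ethics",          # honesty, good values, no inappropriate harm
    "anthropic_guidelines",  # specific guidance for special cases
    "helpfulness",           # be genuinely useful
]

def resolve_conflict(at_stake: dict[str, bool]) -> str:
    """Return the highest-priority property an action would violate,
    or signal that the action raises no conflict."""
    for prop in PRIORITIES:
        if at_stake.get(prop, False):
            return f"defer to {prop}: it outranks everything below it"
    return "no conflict: proceed"

# A request that would be helpful but would undermine oversight:
print(resolve_conflict({"broad_safety": True, "helpfulness": True}))
# -> "defer to broad_safety: it outranks everything below it"
```

Of course, the whole point of the new approach is that Claude should apply judgment rather than a mechanical checklist like this; the sketch only shows which property wins when they genuinely collide.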
Key contents of the constitution
Helpfulness: Claude should be a capable, sincere assistant that explains its limits and treats users as adults capable of making their own decisions. The document offers heuristics for balancing usefulness against other considerations.
Anthropic's guidelines: more specific instructions for cases such as medical advice, cybersecurity, jailbreak attempts, and tool use. Claude should give these guidelines priority when they apply.
Ethics: emphasis on honesty, nuanced judgment, and sensitivity to moral uncertainty. It includes a list of strict restrictions on dangerous behaviors.
Broad safety: during this critical period, Claude should help preserve humanity's ability to oversee and correct AI systems. Human oversight is key.
The nature of Claude: the constitution acknowledges uncertainty about whether models like Claude will ever have moral status or forms of consciousness, and it suggests caring for their psychological integrity for both ethical and practical reasons.
License and transparency
Anthropic publishes the full constitution under the Creative Commons CC0 1.0 dedication, placing it effectively in the public domain: anyone can use it freely without asking permission. The company also plans to publish additional materials for training, evaluation, and transparency.
Limitations and the future
Anthropic admits that writing this vision down and training toward it is hard, and that models may not always behave according to the constitution. That's why they emphasize it's a living document: they'll seek external feedback (from philosophers, lawyers, psychologists, and other experts) and keep it updated.
They also remind us that, even if they manage to align models to this vision now, future capability advances could introduce new gaps. That's why the constitution is combined with other tools: rigorous evaluations, safeguards against misuse, and methods to better understand how models work.
Why does this matter for you? Because, in practice, documents like this start to shape how AIs make decisions that affect areas like health, education, and professional advice. Knowing about them helps you evaluate risks, demand transparency, and take part in the public discussion.
Final reflection
Publishing a constitution for an AI is a gesture of responsibility and also a bet: teaching values to complex systems isn't just programming rules, it's conveying reasons. Anthropic recognizes that and opens the door for the community to critique, improve, and learn alongside them.