When the crash disappears: control in 5 models

Jun 8, 2026Keryc Díaz5 minutes

I told a story I liked: a woodlands legend called "Run on Oona's Hoard" that rewrote a 1929 bank run as folklore. An owl who guards honey read the panic, started liquidating, and the honey price plunged from 10 to 3 in a few turns. Who programmed that directly? No one.

The thesis was simple: give a small model a role and a budget, and market behavior emerges on its own.

The experiment and the reconstruction

I reconstructed the wood with one key change. Instead of a single model controlling five creatures, I put together a council made of five different models: an OpenAI model, an NVIDIA model, an OpenBMB model, and two copies of a half-billion-parameter model I fine-tuned myself. The idea was honesty: if you're going to claim small models can sustain a living economy, let five distinct architectures make distinct decisions — not one model wearing five hats.

I also reworked the operator. In the new version the player is a shadowy financier: takes a short position, whispers a true tip to prime the fall, activates the legend and cashes out when the price crashes. I made the loop visible on-screen: objective, scoreboard and a first order with one click. Making the promise visible is the fastest way to discover the promise is false.

When I tried the move — short the honey and trigger Run on Oona's Hoard — the honey didn't crash. It rose. The council models, reading the rumor that the vault was empty and a bad harvest signal, didn't sell: they hoarded. Scarcity, not panic. The short lost money and the narrative I had written without irony was that the bet had gone sour.

Why the emergence failed

The technical lesson is clear: in an economy of agents, the reference price is not a dial you turn from outside. It is the residue of what agents actually choose to trade. The original crash was real, but it relied on the disposition of a single model; it wasn't a robust system property. Change the population and the emergent behavior can evaporate.

I tried three ways to force the fall, all failed live:

I left the legend as pure rumor and trusted the agents would react. They didn't sell.
I doused each creature with a windfall of honey, thinking the glut would collapse demand. It worked with my test policy mechanic (a quick stand-in) because that policy obeys rigid desire thresholds, but real models ignored the flood and acted on their own read.
I enlarged the short position, which only increased the loss.

Three live runs, three losses: -15, -26 and -27 pebbles. Every lever I moved was just an input to agents' decisions, and the agents were free to reject them. You can't steer a heterogeneous population with a mechanical shock: the shock only tilts a choice that remains theirs.

There is also an editorial trap worth naming: the cheap stand-in that lets you iterate fast is also the one most likely to flatter the wrong solution. When the test policy and the real agents diverge, the liar is the stand-in.

Attempt	Mechanism	Honey at settlement	Gambit P&L
Original, one model	that model chose to sell	10 to 3	showcase win
Council, rumor only	five models chose to hoard	rose due to scarcity	-15
Council, inventory glut	demand collapse, only `test policy`	almost no change	-26 to -27
Council, override in settlement	price halved by fiat	crashes post-cleanup	+40

Table 1. The same play in four worlds. The crash was emergent and fragile under one model, absent with a heterogeneous council, and reliable only when I authorized it in the settlement.

The solution: authorship in the `settlement seam`

I stopped trying to convince the agents and moved to making the run true by construction. A bank run is, by definition, a crash. So the legend now applies the drop in settlement: after the market finishes clearing each turn, I overwrite the reference price and the commodity devalues by half. Agents can trade and gossip all they want; then the run falls as fact and the short who anticipated it collects their gain.

That sounds like conceding to the emergence, but it isn't. The emergent layer — the five models negotiating, gossiping, hoarding, nursing grudges — still makes the wood feel alive. What I learned is practical: you don't get reliable results by pushing harder on the inputs of the emergence. You get reliable results by precisely choosing the seam where you impose a deterministic override, and leaving everything upstream free. Emergence for texture, authorship for the moments that must occur. The craft is knowing which is which and where the seam lies.

Technical implications and recommendations

Heterogeneity matters: validating an emergent behavior with a single agent is anecdotal. Test different populations and varied architectures before you declare a system-wide property.
Don't trust stand-ins blindly: a test policy that obeys rigid rules will give you speed, but also false positives. If the replication only happens with the stand-in, the finding isn't reliable.
Authorship in the settlement: if you need deterministic outcomes (liquidations, systemic losses, regulatory effects), implement the downstream change explicitly. That preserves the emergent richness upstream and guarantees critical conditions are met.
Control vs. texture: think of your game/market architecture in layers. Use agents for texture and experiential robustness. Use deterministic seams for invariants the system must uphold under any mix of agents.

Final reflection

The moral isn't technophobia or a rejection of emergence. It's methodological humility. Emergence is fascinating and useful, but contingent. If you treat an emergent run in a demo as a law of nature, you expose yourself to bitter surprises when the agent population changes or when you move from a stand-in to live models.

I've made these same mistakes at larger scales and with higher stakes. It was productive to be wrong again in a forest where the only things at risk were a few pebbles and a story I had told with too much confidence. Small models, big adventures, and sometimes a crash you have to authorize yourself so the story ends the way you want.

Original source

https://huggingface.co/blog/build-small-hackathon/thousand-token-wood-sim-v3

Stay up to date!

Get AI news, tool launches, and innovative products straight to your inbox. Everything clear and useful.

When the crash disappears: control in 5 models

Jun 8, 2026Keryc Díaz5 minutes

The thesis was simple: give a small model a role and a budget, and market behavior emerges on its own.

The experiment and the reconstruction

Why the emergence failed

I tried three ways to force the fall, all failed live:

I left the legend as pure rumor and trusted the agents would react. They didn't sell.
I doused each creature with a windfall of honey, thinking the glut would collapse demand. It worked with my test policy mechanic (a quick stand-in) because that policy obeys rigid desire thresholds, but real models ignored the flood and acted on their own read.
I enlarged the short position, which only increased the loss.

Attempt	Mechanism	Honey at settlement	Gambit P&L
Original, one model	that model chose to sell	10 to 3	showcase win
Council, rumor only	five models chose to hoard	rose due to scarcity	-15
Council, inventory glut	demand collapse, only `test policy`	almost no change	-26 to -27
Council, override in settlement	price halved by fiat	crashes post-cleanup	+40

Table 1. The same play in four worlds. The crash was emergent and fragile under one model, absent with a heterogeneous council, and reliable only when I authorized it in the settlement.

The solution: authorship in the `settlement seam`

Technical implications and recommendations

Heterogeneity matters: validating an emergent behavior with a single agent is anecdotal. Test different populations and varied architectures before you declare a system-wide property.
Don't trust stand-ins blindly: a test policy that obeys rigid rules will give you speed, but also false positives. If the replication only happens with the stand-in, the finding isn't reliable.
Authorship in the settlement: if you need deterministic outcomes (liquidations, systemic losses, regulatory effects), implement the downstream change explicitly. That preserves the emergent richness upstream and guarantees critical conditions are met.
Control vs. texture: think of your game/market architecture in layers. Use agents for texture and experiential robustness. Use deterministic seams for invariants the system must uphold under any mix of agents.

Final reflection

Original source

https://huggingface.co/blog/build-small-hackathon/thousand-token-wood-sim-v3

Stay up to date!

Get AI news, tool launches, and innovative products straight to your inbox. Everything clear and useful.

The experiment and the reconstruction

Why the emergence failed

The solution: authorship in the settlement seam

Technical implications and recommendations

Final reflection

Original source

Stay up to date!

The experiment and the reconstruction

Why the emergence failed

The solution: authorship in the settlement seam

Technical implications and recommendations

Final reflection

Original source

Stay up to date!

The solution: authorship in the `settlement seam`

The solution: authorship in the `settlement seam`