When you think of powerful language models, do you picture a single monster trained on the entire universe of data? FlexOlmo—and now FlexMoRE—show another path: models built as pieces you assemble, letting institutions with sensitive data contribute without sharing that data.
What FlexMoRE offers
FlexMoRE was born inside the Danish Foundation Models (DFM) project as a practical adaptation of the FlexOlmo architecture. The core idea of FlexOlmo is simple: instead of passing every token through a monolithic model, a router sends each token only to a subset of specialized experts. At inference time, only the selected experts run, not the whole model.
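To make the routing idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is not FlexOlmo's actual code: the function names (`route_token`, `moe_forward`), the toy experts, the choice of `k=2`, and the decision to apply softmax only over the selected experts are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected scores only (one common
    convention, assumed here).
    """
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    top = ranked[:k]
    weights = softmax([router_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and mix their outputs by router weight."""
    out = [0.0] * len(x)  # assumes each expert preserves the input dimension
    for idx, w in route_token(router_scores, k):
        y = experts[idx](x)  # unselected experts are never called
        out = [o + w * yi for o, yi in zip(out, y)]
    return out


# Hypothetical toy experts that just scale the input and log that they ran.
calls = []


def make_expert(scale, name):
    def expert(x):
        calls.append(name)
        return [scale * v for v in x]
    return expert


experts = [
    make_expert(1.0, "general"),
    make_expert(2.0, "medical"),
    make_expert(3.0, "legal"),
]

# In a real model the scores come from a learned router; here they are fixed.
output = moe_forward([1.0, 2.0], experts, router_scores=[0.1, 2.0, 2.0], k=2)
print(calls, output)
```

The point of the sketch is the `calls` log: the "general" expert is never executed for this token, which is where the inference-time savings come from.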
The challenge DFM faced was scale: in FlexOlmo each expert is often the size of a full model. That works if you have few experts, but as more groups (hospitals, universities, companies) contribute, the system becomes too big to run on ordinary machines.
FlexMoRE changes one key assumption: not all experts must be the same size. It keeps some experts at full size but replaces most with compact versions called low-rank adapters. Those adapters approximate what a large expert learned using far fewer parameters. The parameter that defines how much each adapter is reduced is called the rank.
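A rough parameter count shows why shrinking experts matters. The sketch below assumes a compact expert replaces each weight matrix W (d_out x d_in) with a product of two thin matrices B (d_out x r) and A (r x d_in), which is the usual low-rank factorization; the text above does not spell out FlexMoRE's exact construction, so treat this as an illustration. The two-matrix feed-forward layout and the dimensions 4096 and 11008 are likewise hypothetical.

```python
def full_expert_params(d_model, d_hidden):
    """Parameters in an assumed two-matrix feed-forward expert (no biases)."""
    return 2 * d_model * d_hidden


def low_rank_params(d_in, d_out, r):
    """Parameters in B (d_out x r) and A (r x d_in) approximating one matrix."""
    return r * (d_in + d_out)


def adapter_expert_params(d_model, d_hidden, r):
    """Both feed-forward matrices replaced by rank-r factorizations."""
    return 2 * low_rank_params(d_model, d_hidden, r)


d_model, d_hidden = 4096, 11008  # illustrative sizes, not from the source
full = full_expert_params(d_model, d_hidden)

for r in (8, 64, 512):
    small = adapter_expert_params(d_model, d_hidden, r)
    print(f"r={r}: {small:,} params, {small / full:.2%} of a full expert")
```

Under these assumptions the cost of a compact expert grows linearly with r, so a system can afford many small contributors alongside a handful of full-size experts.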
