NVIDIA KGMON (NeMo Agent Toolkit) Data Explorer reached first place in the DABStep benchmark by using a strategy that separates heavy learning from fast inference. What’s the key? Build reusable tools during a learning phase, then run responses with a small, agile agent that orchestrates those tools.
What problem it solves
Agents that rely on text search fail when the information lives in tables and requires multi-step reasoning. Complex questions about tabular data aren’t fixed by a web snippet. Have you ever seen a model answer one thing well, then get lost when you cross two CSV files and a set of business rules?
This project was built for that: multi-step questions, stateful tools, and strict validation.
Architecture in three phases
The central idea is to split responsibilities: spend compute once to produce robust tools, then use those tools many times efficiently.
-
Fase de Learning (aprendizaje): se usa un modelo pesado (por ejemplo Opus 4.5/4.6) en un loop multi-paso con un conjunto completo de herramientas (intérprete Python stateful, bash, detector de estructura de archivos, retriever). El agente resuelve varios casos representativos, valida contra ground truth y sintetiza soluciones en una biblioteca reutilizable y ejemplos few-shot.
