Mellea 0.4.0 and Granite Libraries: verifiable AI workflows
Mellea 0.4.0 arrives alongside three Granite Libraries to make it easier to build generative workflows that are structured, verifiable, and security-aware. What does that mean in practice for your AI project? Less improvisation with prompts and more predictable, maintainable generative programs.
What Mellea 0.4.0 brings and why it matters
Mellea is an open source Python library for writing generative programs that replace brittle, ad-hoc prompting with structured workflows. Have you felt your LLM pipelines become fragile and hard to debug? Mellea offers concrete solutions: constrained decoding, structured repair loops, and composable pipelines.
In 0.4.0 you'll see key improvements:
Native integration with the Granite Libraries, with a standardized API that uses constrained decoding to ensure outputs match a schema.
The instruct-validate-repair pattern applied with rejection sampling strategies to detect and correct faulty outputs.
Observability hooks for event-driven callbacks, useful for monitoring and traceability in production.
Technically, that means fewer unexpected model responses and stronger guarantees around format and validity. Sound useful for systems that require compliance or structured answers? Exactly.
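The instruct-validate-repair loop with rejection sampling can be sketched in plain Python. This is a hand-rolled illustration of the pattern, not Mellea's actual API; the function names, the repair-hint mechanism, and the attempt budget are all assumptions for illustration:

```python
import json

def instruct_validate_repair(generate_fn, validate_fn, repair_hint, budget=3):
    """Rejection-sampling loop: keep regenerating until the output
    passes validation or the attempt budget is exhausted."""
    extra_instruction = ""
    for attempt in range(budget):
        candidate = generate_fn(extra_instruction)
        ok, reason = validate_fn(candidate)
        if ok:
            return candidate, attempt + 1  # accepted output, attempts used
        # Feed the failure reason back so the next sample can repair it.
        extra_instruction = repair_hint.format(reason=reason)
    raise RuntimeError(f"validation failed after {budget} attempts")

# Toy stand-ins for a model and a validator.
_responses = iter(['not json', '{"answer": 42}'])

def fake_generate(extra):
    return next(_responses)

def validate_json(text):
    try:
        json.loads(text)
        return True, ""
    except ValueError as e:
        return False, str(e)

result, attempts = instruct_validate_repair(
    fake_generate, validate_json,
    "Previous output was invalid: {reason}. Return JSON only.")
print(result, attempts)  # the first sample is rejected, the second passes
```

The key design point is that rejected samples are not silently discarded: the validation failure becomes part of the next instruction, which is what turns plain rejection sampling into a repair loop.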
What the Granite Libraries are and how you use them
A Granite Library is a set of specialized adapters (LoRA adapters) that perform well-defined operations on parts of an input string or conversation. Instead of relying on broad prompts, each adapter specializes: query rewriting, hallucination detection, policy validation, and more.
Practical advantages:
Better accuracy on targeted tasks without retraining the base model.
Modest parameter cost thanks to LoRA, compared to full fine-tuning.
Interchangeable modules that plug into RAG pipelines and agents.
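To see why the parameter cost is modest, compare fully fine-tuning a d×d weight matrix with a rank-r LoRA update, which only adds two low-rank factors (d×r and r×d) per adapted matrix. The dimensions below are illustrative round numbers, not Granite's actual configuration:

```python
def lora_params(d, r, n_matrices):
    # Each adapted d x d matrix gains two factors: A (d x r) and B (r x d).
    return 2 * d * r * n_matrices

def full_finetune_params(d, n_matrices):
    return d * d * n_matrices

d, r, n = 4096, 16, 64  # hidden size, LoRA rank, adapted matrices (assumed)
lora = lora_params(d, r, n)
full = full_finetune_params(d, n)
print(f"LoRA: {lora:,} vs full: {full:,} -> ratio {lora / full:.3%}")
```

The ratio reduces to 2r/d per matrix, so at rank 16 and hidden size 4096 the adapter trains well under 1% of the parameters a full fine-tune would touch.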
The three libraries released today for the granite-4.0-micro model are:
granitelib-core-r1.0: focused on requirement validation within the instruct-validate-repair loop.
granitelib-rag-r1.0: covers tasks in RAG agent pipelines, including pre-retrieval, post-retrieval, and post-generation.
granitelib-guardian-r1.0: models specialized in security, factuality, and policy compliance.
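Because each library targets a distinct stage, a pipeline can dispatch to the right adapter by task. The registry below is a hypothetical illustration of that modularity; the task names and the mapping are assumptions, not a published API:

```python
# Hypothetical task -> adapter mapping for the three libraries.
ADAPTERS = {
    "validate_requirements": "granitelib-core-r1.0",
    "query_rewrite": "granitelib-rag-r1.0",
    "post_retrieval_rerank": "granitelib-rag-r1.0",
    "safety_check": "granitelib-guardian-r1.0",
}

def select_adapter(task):
    """Return the adapter ID responsible for a pipeline task."""
    try:
        return ADAPTERS[task]
    except KeyError:
        raise ValueError(f"no adapter registered for task: {task}")

print(select_adapter("query_rewrite"))  # granitelib-rag-r1.0
```

Swapping one adapter for another then becomes a one-line configuration change rather than a prompt rewrite.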
Example of a concrete flow
Imagine a customer support system with document retrieval (RAG). With Mellea + Granite:
granitelib-rag prepares and rewrites the query to improve retrieval.
You retrieve documents and generate a base response.
granitelib-core validates that the response matches an expected JSON schema (required fields, formats).
If validation fails, the instruct-validate-repair loop kicks in with rejection sampling to regenerate until it passes verification.
granitelib-guardian runs final security and factuality checks before sending the response.
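The five steps above can be composed into a single function. Every stage here is a stub standing in for the corresponding adapter call; the function names and return shapes are assumptions for illustration, not the real Granite interfaces:

```python
def rewrite_query(q):        # granitelib-rag: pre-retrieval rewrite (stub)
    return q.strip().lower()

def retrieve(q):             # document retrieval (stub)
    return [f"doc about {q}"]

def generate(q, docs):       # base model generation (stub)
    return {"answer": f"Based on {docs[0]}: ...", "sources": docs}

def validate_schema(resp):   # granitelib-core: required-field check (stub)
    return isinstance(resp, dict) and {"answer", "sources"} <= resp.keys()

def guardian_check(resp):    # granitelib-guardian: safety check (stub)
    return "forbidden" not in resp["answer"]

def answer(query, budget=3):
    q = rewrite_query(query)
    docs = retrieve(q)
    for _ in range(budget):  # instruct-validate-repair via rejection sampling
        resp = generate(q, docs)
        if validate_schema(resp):
            break
    else:
        raise RuntimeError("schema validation failed")
    if not guardian_check(resp):
        raise RuntimeError("guardian rejected the response")
    return resp

print(answer("  How do I reset my password?  ")["sources"])
```

Note the ordering: schema validation runs inside the repair loop, while the guardian check is a final gate, so an unsafe answer is rejected outright rather than resampled.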
With observability enabled, each step emits events and metrics: latency, number of rejections, reasons for repair, etc. Very useful for audits and debugging.
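Event-driven observability can be as simple as callbacks fired per stage. This sketch collects latency and rejection counts as the article describes; the hook names and the `Metrics` class are assumptions, not Mellea's built-in hooks:

```python
import time
from collections import defaultdict

class Metrics:
    """Collects per-stage events emitted by the pipeline."""
    def __init__(self):
        self.latency = {}
        self.rejections = defaultdict(int)

    def on_stage(self, name, started):
        self.latency[name] = time.perf_counter() - started

    def on_rejection(self, stage, reason):
        self.rejections[stage] += 1

metrics = Metrics()

def run_stage(name, fn, *args):
    """Run one pipeline stage and emit a timing event when it finishes."""
    started = time.perf_counter()
    try:
        return fn(*args)
    finally:
        metrics.on_stage(name, started)

# Simulated repair loop emitting rejection events.
for reason in ("missing field", "bad format"):
    metrics.on_rejection("validate", reason)
run_stage("rewrite", str.lower, "QUERY")

print(metrics.rejections["validate"], sorted(metrics.latency))
```

In production you would forward these events to your logging or APM backend instead of keeping them in memory.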
Technical considerations for teams
Constrained decoding: ideal when you need outputs in a concrete format (JSON, XML, etc.). It reduces the need for defensive parsers.
Rejection sampling and repair loops: they improve robustness, but increase latency. Use them where accuracy matters more than response time.
LoRA adapters vs fine-tuning: adapters are quick to train and lightweight; if your pipeline needs task-specific specialization without sacrificing the base model’s general capabilities, LoRA is a good choice.
Observability: event hooks let you integrate with logging systems and APM. I recommend instrumenting rejection metrics, repair attempts, and output confidence.
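Even with constrained decoding, a lightweight post-check keeps the output contract explicit and replaces a pile of defensive parsing. This stdlib-only validator is a minimal sketch; the schema itself is a made-up example, not a Granite-defined contract:

```python
import json

SCHEMA = {"answer": str, "confidence": float, "sources": list}  # assumed contract

def check_output(text):
    """Parse model output and verify required fields and their types."""
    data = json.loads(text)  # raises ValueError on malformed JSON
    for field, typ in SCHEMA.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], typ):
            raise ValueError(f"{field} should be {typ.__name__}")
    return data

ok = check_output(
    '{"answer": "reset via settings", "confidence": 0.9, "sources": ["kb/123"]}')
print(ok["confidence"])  # 0.9
```

A check like this is also the natural place to hook rejection metrics: every raised `ValueError` is one more data point for the repair-loop dashboard.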
Best practices and use cases
Regulated systems or ones that must comply with policies: combine granitelib-guardian with strict validations and event logging for auditability.
Advanced RAG pipelines: use granitelib-rag to improve retrieval and filtering before and after generation.
Experiment with adapter granularity: splitting responsibilities (for example, one adapter only for hallucination detection) makes testing and deployment easier.
Final thoughts
With Mellea 0.4.0 and the Granite Libraries you get practical tools to move LLM systems from fragile prototypes to maintainable, verifiable flows. It’s not magic: it’s engineering applied to generative models. If you run models in production, these pieces help reduce risk and gain traceability without sacrificing capability.