RiskRubric: standardizes AI risk assessment

Sep 17, 20253 minutes

Imagine you step into the huge model storefront on Hugging Face and ask yourself: how do I know which one is safe for my project? Now there’s a tool that answers that question with numbers and clear recommendations. (huggingface.co)

What RiskRubric is and why it matters

RiskRubric.ai is an initiative led by Cloud Security Alliance and Noma Security, with contributions from Haize Labs and Harmonic Security, that aims to give standardized risk scores to AI models. The idea is simple: evaluate models consistently across six pillars so any developer or organization can compare and decide with data. (huggingface.co)

Sound useful? Think of this as the safety label for models; you no longer depend only on vague descriptions or the author's reputation.

How they evaluate a model

RiskRubric runs an automatic, reproducible set of tests that include:

1,000+ reliability tests to check consistency and handling of edge cases.
200+ adversarial probes to detect jailbreaks and prompt injections.
Automated code scanning and documented review of training data.
Privacy assessment that looks for data retention or leakage.
Structured tests to evaluate harmful content and other security risks. (huggingface.co)

The result is 0-100 scores for each pillar and a cumulative letter grade A-F, along with concrete vulnerabilities and mitigation recommendations. Plus, the platform lets you filter models by what matters to you: privacy, reliability, security, etc. (huggingface.co)

What they found up to September 2025

By evaluating open and closed models with the same criteria, RiskRubric revealed some interesting trends. For example, many open models outperform closed ones in dimensions like transparency, where open development helps. Total scores ranged from 47 to 94, with a median of 81. Fifty-four percent of models are at level A or B, but there’s a long tail of models with medium or low scores that could be attractive targets for attackers. (huggingface.co)

What does this mean for you? Don’t treat an average score as a synonym for safety. If you’re deploying a model in production, consider setting a minimum threshold (for example 75) and prioritize those that already have concrete mitigations.

Practical lessons and trade-offs

One key observation is that security and transparency sometimes pull in different directions. Stricter guardrails can make a model seem opaque if responses are denied without explanation. The recommendation is to combine robust controls with mechanisms that explain why a request was denied and provenance signals. That way you keep security without losing user trust. (huggingface.co)

Also, improving your security posture tends to reduce both safety and harm risks; well-designed defenses often benefit multiple pillars at once.

How you can participate today

If you have a model, you can request an evaluation or suggest existing models to be added to the assessment. The initiative publishes results and roadmaps so the community can collaborate on patches and safer variants. It’s a practical opportunity for developers, security teams, and product owners to work together to raise the bar for everyone. (huggingface.co)

If you want to see the initiative in detail visit RiskRubric.ai or check the original Hugging Face post where the methodology and findings are explained. (huggingface.co)

Final reflection

Democratizing AI security isn’t just a nice idea; it’s necessary if we want adoption to continue without creating more risks. This initiative aims to make security a comparable, actionable feature, not a mystery. Ready to demand transparency and scores when you choose a model? That shifts the conversation from I trust the provider to I can verify and improve.

Stay up to date!

Get AI news, tool launches, and innovative products straight to your inbox. Everything clear and useful.

What RiskRubric is and why it matters

Sound useful? Think of this as the safety label for models; you no longer depend only on vague descriptions or the author's reputation.

How they evaluate a model

RiskRubric runs an automatic, reproducible set of tests that include:

1,000+ reliability tests to check consistency and handling of edge cases.

200+ adversarial probes to detect jailbreaks and prompt injections.

Automated code scanning and documented review of training data.

Privacy assessment that looks for data retention or leakage.

Structured tests to evaluate harmful content and other security risks. (huggingface.co)