CyberSecQwen-4B: local model for cyber defense
CyberSecQwen-4B was born from a practical question: can cyber defense use models that run locally, preserve sensitive evidence, and still perform like larger models? The team's answer was yes, with caveats. Here I explain why it matters, how they trained it, and what it means for your SOC, your vulnerability researchers, or your critical-infrastructure team.
Why local models matter in cyber defense
Can you imagine pasting a dump of credentials or a suspicious binary into a public API? Don’t do it. In defense, the data is the vulnerability. Sending evidence to an external service can be exactly the leak you’re trying to avoid.
Also, cost per call and air-gapped environments are real constraints. A mid-size SOC processes thousands of alerts daily: outsourcing CVE explanations or mappings to CWE becomes expensive and, sometimes, impossible from isolated networks.
Finally, adversaries automate everything: from generating phishing in dozens of languages to chaining agentic tools. If defense is going to compete, it needs models you can run on your hardware, without sending secrets out.
Local isn’t just running on your laptop. It’s being able to deploy on laptops, on-prem GPUs, and partially connected environments, without sacrificing quality on the tasks that matter.
What CyberSecQwen-4B is and what it shows
CyberSecQwen-4B is a specialized 4B-parameter fine-tune of the Qwen3-4B-Instruct-2507 checkpoint. The bet: a carefully tuned 4B model can match or beat 8B models on concrete CTI (Cyber Threat Intelligence) tasks while still fitting on a consumer GPU with 12 GB of VRAM.
Key results on CTI-Bench (n=5, temp 0.3):
| Metric (CTI-Bench) | CyberSecQwen-4B | Foundation-Sec-Instruct-8B | Δ |
| --- | --- | --- | --- |
| CTI-MCQ (2,500 items) | 0.5868 ± 0.0029 | 0.4996 | +8.7 pp |
| CTI-RCM (1,000 CVE→CWE items) | 0.6664 ± 0.0023 | 0.6850 | −1.9 pp |
| Parameters | 4 B | 8 B | half the size |
In short: it keeps 97.3% of the 8B’s RCM accuracy and strongly outperforms on MCQ. For a defender choosing what to deploy, that resource/performance trade-off is what matters.
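If you want to sanity-check those headline figures yourself, the arithmetic falls straight out of the table:

```python
# Quick sanity check of the headline claims, using the numbers from the table above.
rcm_4b, rcm_8b = 0.6664, 0.6850
mcq_4b, mcq_8b = 0.5868, 0.4996

print(f"RCM retention: {rcm_4b / rcm_8b:.1%}")           # -> 97.3% of the 8B's accuracy
print(f"MCQ delta: {(mcq_4b - mcq_8b) * 100:+.1f} pp")    # -> +8.7 percentage points
```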
How it was trained (technical ingredients)
The work ran entirely on a single AMD Instinct MI300X with 192 GB HBM3 using ROCm 7 and the vLLM stack. That setup enabled full bf16 training, FlashAttention-2 forward+backward, and a 4096 sequence length without quantization or sharding tricks.
Main components and versions:
Hardware: AMD Instinct MI300X 192 GB · gfx942
ROCm: 7.0
Docker: vllm/vllm-openai-rocm:latest
PyTorch: 2.6.0 (ROCm)
flash-attn: 2.8.3
vLLM: 0.10.1
Hyperparameters and recipe details:
Base: Qwen3-4B-Instruct-2507 (fine-tuned from an instruction-tuned (IT) checkpoint, not a pure pretrain)
LoRA r = 64, alpha = 64, dropout = 0.05
LR = 5e-5 with cosine schedule, warmup ratio 0.03
Epochs = 10
Precision = bf16
Attention = FlashAttention-2 (forward + backward)
Max seq len = 4096
Batch = 4 (no accumulation)
Optimizer = paged_adamw_8bit
Step time stabilized at ~7.85 s/step on the MI300X with this recipe. FlashAttention-2 fits well because Qwen's head_dim (128) falls within the MI300X shared-memory budget.
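If you want to try the recipe on your own data, here is a minimal sketch of that configuration with peft and transformers. It is not the team's actual training script: the target modules, output directory, and dataset handling are assumptions on my part; only the hyperparameters mirror the list above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, get_peft_model

base_id = "Qwen/Qwen3-4B-Instruct-2507"
tok = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,                 # full bf16, no quantization
    attn_implementation="flash_attention_2",    # FA2 forward + backward
)

# LoRA adapter as described in the recipe
lora_cfg = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption, not confirmed by the team
)
model = get_peft_model(model, lora_cfg)

args = TrainingArguments(
    output_dir="cybersecqwen-4b-lora",          # hypothetical path
    per_device_train_batch_size=4,              # batch 4, no gradient accumulation
    num_train_epochs=10,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    bf16=True,
    optim="paged_adamw_8bit",
    logging_steps=10,
)

# Tokenize your corpus to a max length of 4096 and hand `model`, `args`, and the
# dataset to transformers.Trainer (or trl's SFTTrainer) to run the fine-tune.
```

Point base_id at a Gemma checkpoint instead and you essentially have the portability experiment described below.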
Corpus, licenses and data cleaning
The fine-tune used two Apache-2.0 clean corpora:
CVE → CWE mappings 2021 from MITRE / NVD, deduplicated against CTI-Bench to avoid evaluation contamination.
Synthetic Q&A with defensive context, generated by a more powerful teacher and released under Apache-2.0.
Deduplicating the training set against the benchmark was key so the numbers stay honest and out-of-distribution.
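The dedup step itself is conceptually simple. Here is a hypothetical sketch; the file names and the cve_id field are assumptions, not the project's actual data layout:

```python
import json

def load_ids(path: str) -> set[str]:
    """Collect the CVE IDs present in a JSONL file."""
    with open(path) as f:
        return {json.loads(line)["cve_id"] for line in f}

# CVE IDs that appear in the evaluation benchmark must never appear in training data
bench_ids = load_ids("cti_bench_rcm.jsonl")

with open("cve_cwe_2021.jsonl") as src, open("train_dedup.jsonl", "w") as dst:
    for line in src:
        row = json.loads(line)
        if row["cve_id"] not in bench_ids:   # keep only CVEs the benchmark never sees
            dst.write(line)
```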
Portability and verifying the recipe
Is the improvement from the MI300X or the recipe? To check, they trained a sibling: Gemma4Defense-2B using the exact same recipe and corpus but based on Gemma-4-E2B-it. Results:
| Model | CTI-RCM (mean ± std) | CTI-MCQ (mean ± std) |
| --- | --- | --- |
| CyberSecQwen-4B (Qwen base) | 0.6664 ± 0.0023 | 0.5868 ± 0.0029 |
| Gemma4Defense-2B (Gemma base) | 0.6754 ± 0.0035 | 0.6042 ± 0.0090 |
Conclusion: the recipe travels. What matters is how you fine-tune the IT checkpoint, not only the model family. The choice between Qwen and Gemma may come down to licensing or deployment budget (2B vs 4B).
Quick inference example
Minimal usage to run on any 12 GB+ GPU:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "lablab-ai-amd-developer-hackathon/CyberSecQwen-4B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are a defensive cybersecurity assistant. Answer with the canonical CWE-ID first, then 1-3 sentences of justification."},
    {"role": "user", "content": "Path traversal in a Java web app where user-controlled input concatenates into a File() path. What's the CWE?"},
]

prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
out = model.generate(**tok(prompt, return_tensors="pt").to(model.device), max_new_tokens=256, do_sample=True, temperature=0.3)
print(tok.decode(out[0], skip_special_tokens=True))
```
For high-performance serving, vLLM works on MI300X with the vllm/vllm-openai-rocm image. There are pinned commands and config in the repo if you want to deploy it.
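Once the server is running, any OpenAI-compatible client can talk to it. A quick sketch, assuming a local deployment on port 8000 (the base_url and prompt are mine, not from the repo):

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible endpoint; the URL below assumes a local server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="lablab-ai-amd-developer-hackathon/CyberSecQwen-4B",
    messages=[
        {"role": "system", "content": "You are a defensive cybersecurity assistant."},
        {"role": "user", "content": "Map CVE-2021-44228 to its canonical CWE and justify briefly."},
    ],
    temperature=0.3,
    max_tokens=256,
)
print(resp.choices[0].message.content)
```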
Limitations and responsible use
CyberSecQwen-4B is designed for defensive tasks: CWE mapping, structured CTI Q&A, and triage assistance. It’s not meant to generate exploits, automate critical decisions without human review, or provide legal or medical advice.
The team plans further work on robustness to adversarial examples and continuous evaluation as the NVD grows. A specialist is only as good as its worst input; hardening against prompt injection is a priority.
Problems encountered and practical fixes
| Problem | Fix |
| --- | --- |
| FA2 fails on Gemma-4 with head_dim=512 | Fall back to sdpa for global attention; local attention keeps using FA2. Result: ~1.6x slower vs Qwen with FA2. |
| AITER conflict when serving CyberPal-2.0-20B | Set VLLM_ROCM_USE_AITER=0 for that particular evaluation. |
| bitsandbytes not officially supported on ROCm | Not needed thanks to 192 GB HBM; used paged_adamw_8bit as the optimizer path. |
| Demo on HF Spaces with ZeroGPU quota | The demo uses HF OAuth so each visitor consumes their own free quota. |
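For the AITER workaround in the table, the variable just needs to be set before vLLM starts; a minimal sketch if you launch evaluations from Python (setting it in the launch shell works just as well):

```python
import os

# Must be set before vLLM is imported or the server is launched.
os.environ["VLLM_ROCM_USE_AITER"] = "0"
```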
What’s next and how you can participate
Team priorities:
1B variant for laptops and consumer-class deployment.
Quantized GGUF release (Q4_K_M, Q5_K_M) for edge and mobile.
Continuous evaluation against new CVE entries.
Adversarial hardening pass against prompt-injection.
If any of those points interests you, open an issue in the project repo: that is what moves priorities.
The lesson is clear: the conversation about AI and defense needs to shift from raw size to fit. You don't always need the biggest model; you need the one that best matches your operational constraints: privacy, cost, and on-prem execution. A specialist 4B that matches an 8B on the tasks that matter, runs on affordable hardware, and doesn't leak evidence outside your network is a practical win for any security team.