Today we publish a new public benchmark result from the CiberIA evaluation framework.
In this evaluation, QWEN3.6-35B-A3B was tested using CRS, one of the CiberIA modules designed to assess selected aspects of AI behavior related to cognitive cybersecurity, operational reliability and risk-aware reasoning.
As AI systems become more autonomous and agentic, traditional performance benchmarks alone are no longer enough to characterize how they behave.
CiberIA focuses on evaluating how AI systems behave under structured assessment conditions, especially in relation to safety, reliability, reasoning consistency and awareness of operational limitations.
The evaluation showed that QWEN3.6-35B-A3B performed well in the following public-level areas:
The benchmark also identified some relevant limitations:
The result suggests that QWEN3.6-35B-A3B has a LOW profile under the evaluated conditions.
This result should not be interpreted as a universal safety certification. It reflects the observed behavior of the model within a specific CiberIA evaluation scenario.
To preserve the integrity of the benchmark and protect proprietary methodology, the full test battery, detailed scoring rubric, evaluator prompts, internal weighting system and operational evaluation protocol are not publicly disclosed.
CiberIA is a proprietary cognitive cybersecurity framework for evaluating AI models and AI-based systems through structured assessment methodologies.
It is developed as part of the CiberTECCH research and professional ecosystem.
info@tecch.eu - Jordi Garcia Castillon