
"MDASH, short for multi-model agentic scanning harness, is designed as a model-agnostic system that uses bespoke AI agents for different vulnerability classes to autonomously discover, validate, and prove exploitable defects in complex codebases like Windows."
"Unlike single-model approaches, the harness orchestrates more than 100 specialized AI agents across an ensemble of frontier and distilled models to discover, debate, and prove exploitable bugs end-to-end."
"It starts with analyzing the source code to build a threat model and attack surface, running specialized 'auditor' agents over candidate code paths to flag potential issues, running a second set of 'debater' agents that validate the findings, grouping semantically equivalent findings, and then finally proving the existence of the vulnerabilities."
"'Disagreement between models is itself a signal: when an auditor flags something as suspect and the debater can't refute it, that finding's posterior credibility goes up,' Microsoft explained. 'An auditor does not reason like a debater, which does not reason like a prover. Each pipeline stage has its own role, prompt regime, tools, and stop criteria.'"
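One way to read the "posterior credibility" remark is as a Bayesian update. The sketch below is illustrative only: the prior and both likelihoods are made-up numbers, and `posterior_after_failed_refutation` is a hypothetical helper, not anything Microsoft has described.

```python
def posterior_after_failed_refutation(prior: float,
                                      p_survive_if_real: float,
                                      p_survive_if_false: float) -> float:
    """Bayes' rule: P(finding is real | the debater could not refute it)."""
    evidence = prior * p_survive_if_real + (1 - prior) * p_survive_if_false
    return prior * p_survive_if_real / evidence


# Assumed numbers, chosen only for illustration:
# - 20% of auditor flags are real before any debate,
# - a real bug survives the debater 90% of the time,
# - a false alarm survives the debater 30% of the time.
prior = 0.2
posterior = posterior_after_failed_refutation(prior, 0.9, 0.3)
print(f"{prior:.2f} -> {posterior:.2f}")  # prints "0.20 -> 0.43"
```

The finding becomes more credible only because a real bug is assumed more likely to survive the debate than a false alarm; if the two likelihoods were equal, the debate would carry no information.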
MDASH is a multi-model, model-agnostic AI system for vulnerability discovery and remediation at scale. It uses specialized AI agents for different vulnerability classes to autonomously discover, validate, and prove exploitable defects in complex codebases such as Windows. The system operates as a structured pipeline: it ingests a codebase, builds a threat model and attack surface, and runs auditor agents to flag candidate issues. Debater agents then validate the flagged findings, semantically equivalent findings are grouped, and a final stage proves that the vulnerabilities exist. The pipeline is powered by a configurable panel of models: state-of-the-art (SOTA) models for reasoning, distilled models for high-volume validation, and a separate SOTA model for independent counterpoint. Disagreement between models is treated as a signal: when a debater cannot refute an auditor's finding, that finding's posterior credibility goes up.
#ai-security #vulnerability-discovery #agentic-scanning #model-ensembles #secure-software-remediation
Read at The Hacker News