Discovery is cheaper
CopyFail, Big Sleep, AISLE, Claude, Security Copilot, XBOW, and bug-bounty Hackbots show different versions of the same economic shift: more vulnerability hypotheses can be generated and tested per unit of expert time.
The public record is small. The signal is real.
Bugflation is the gap between AI-accelerated vulnerability discovery and the slower human systems that validate, patch, and deploy fixes. We track real advisories, real CVEs, named AI systems, and attribution labels you can audit. CopyFail (CVE-2026-31431) is the case that made the term unavoidable.
AI-assisted systems are starting to show up in accepted disclosure workflows. The right response is not panic or dismissal; it is better intake, faster validation, and patch pipelines that can absorb the new volume.
Some advisories directly name an AI system. Others rely on an operator write-up plus a public CVE. The index labels that difference instead of hiding it in a single score.
More reports are not automatically more security. Maintainers need reproducible evidence, duplicate detection, exploitability review, and regression tests to turn claims into shipped fixes.
The teams that handle bugflation well will not be the ones that find the most issues. They will be the ones that close the loop from credible report to deployed patch fastest.
These are not model leaderboards. They are evidence profiles for systems that appear in public vulnerability-discovery workflows.
The first public AI vulnerability-research agent with accepted real-world findings across SQLite, Chrome V8, and Apple WebKit.
Microsoft's multi-model agentic scanning harness, credited by Microsoft with 16 public CVEs across Windows networking and authentication code.
An autonomous security analyzer with a sustained OpenSSL disclosure record and a FreeBSD core advisory batch spanning dhclient RCE, dhclient memory corruption, and libnv stack corruption.
Public Claude-assisted disclosure credits outside the Mythos-only record, including Firefox, FreeBSD follow-ups, NGINX, wolfSSL, and Apache ActiveMQ.
Anthropic's restricted cyber-capable frontier model, publicly tied to FreeBSD RCE, Firefox 150 hardening, and Project Glasswing.
The AI-assisted vulnerability research system credited in CopyFail, with a broader public tracker spanning CVE-backed and embargoed findings.
Xint public bug tracker: 50 tracker findings
Bynario's AI-driven vulnerability-research pipeline, with direct Apple and Linux upstream credits across binary analysis, kernel discovery, validation, and patching.
Zellic's agentic security platform, now publicly tied to Fragnesia, CVE-2026-46300, a Linux kernel page-cache local privilege escalation.
Striga's AI-based source-code auditing platform, with public CVE credits and research write-ups across Apache httpd, Tomcat, Ollama, axios, and Mattermost Desktop.
ZeroPath's AI-native SAST and security-research workflow, with public CVE-backed and upstream-patched findings across ProFTPD, Spinnaker, better-auth, FFmpeg, sudo, and other open-source projects.
An autonomous AI-driven penetration-testing platform with public bug-bounty milestones and self-reported Microsoft critical RCE credits.
LLM-enhanced fuzz-target generation and triage inside Google's OSS-Fuzz ecosystem.
Microsoft's AI security assistant, publicly tied to a GRUB2, U-Boot, and Barebox bootloader vulnerability campaign.
OpenAI's agentic security researcher, now surfaced as Codex Security with public OSS CVE examples.
The policy layer around AI-assisted vulnerability discovery: human-in-the-loop rules, accountable operators, and bounty eligibility.
The evidence index is editorial: direct upstream credits score higher than self-reported attribution. It is not a model capability benchmark. Read the methodology.
Every finding page includes an attribution label and references. Direct upstream credits and self-reported AI attribution are intentionally kept separate.
V12's public PoC and write-up say Fragnesia, CVE-2026-46300, was discovered with V12 by William Bowling and the V12 team; distro trackers and kernel patch mail corroborate the Linux XFRM ESP-in-TCP local-root vulnerability.
Direct source attribution: Microsoft says its multi-model agentic scanning harness, codename MDASH, helped researchers find 16 CVEs across Windows networking and authentication code, including four Critical remote code execution flaws.
Direct source attribution: Bynario says its LLM-driven pipeline discovered, validated, and patched CVE-2026-31532, a Linux kernel CAN raw socket use-after-free; the upstream Linux commit includes Assisted-by: Bynario AI.
Direct source attribution: ZeroPath Research disclosed an Apache NiFi authorization flaw where users without EXECUTE_CODE can run code through TinkerpopClientService when optional graph extensions are installed.
Self-reported attribution: Striga says an open-weights model scan costing under $100 surfaced the Apache HTTP Server 2.4.66 mod_http2 double-free behind CVE-2026-23918; Apache credits Bartlomiej Dmitruk, striga.ai, and Stanislaw Strzalkowski, isec.pl, as finders.
Self-reported attribution: LLM-generated fuzz targets produce a 26-vulnerability OSS-Fuzz milestone, anchored by OpenSSL.
Nov 2024: Project Zero describes an exploitable SQLite stack buffer underflow found before release.
Feb 2025: The Hackbots policy puts human validation, scope compliance, and operator accountability on record.
Mar 2025: Microsoft describes a 20-CVE GRUB2, U-Boot, and Barebox campaign accelerated by Security Copilot.
Jul 2025: CVE-2025-6965 becomes the clearest public case of AI-assisted discovery paired with threat intelligence.
Jan 2026: AISLE reports 20 OpenSSL CVEs across three coordinated releases, with accepted fixes on many entries.
Mar 2026: Anthropic and Mozilla document Firefox 148 findings, with Mozilla advisories crediting Claude across CVEs.
Mar 2026: Microsoft/NVD records confirm critical CVEs; XBOW supplies the AI-attribution claim.
Apr 2026: Xint Code scales a Taeyang Lee AF_ALG/page-cache insight across Linux crypto and surfaces CVE-2026-31431.
Apr 2026: ZeroPath Research adds a self-reported AI-assisted trail for three CVE-backed RCE or RCE-adjacent findings across deployment and FTP infrastructure.
Apr 2026: FreeBSD credits AISLE Research Team for dhclient RCE, dhclient heap corruption, and libnv stack corruption advisories in the April 29 release.
May 2026: Striga says an open-weights scan costing under $100 surfaced CVE-2026-23918; Apache credits Dmitruk/striga.ai and Strzalkowski/isec.pl.
May 2026: Linux commits for CVE-2026-31532 and CVE-2026-31694 carry Assisted-by: Bynario AI; the CAN write-up details discovery, validation, and patching.
DirtyFrag and Copy Fail2 are not new AI-attributed findings, but they are important CopyFail-adjacent evidence: Linux still has dangerous seams where zero-copy networking, page-cache provenance, and in-place crypto meet.
The AI vulnerability-discovery record is still small, but direct credits now span browsers, kernels, bootloaders, crypto libraries, and OSS tooling.
CVE-2026-31431 shows the bugflation pattern: expert framing plus AI-assisted subsystem review made a kernel root bug cheap to surface.
AI-assisted reports should arrive with affected versions, input, expected result, actual result, and a minimized proof path. Reject vibes, reward evidence.
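One way to enforce that checklist at intake is a completeness check that runs before a human looks at the report. The sketch below is a minimal illustration in Python, not a description of any real tracker: the `VulnReport` class, its field names, and the `missing_evidence` and `ready_for_triage` helpers are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class VulnReport:
    """Hypothetical intake record; one field per required piece of evidence."""
    affected_versions: str = ""
    trigger_input: str = ""
    expected_result: str = ""
    actual_result: str = ""
    minimized_proof_path: str = ""


def missing_evidence(report: VulnReport) -> list[str]:
    """Return the names of evidence fields that are empty or whitespace."""
    return [f.name for f in fields(report) if not getattr(report, f.name).strip()]


def ready_for_triage(report: VulnReport) -> bool:
    """A report enters the human queue only when every field is filled in."""
    return not missing_evidence(report)
```

A report that fails the check can be bounced back with the list of missing fields, which keeps incomplete submissions out of the maintainer queue without discarding them.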
Separate upstream credit, self-reported tool usage, and secondary reporting. The distinction makes the data stronger, not weaker.
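A small sketch of how those three categories might be kept apart in data rather than folded into one number. The `Attribution` enum, its rank values, and both helpers are hypothetical. In particular, placing secondary reporting below self-reported usage is an assumption made for illustration; the only ordering stated here is that direct upstream credits rank above self-reported attribution.

```python
from collections import defaultdict
from enum import IntEnum


class Attribution(IntEnum):
    """Evidence labels; the numeric ranks are illustrative assumptions."""
    SECONDARY_REPORTING = 1
    SELF_REPORTED = 2
    DIRECT_UPSTREAM_CREDIT = 3


def group_by_label(findings):
    """Bucket finding IDs by label so the categories are never merged."""
    buckets = defaultdict(list)
    for finding_id, label in findings:
        buckets[label].append(finding_id)
    return dict(buckets)


def strongest_label(labels):
    """Return the highest-ranked label attached to a single finding."""
    return max(labels)
```

Keeping the label as an explicit field means a reader can filter to direct upstream credits only and still see the self-reported entries separately.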
More submissions mean more validation work. Invest in duplicate detection, maintainer playbooks, and regression tests before the queue spikes.
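Duplicate detection can start very simply. The sketch below assumes each report has already been reduced to a component name, a bug class, and a list of stack frames, and groups reports whose normalized fingerprint matches. The dictionary keys, the `depth` cutoff of three frames, and the function names are hypothetical choices; real pipelines typically need fuzzier matching than an exact hash.

```python
import hashlib


def fingerprint(component, bug_class, frames, depth=3):
    """Hash the normalized component, bug class, and top stack frames."""
    parts = [component.strip().lower(), bug_class.strip().lower()]
    parts += [frame.strip().lower() for frame in frames[:depth]]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def group_duplicates(reports):
    """Map each fingerprint to the IDs of the reports that share it."""
    groups = {}
    for report in reports:
        key = fingerprint(report["component"], report["bug_class"], report["frames"])
        groups.setdefault(key, []).append(report["id"])
    return groups
```

Any group with more than one ID is a candidate duplicate set for a maintainer to confirm, so the first validated report can absorb the rest.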
Run guided variant analysis after every serious fix. Attackers will search nearby code. Defenders should search it first.