Are AI-generated CVEs actually growing?

Yes, sharply. The Cloud Security Alliance documented a 2.74× year-over-year increase in CVEs attributed to AI-generated code across Q1 2026. The intra-quarter monthly slope was steeper: 6 AI-attributed CVEs in January, 35 in March. The curve is accelerating, not flattening.

What's the most common AI-introduced vulnerability class?

Per Veracode's State of Software Security 2026, the top three OWASP categories in AI-generated code are broken access control, cryptographic failures, and injection. Injection (SQL, command, XSS) is the category most attributable to LLM context-window limits, because the taint typically crosses files.

Why is AI code more vulnerable than human-written code?

Three structural reasons. First, AI generates plausible code, not verified code. Second, AI review of AI code is biased toward the original prediction. Third, AI assistants reason within a context window that typically holds one or two files, so they miss bugs whose taint flows across three or more. BrassCoders complements the AI by running deterministic scanners that catch the structural cases the LLM misses.

Are mainstream commercial models worse than open ones?

The Cloud Security Alliance research note did not break out CVE attribution by model vendor. Veracode tested multiple LLMs and found vulnerability rates within 10 percentage points of each other (40-50% OWASP rate across major models). The category, not the vendor, drives the curve.

Will the curve continue through 2026?

Most likely yes. Gartner projects 75-90% of enterprise engineers using AI coding assistants by 2028, up from 14% in early 2024. The detection layer has to scale with that adoption; without a deterministic complement, every additional AI-using engineer adds proportionally to the vulnerability surface.

What can my team do today?

Run a deterministic static-analysis pass on AI-generated code before merge, and hand the findings to your AI assistant for triage. BrassCoders ships 12 scanners in one CLI: run pipx install brasscoders, then brasscoders scan, then hand .brass/ai_instructions.yaml to Claude Code or Cursor with the prompt 'Address the critical_issues in order.'

The Q1 2026 AI-Code CVE Reckoning

AI-generated code drove a 2.74× year-over-year increase in attributed CVEs in Q1 2026. Inside the quarter the climb was steeper still: from 6 AI-attributed CVEs in January 2026 to 35 in March alone. The curve is accelerating. The detection layer most teams ship with today predates it.

The numbers in this post come from the Cloud Security Alliance’s 2026 research note, Veracode’s State of Software Security 2026, and USENIX Security 2025. The interpretation is BrassCoders’s: a reading of what the data says about where the deterministic detection layer needs to go.

This is the second piece in BrassCoders’s AI Coding Assistant Blind Spots pillar — the part of the coverage map that explains why the gap is widening.

The Numbers That Changed In Q1 2026

BrassCoders tracks the 2026 AI-CVE curve the Cloud Security Alliance documented: 6 AI-attributed CVEs in January, 35 in March, a 5.8× rise in two months inside a single quarter. The 2.74× year-over-year figure is the milder reading of the same data; the intra-quarter slope is steeper and matters more for capacity planning.
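The intra-quarter arithmetic is easy to check. The sketch below takes the two CSA counts quoted above and computes the January-to-March ratio, plus the month-over-month factor that ratio would imply. The constant-compounding assumption is ours: the February count is not given here, so the monthly factor is an illustration, not a reported figure.

```python
import math

jan_cves = 6   # AI-attributed CVEs in January 2026 (CSA figure quoted above)
mar_cves = 35  # AI-attributed CVEs in March 2026 (CSA figure quoted above)

# Ratio across the quarter: two monthly steps, January -> February -> March.
quarter_ratio = mar_cves / jan_cves

# Assumption (ours): growth compounds at a constant rate over both steps,
# so the per-month factor is the square root of the two-month ratio.
monthly_factor = math.sqrt(quarter_ratio)

print(f"January-to-March ratio: {quarter_ratio:.2f}x")
print(f"Implied month-over-month factor: {monthly_factor:.2f}x")
```

For capacity planning, the month-over-month factor is the more useful number, because review and triage budgets are usually set per sprint or per month.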

The CSA’s methodology counts CVEs where the post-mortem attributed at least one contributing line of code to an AI coding assistant. That count is a floor. Many CVEs ship without disclosure of how the vulnerable code got written, and many teams using AI assistants do not record provenance per line. The real curve likely sits above the documented one.

Veracode’s State of Software Security 2026 tested AI-generated code samples across multiple LLMs and found 45% of samples introduce at least one OWASP Top 10 vulnerability on first generation. That rate is the supply side of the curve. By that measure, each unfiltered AI-generated sample carries roughly a 45% chance of bringing a Top 10 issue into the codebase.
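To see how quickly that per-sample rate compounds, here is a minimal sketch. It assumes each AI-generated sample is vulnerable independently with probability 0.45. The independence assumption is ours, not Veracode's, and real samples from the same codebase are likely correlated, so treat the output as an order-of-magnitude guide.

```python
def p_at_least_one_vulnerable(n_samples: int, p_single: float = 0.45) -> float:
    """Probability that at least one of n_samples introduces a Top 10 issue.

    Assumes samples are independent (an illustrative simplification).
    """
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    # Complement rule: 1 - P(every sample is clean).
    return 1.0 - (1.0 - p_single) ** n_samples


if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        print(f"{n:>2} unfiltered samples -> {p_at_least_one_vulnerable(n):.1%}")
```

Under these assumptions, a handful of unfiltered AI-generated changes is enough to make at least one Top 10 issue the expected outcome rather than the exception.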

USENIX Security 2025 contributed the supply-chain dimension. 19.7% of AI-recommended packages do not exist on the relevant registry. Lasso Security demonstrated the attack: a hallucinated huggingface-cli package they registered as a proof-of-concept received over 30,000 downloads from real developer machines before they took it down.
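A cheap first line of defense is to check imports before anything gets installed. The sketch below is a hypothetical helper, not part of BrassCoders: it parses a Python source string with the standard ast module and reports top-level imports that do not resolve in the current environment. It only checks local resolution; it does not query a package registry, and it cannot tell a legitimate package from a squatted one that is already installed.

```python
import ast
import importlib.util


def unresolved_imports(source: str) -> list[str]:
    """Return top-level module names imported in `source` that do not
    resolve in the current Python environment."""
    tree = ast.parse(source)
    names: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import a.b.c` -> check only the top-level package `a`.
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Skip relative imports (level > 0); they refer to local code.
            names.add(node.module.split(".")[0])
    # find_spec returns None when a top-level module cannot be located.
    return sorted(n for n in names if importlib.util.find_spec(n) is None)


if __name__ == "__main__":
    generated = "import json\nimport fastapi_users_pydantic\n"
    print(unresolved_imports(generated))
```

An unresolved name is a prompt to verify the package on the registry by hand before running any install command the assistant suggests.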

How AI Code Generation Drives CVEs

BrassCoders categorizes the AI-CVE mechanism in three steps: AI generates plausible-but-vulnerable code, AI review of AI code is biased toward the original prediction, and the code ships unreviewed by anything that does not share the AI’s blind spots. The CVE is issued when an external researcher or attacker finds the vulnerability in production.

The middle step is the load-bearing one. The model that wrote import fastapi_users_pydantic is the same model that, on review, sees no problem with the import. The generative process and the review process draw from the same training distribution, and the bug is plausible to both.

This is not a hypothetical. ACM TOSEM 2026 evaluated Copilot’s AI review against realistic multi-file codebases and reported the LLM “frequently fails to detect critical vulnerabilities including SQL injection, cross-site scripting, and insecure deserialization.” The failure was concentrated in the cases where the taint crossed file boundaries — the same cases the original generation missed.

The OWASP Top 10 Pattern

BrassCoders maps Veracode’s Top 10 distribution across AI-generated samples in 2026 to scanner coverage: broken access control (A01) to a custom auth-pattern analyzer, cryptographic failures (A02) to detect-secrets plus seven custom secret patterns, and injection (A03) to Pyre’s Pysa interprocedural taint. Each of the top three categories has a deterministic detector that runs locally with zero outbound network calls.

Injection is the category most attributable to LLM context-window limits. The sink lives in one file; the source lives in another. The taint flow crosses the file boundary. The LLM, reading either file in isolation, sees normal-looking code. BrassCoders ships Pyre’s Pysa interprocedural taint analyzer to catch this category deterministically, regardless of context window.
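A compact illustration of the cross-file pattern follows. Everything here is hypothetical: the two "files" are simulated as sections of one script, and the function names are ours. The source function looks harmless on its own, and so does the sink unless the reader knows where its argument came from.

```python
import sqlite3

# --- imagine this lives in handlers.py (hypothetical file name) ---
def get_username_param(request_args: dict) -> str:
    # Source: attacker-controlled input enters the program here.
    return request_args["username"]

# --- imagine these live in db.py (hypothetical file name) ---
def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list:
    # Sink: tainted input is interpolated straight into the SQL text.
    query = f"SELECT id, name FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str) -> list:
    # Fix: a bound parameter keeps the input out of the SQL grammar.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (username,)
    ).fetchall()

def make_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn

if __name__ == "__main__":
    conn = make_demo_db()
    payload = get_username_param({"username": "x' OR '1'='1"})
    print("unsafe rows:", len(find_user_unsafe(conn, payload)))
    print("safe rows:  ", len(find_user_safe(conn, payload)))
```

An interprocedural taint analyzer follows the value from the source to the sink across the call, which is exactly the link a single-file read drops.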

Broken access control is the second-most-attributable category. Auth context lives across decorator, middleware, and route handler files. A missing @login_required is invisible to an LLM that does not have the decorator file in context. BrassCoders ships a custom auth-pattern analyzer that walks the route registration and flags routes lacking an auth guard.
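The route-walking idea can be sketched in a few lines with Python's ast module. This is not BrassCoders's analyzer; it is a simplified, single-file illustration. The decorator names it looks for (a `.route` attribute and a `login_required` guard) are assumptions modeled on Flask-style code, and a real analyzer would also have to account for middleware and guards applied in other files.

```python
import ast


def routes_missing_guard(source: str,
                         route_attr: str = "route",
                         guard_name: str = "login_required") -> list[str]:
    """Return names of route handlers in `source` that lack the auth guard."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        is_route = has_guard = False
        for dec in node.decorator_list:
            # Unwrap calls so `@app.route("/x")` and `@login_required()` both match.
            target = dec.func if isinstance(dec, ast.Call) else dec
            if isinstance(target, ast.Attribute) and target.attr == route_attr:
                is_route = True
            if isinstance(target, ast.Name) and target.id == guard_name:
                has_guard = True
        if is_route and not has_guard:
            missing.append(node.name)
    return sorted(missing)


SAMPLE = '''
@app.route("/profile")
@login_required
def profile():
    return "ok"

@app.route("/admin")
def admin():
    return "secret"
'''

if __name__ == "__main__":
    print(routes_missing_guard(SAMPLE))
```

The check is purely structural: it needs no model of what the handler does, only the fact that a registered route has no guard attached.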

The cryptographic failures category includes the hardcoded-secret subcategory. AI assistants inline placeholder credentials from training data, the placeholder pattern-matches a real secret format, and the placeholder ships as a literal hardcoded credential. BrassCoders pairs Yelp’s detect-secrets entropy engine with seven custom format-specific patterns to catch this category.
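The pairing of a format pattern with an entropy check can be sketched as follows. This is an illustration of the general technique, not detect-secrets' code or BrassCoders's patterns. The single format shown is the AWS access key ID shape (AKIA followed by 16 uppercase letters or digits), and the 3.5-bit entropy threshold is an arbitrary value chosen for the example.

```python
import math
import re
from collections import Counter

# Format-specific pattern: AWS access key IDs.
AWS_ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Quoted string literals of 16+ characters are entropy candidates.
STRING_LITERAL = re.compile(r"""["']([^"']{16,})["']""")


def shannon_entropy(text: str) -> float:
    """Shannon entropy of `text`, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def find_candidate_secrets(source: str, entropy_threshold: float = 3.5):
    """Return (line_number, reason) pairs for likely hardcoded secrets."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if AWS_ACCESS_KEY_ID.search(line):
            findings.append((lineno, "aws-access-key-id"))
            continue  # the format match is the stronger signal
        for match in STRING_LITERAL.finditer(line):
            if shannon_entropy(match.group(1)) >= entropy_threshold:
                findings.append((lineno, "high-entropy-string"))
                break  # one finding per line is enough
    return findings
```

The format pattern catches placeholders that look like real keys even when their entropy is modest; the entropy check catches random-looking strings that match no known format.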

Why Existing Detection Layers Miss These

BrassCoders was built because the 2025-era detection stack cannot keep pace with code produced at AI rates. The standard stack (a SAST scanner in CI, optional LLM-based PR review, human reviewers as the final gate) was calibrated for human-rate code production, and the bottleneck moves from generation to review the moment AI assistants enter the loop.

CI-based SAST runs on commit, which is too late. The vulnerability is already in the repo; the question is whether the team notices before a release. Many teams ship despite CI findings because their dashboards already show hundreds of unresolved historical findings, and one more is easy to ignore.

LLM-based PR review reads the diff with an LLM. The strength is context awareness. The weakness is recall on security bugs: the same LLM that wrote the bug rarely catches it on review. Teams using this layer report value on style and clarity, less on security-critical bugs.

Human reviewers are the final gate. AI-generated PR volume scales with engineer count and AI tool adoption. Human review capacity does not. The bottleneck creates a permanent backlog where reviewers skim, and skimming has a known failure mode: the one real bug gets dismissed alongside the seven speculative ones BrassCoders documents in Why Claude Code emits 8 findings.

The structural gap is the same one across all three layers. None of them is built to catch the categories AI assistants miss — the cross-file taint, the missing auth guard, the hardcoded credential past the comment boundary, the hallucinated import. The categories are documented; the detection layer is not yet aligned with them.

The Deterministic Complement

BrassCoders is the deterministic complement to AI code review — the static layer that finds the patterns AI assistants miss so the AI assistant can spend its triage budget on the patterns where context matters. The pairing covers more ground than either layer alone.

The math. A typical BrassCoders OSS scan on a Django codebase produces 1,500 raw findings across 12 scanners. The Paid plan’s AI-powered enrichment deduplicates by root cause and reranks by project signature, producing roughly 30 ranked findings. The developer hands the YAML to Claude Code with the prompt: “Read .brass/ai_instructions.yaml. Address the critical_issues in order.” The AI assistant processes the 30 in under two minutes. The human approves the diffs.
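Going from 1,500 raw findings to roughly 30 is a 50× reduction. The sketch below shows the general shape of a dedupe-then-rank step. It is a deliberately simple, deterministic stand-in, not the Paid plan's enrichment: the Finding fields, the severity ordering, and the assumption that each finding already carries a root_cause label are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# Lower number = more urgent (hypothetical ordering).
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    scanner: str
    rule: str
    root_cause: str  # assumed to be assigned upstream
    severity: str
    location: str


def dedupe_and_rank(findings: list[Finding]) -> list[dict]:
    """Collapse findings that share a root cause, then order the groups."""
    groups: dict[str, list[Finding]] = defaultdict(list)
    for finding in findings:
        groups[finding.root_cause].append(finding)

    ranked = []
    for root_cause, members in groups.items():
        # A group inherits the severity of its worst member.
        worst = min(members, key=lambda f: SEVERITY_RANK[f.severity])
        ranked.append({
            "root_cause": root_cause,
            "severity": worst.severity,
            "count": len(members),
            "locations": sorted({f.location for f in members}),
        })
    # Most severe first; larger groups break ties; name keeps order stable.
    ranked.sort(key=lambda g: (SEVERITY_RANK[g["severity"]], -g["count"], g["root_cause"]))
    return ranked
```

Grouping by root cause is what produces most of the reduction: many scanners flag the same underlying problem from different angles, and the AI assistant only needs to see it once.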

The same workflow runs against Cursor, Continue, Aider, and any AI assistant with file-read capability. The hand-off file is plain text. The hand-off prompt is the same across tools.

What To Do About It

If your team ships AI-generated code, the right action sequence is mechanical. Install BrassCoders, run a scan, hand the output to your AI assistant. The OSS core is free and makes zero outbound network calls. The Paid plan at $12/dev/month adds the AI-powered enrichment that turns 1,500 raw findings into roughly 30 ranked ones.

pipx install brasscoders
brasscoders scan /path/to/project
# then in Claude Code or Cursor:
# Read .brass/ai_instructions.yaml in this project.
# Address the critical_issues in order.

The full coverage map of what BrassCoders catches is in the pillar: AI Coding Assistant Blind Spots. Each blind-spot category maps to a citable proof point. Each citation is verifiable in the linked research.

The Q1 2026 curve is the public signal that the detection layer needs to move. The teams that move first ship with fewer regressions than the teams that wait for the curve to be obvious.