Simbian logo on black background

AI Models Struggle to Defend Against Cyberattacks

A new benchmark reveals that while frontier language models excel at exploitation, they fail to autonomously detect sophisticated attack chains.

Frontier large language models are proficient at finding and exploiting software vulnerabilities, but a new study shows they are currently unable to defend against them without significant assistance.

Simbian Research Lab released its Cyber Defense Benchmark today, testing 11 prominent LLMs on their ability to detect MITRE ATT&CK chains within realistic telemetry. The results indicate a significant gap between offensive and defensive capabilities in artificial intelligence.

None of the tested models earned a passing score. Anthropic’s Claude Opus 4.6 emerged as the top performer, yet it only detected an average of 46% of attack evidence per MITRE tactic. On its strongest tactic, Resource Development, the model scored 63%. However, its performance plummeted to 25% in the Collection category.

The study found that defensive AI tasks are structurally more difficult than offensive ones. While offense has a clear "win state," such as gaining root access, defense requires reasoning across noisy, partial evidence without knowing the total number of malicious events present.

Cost does not linearly correlate with success. While Claude Opus 4.6 found three times more flags than Google Gemini 3 Flash, the investigation cost roughly 100 times more per run. Mid-priced models, including GPT-5 and Gemini 3.1 Pro, plateaued around a 2% detection rate, often ceasing investigations prematurely because the agent believed the task was complete.

Researchers noted that the raw reasoning power of an LLM is only one component of a security solution. To reach enterprise-level accuracy, models require a "harness"—a specialized framework providing organizational context, deterministic retrieval and structured investigation loops.

The benchmark utilized real Windows telemetry, including Sysmon and Security event logs, mutated with randomized hostnames and IP addresses to prevent models from relying on memorized data. This differs from previous industry benchmarks that relied on multiple-choice questions or capture-the-flag puzzles.

The findings suggest that for security operations centers, the integration of an AI model into a sophisticated agentic platform is more critical than the specific model being used.

About the Author

Jesse Jacobs is assistant editor of SecurityToday.com.

Featured

New Products

  • Compact IP Video Intercom

    Viking’s X-205 Series of intercoms provide HD IP video and two-way voice communication - all wrapped up in an attractive compact chassis.

  • Automatic Systems V07

    Automatic Systems V07

    Automatic Systems, an industry-leading manufacturer of pedestrian and vehicle secure entrance control access systems, is pleased to announce the release of its groundbreaking V07 software. The V07 software update is designed specifically to address cybersecurity concerns and will ensure the integrity and confidentiality of Automatic Systems applications. With the new V07 software, updates will be delivered by means of an encrypted file.

  • Mobile Safe Shield

    Mobile Safe Shield

    SafeWood Designs, Inc., a manufacturer of patented bullet resistant products, is excited to announce the launch of the Mobile Safe Shield. The Mobile Safe Shield is a moveable bullet resistant shield that provides protection in the event of an assailant and supplies cover in the event of an active shooter. With a heavy-duty steel frame, quality castor wheels, and bullet resistant core, the Mobile Safe Shield is a perfect addition to any guard station, security desks, courthouses, police stations, schools, office spaces and more. The Mobile Safe Shield is incredibly customizable. Bullet resistant materials are available in UL 752 Levels 1 through 8 and include glass, white board, tack board, veneer, and plastic laminate. Flexibility in bullet resistant materials allows for the Mobile Safe Shield to blend more with current interior décor for a seamless design aesthetic. Optional custom paint colors are also available for the steel frame.