Anthropic Research Shows AI Agents Getting Closer to True DeFi Attack Capability

AI agents are finding attack vectors in smart contracts that can already be weaponized by bad actors, according to a new study published by the Anthropic Fellows program.

A study from the ML Alignment & Theory Scholars Program (MATS) and the Anthropic Fellows Program tested frontier models against SCONE-bench, a dataset of 405 previously exploited contracts. GPT-5, Claude Opus 4.5, and Sonnet 4.5 collectively produced $4.6 million in simulated exploits on contracts hacked after the models' knowledge cutoffs, providing a lower bound on what this generation of AI could have stolen in the wild.

(Anthropic and MATS)

The team found that frontier models didn't just identify bugs. They were able to synthesize complete exploit scripts, sequence transactions, and drain simulated liquidity in ways that closely mirror real-world attacks on the Ethereum and BNB Chain blockchains.

The paper also tested whether current models could detect vulnerabilities that had not yet been exploited.

GPT-5 and Sonnet 4.5 analyzed 2,849 recently deployed BNB Chain contracts that showed no signs of prior compromise. Between them, the models discovered two zero-day flaws worth $3,694 in simulated profit. One came from a public function missing a view modifier, which let the agent inflate its token balance.

Another allowed a caller to redirect fee withdrawals by providing an arbitrary payee address. In both cases, agents generated executable scripts that turned the vulnerability into profit.
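Neither flaw is exotic. As a rough illustration only, the Python sketch below models both patterns with invented names. The real contracts are written in Solidity and their code is not reproduced in the study summary, so treat this as a hypothetical analogue rather than the actual vulnerable code.

```python
# Hypothetical Python analogue of the two flaw patterns described above.
# The real contracts are Solidity; every name here is invented for illustration.

class VulnerableToken:
    def __init__(self):
        self.balances = {}        # address -> token balance
        self.fee_pot = 1_000      # accumulated protocol fees
        self.treasury = "0xTREASURY"

    def check_balance(self, caller: str) -> int:
        # Pattern 1: a function that should be read-only (a Solidity `view`)
        # also credits the caller, so repeated "queries" inflate a balance.
        self.balances[caller] = self.balances.get(caller, 0) + 1
        return self.balances[caller]

    def withdraw_fees(self, caller: str, payee: str) -> int:
        # Pattern 2: the payee is caller-supplied and never checked against
        # the treasury address, so any caller can redirect the accumulated fees.
        amount, self.fee_pot = self.fee_pot, 0
        self.balances[payee] = self.balances.get(payee, 0) + amount
        return amount

# An automated agent only needs to notice the state change and loop:
token = VulnerableToken()
for _ in range(100):
    token.check_balance("0xATTACKER")             # balance grows with each "query"
token.withdraw_fees("0xATTACKER", "0xATTACKER")   # fees land in the attacker's account
print(token.balances["0xATTACKER"])               # 1100
```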

Although the dollar amounts are small, the discovery is important because it shows that profitable self-sustaining operation is technically feasible.

The agents' execution cost across all contracts was only $3,476, and the average cost per execution was $1.22. As models become cheaper and more capable, the economics tilt further toward automation.
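For context, the reported figures hang together. A quick back-of-the-envelope check, using only the numbers quoted above, looks like this:

```python
# Sanity check on the reported economics, using only figures quoted in the article.
contracts_scanned = 2_849        # recently deployed BNB Chain contracts
total_agent_cost = 3_476.00      # USD across all executions
simulated_profit = 3_694.00      # USD from the two zero-day flaws

print(f"cost per execution: ${total_agent_cost / contracts_scanned:.2f}")     # ~$1.22
print(f"simulated net margin: ${simulated_profit - total_agent_cost:,.2f}")   # ~$218
```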

Researchers say this trend will narrow the window between contract deployment and attack, especially in DeFi environments where capital is publicly visible and exploitable bugs can be monetized instantly.

Although the results focus on DeFi, the authors caution that the underlying capabilities are not domain specific.

The same reasoning steps that allow an agent to inflate a token balance or redirect fees can apply to conventional software, closed codebases, and the infrastructure that supports crypto markets.

As model costs decline and tool use improves, automated analysis is likely to expand beyond smart contracts to any service along the path to valuable assets.

The authors present the work as a warning rather than a prediction. AI models can now perform tasks that historically required highly skilled human attackers, and the research suggests that autonomous exploitation in DeFi is no longer hypothetical.

The question now for crypto builders is how quickly defenses can catch up.
