
The emergence of AI agents capable of exploiting smart contracts on Ethereum and other blockchains has raised significant concerns about the economic risks of autonomous cyber capabilities. Recent research shows that frontier AI models, such as GPT-5 and Claude, can effectively exploit smart contracts on various blockchains, and can even discover previously unknown security flaws, known as zero-day vulnerabilities, in the software.
Simulated Tests and Findings
A joint project between Anthropic and MATS Fellows used the newly created Smart CONtracts Exploitation Benchmark (SCONE-Bench) to test AI models on 405 real-world contracts that were exploited between 2020 and 2025. In simulated attacks on contracts exploited after March 2025, Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5 generated exploits worth a total of $4.6 million, establishing a concrete lower bound on the financial damage that AI could cause. The researchers then expanded testing to 2,849 recently deployed contracts with no known vulnerabilities, where GPT-5 and Sonnet 4.5 uncovered two novel zero-day vulnerabilities and generated simulated profits of nearly $3,700.
SCONE-Bench: A New Approach to Evaluating AI Exploits
Traditional cybersecurity benchmarks often measure success through detection rates or abstract scores. SCONE-Bench instead evaluates AI exploits in financial terms, providing a more tangible measure of risk. This approach is particularly well suited to smart contracts, because vulnerabilities can lead directly to stolen funds and simulations allow researchers to quantify the potential losses. Across all 405 contracts in SCONE-Bench, the 10 AI models tested generated exploits for 207 contracts (about 51%), representing a total of $550.1 million in simulated stolen funds.
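To make the idea of scoring exploits in dollars concrete, here is a minimal Python sketch. It is not SCONE-Bench's actual scoring code: the `ExploitResult` record, its field names, and the rule that an exploit counts as successful when the attacker's simulated balance grows are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExploitResult:
    """Hypothetical record of one simulated exploit attempt."""
    contract: str
    balance_before: float   # attacker's native-token balance before the attempt
    balance_after: float    # attacker's native-token balance after the attempt
    token_price_usd: float  # assumed USD price of the native token


def simulated_profit_usd(result: ExploitResult) -> float:
    """Dollar value of the tokens gained; a loss is scored as zero profit."""
    gained = max(0.0, result.balance_after - result.balance_before)
    return gained * result.token_price_usd


def summarize(results: list[ExploitResult]) -> tuple[int, float, float]:
    """Return (contracts exploited, success rate, total simulated profit in USD)."""
    profits = [simulated_profit_usd(r) for r in results]
    exploited = sum(1 for p in profits if p > 0)
    return exploited, exploited / len(results), sum(profits)


if __name__ == "__main__":
    # Made-up sample data, not taken from the study.
    sample = [
        ExploitResult("contract_a", 1.0, 3.0, 2000.0),
        ExploitResult("contract_b", 5.0, 5.0, 2000.0),
    ]
    print(summarize(sample))
```

By the same arithmetic, 207 exploited contracts out of 405 corresponds to a success rate of roughly 51%.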
Concrete Examples of AI Exploits
One tested vulnerability involved a token calculator function in an Ethereum-compatible contract that had been incorrectly left writable. The AI agent called the function repeatedly to inflate its token balance, generating simulated profits of $2,500, a figure that could have reached $19,000 under peak liquidity conditions. The assets were later recovered through independent white hat interventions. These findings suggest that AI agents are now reaching human-level capabilities in tasks such as control flow reasoning, boundary analysis, and exploiting software vulnerabilities, skills that apply directly to both blockchain and traditional software systems.
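The pattern behind this example can be illustrated with a toy Python model. This is a sketch under stated assumptions, not the affected contract: the `ToyToken` class, the `calculate_reward` name, and the 10% reward rate are hypothetical, and the real contract logic is not described in detail here. The point is only that a helper meant to compute a value becomes exploitable once it also writes to state.

```python
class ToyToken:
    """Toy in-memory stand-in for a token contract (hypothetical)."""

    def __init__(self, reward_rate_percent: int) -> None:
        self.balances: dict[str, int] = {}
        self.reward_rate_percent = reward_rate_percent

    def deposit(self, account: str, amount: int) -> None:
        """Credit an account with tokens."""
        self.balances[account] = self.balances.get(account, 0) + amount

    def calculate_reward(self, account: str) -> int:
        """Intended as a read-only calculation, but it also credits the reward."""
        balance = self.balances.get(account, 0)
        reward = balance * self.reward_rate_percent // 100
        # Bug: a pure calculation should not modify state. Because this line
        # writes the reward back, any caller can grow their own balance.
        self.balances[account] = balance + reward
        return reward


def simulate_repeated_calls(token: ToyToken, account: str, calls: int) -> int:
    """Call the writable 'calculator' repeatedly and return the final balance."""
    for _ in range(calls):
        token.calculate_reward(account)
    return token.balances[account]


if __name__ == "__main__":
    token = ToyToken(reward_rate_percent=10)
    token.deposit("attacker", 1000)
    # Balance grows 1000 -> 1100 -> 1210 -> 1331 over three calls.
    print(simulate_repeated_calls(token, "attacker", 3))
```

In a real Solidity contract, the analogous safeguard would be declaring such a helper `view` so that it cannot write to state.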
Implications and Future Directions
The study emphasizes that AI’s cyber capabilities are evolving rapidly, from network intrusions to autonomous exploitation of blockchain applications. SCONE-Bench can also serve as a defensive tool, allowing smart contract developers to stress test their systems before deployment. According to the researchers, the results are evidence that profitable, real-world autonomous exploitation is possible, and they underscore the urgent need for proactive AI-powered defenses to protect financial systems and digital assets. As the use of AI in cybersecurity continues to grow, effective strategies will be needed to mitigate the risks of autonomous cyber capabilities.
For more information on this topic, please visit: https://crypto.news/ethereum-smart-contracts-exploited-by-ai-gpt-5-and-claude-demonstrate-million-dollar-vulnerabilities/
