evmbench
Evaluating AI performance on high-severity contract findings
evmbench is an open benchmark from OpenAI and Paradigm that evaluates whether AI agents can detect, patch, and exploit high-severity vulnerabilities in smart contracts.
This interface covers detection only and reports only high-severity findings. To begin, upload a contract folder and start a run.