evmbench

Evaluating AI performance on high-severity contract findings

evmbench is an open benchmark from OpenAI and Paradigm that evaluates whether AI agents can detect, patch, and exploit high-severity smart contract vulnerabilities.

This interface focuses on detection and only reports high-severity findings. Upload a contract folder and start a run.
