This repo aims to be a collection of benchmark suites for evaluating the precision of EVM code analysis tools.
If you just want to see the reports created by running the various analyzers over the benchmark suites, you can find them here.
It started out as a fork of Suhabe Bugara's excellent benchmark suite, and this link shows the results of running some tools on these benchmarks as of May 2018.
Another benchmark, which we add as a git submodule, is Trail of Bits' (Not So) Smart Contracts.
Reports from running runner/run.py and runner/report.py are here.
Since this repository contains git submodules, clone it using the --recurse-submodules option. For example:
$ git clone --recurse-submodules https://github.com/EthereumAnalysisBenchmarks/evm-analyzer-bench-suites.git
If you forgot the --recurse-submodules option on the git clone, do the following:
$ git submodule init
Submodule 'benchmarks/Suhabe' (https://github.com/ConsenSys/evm-analyzer-benchmark-suite.git) registered for path 'benchmarks/Suhabe'
Submodule 'benchmarks/nssc' (https://github.com/trailofbits/not-so-smart-contracts.git) registered for path 'benchmarks/nssc'
$ git submodule update
Cloning into '/src/external-vcs/github/EthereumAnalysisBenchmarks/evm-analyzer-bench-suites/benchmarks/Suhabe'...
Cloning into '/src/external-vcs/github/EthereumAnalysisBenchmarks/evm-analyzer-bench-suites/benchmarks/nssc'...
...
If the benchmarks change and you want to pull in the new benchmark code, use git submodule update.
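As a quick sanity check before running anything, a script along the following lines can confirm that both submodule directories were actually populated. This is only a sketch: the helper name `missing_submodules` is made up for illustration and is not part of this repository, and the two paths are the ones reported by git submodule init above.

```python
from pathlib import Path

# Submodule paths as registered by `git submodule init` above.
SUBMODULE_PATHS = ("benchmarks/Suhabe", "benchmarks/nssc")


def missing_submodules(repo_root, paths=SUBMODULE_PATHS):
    """Return the submodule paths that are absent or empty under repo_root.

    Hypothetical helper, not part of this repository.
    """
    root = Path(repo_root)
    missing = []
    for rel in paths:
        directory = root / rel
        # An uninitialised submodule shows up as an empty (or missing) directory.
        if not directory.is_dir() or not any(directory.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_submodules("."):
        print(f"{rel} is empty; run: git submodule init && git submodule update")
```

Run from the repository root, the script prints nothing when both submodules are in place.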
The report programs require Python 3.6 or later. To install the required Python packages, run:
$ pip install -r requirements.txt
Analysers are not part of the project dependencies and should be installed manually. This keeps setup independent of analyser installation failures (there might be some) and lets users select specific analysers to benchmark instead of installing all of them.
Below is a list of supported analysers, with installation instructions and known bugs that prevent installation or make the analyser unworkable.
Mythril is available on PyPI:
$ pip install mythril
Manticore is available on PyPI:
$ pip install manticore
Known bugs:
- ValueError: not allowed to raise maximum limit
  - Description: The latest version on PyPI, 0.2.0, fails during analyser execution.
  - Workaround: The source code in the master branch already contains the fix. Until a new version is released on PyPI, Manticore must be installed manually:
    $ git clone https://github.com/trailofbits/manticore.git
    $ cd manticore/
    $ pip install .
- Installation fails on macOS
  - Description: trailofbits/manticore#1075
  - Workaround: n/a. Installation has succeeded on some macOS systems, so try installing it first.
We assume the benchmark suite repositories have been set up in git via the --recurse-submodules switch described above. With this in place, the two Python programs are run in sequence to:
- run an analyzer over a benchmark suite, and
- generate HTML reports for a benchmark suite that we have gathered data for in the previous step
runner/run.py executes the specified benchmark suite. Input arguments:
- -s, --suite: Benchmark suite name. Default: Suhabe. Currently supported: Suhabe, nssc
- -a, --analyser: Analyser to benchmark. If not set, all supported analysers will be benchmarked. Currently supported: Mythril, Manticore
- -v, --verbose: More verbose output; use twice for the most verbose output
- -t, --timeout: Maximum time allowed on any single benchmark. Default: 7 seconds
- --files: Print list of files in benchmark and exit
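To make the option list above concrete, here is a minimal sketch of a parser that accepts the same flags and defaults. It is not taken from runner/run.py, which may well use a different argument-parsing library; the function name `build_parser` and the choice of a float for the timeout are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Execute a benchmark suite.")
    parser.add_argument("-s", "--suite", default="Suhabe",
                        choices=["Suhabe", "nssc"],
                        help="Benchmark suite name")
    parser.add_argument("-a", "--analyser", default=None,
                        choices=["Mythril", "Manticore"],
                        help="Analyser to benchmark; all of them if omitted")
    # action="count" lets -v be repeated: -v gives 1, -vv gives 2.
    parser.add_argument("-v", "--verbose", action="count", default=0,
                        help="Use twice for the most verbose output")
    parser.add_argument("-t", "--timeout", type=float, default=7.0,
                        help="Maximum seconds allowed on any single benchmark")
    parser.add_argument("--files", action="store_true",
                        help="Print list of files in benchmark and exit")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list so the example does not depend on sys.argv.
    print(build_parser().parse_args(["--timeout", "300", "--analyser", "Mythril"]))
```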
Description:
The first program runner/run.py
takes a number of command-line
arguments; one of them is the name of a benchmark suite. From that it
reads two YAML configuration files for the benchmark. The first YAML
file has information about the benchmark suite: the names of the files
in the benchmarks, whether the benchmark is supposed to succeed or
fail with a vulnerability, and possibly other information. An example
of such a YAML file is
benchconf/Suhabe.yaml. The
other YAML input configuration file is specific to the analyzer. For
Mythril on the Suhabe benchmark, it is called
benchconf/Suhabe-Mythril.yaml.
For each new benchmark suite, these two YAML files need to exist. The second one can start out as an empty file.
The output is a YAML file stored under the benchdata folder, in a subfolder named after the benchmark suite. For example, the output of run.py for Mythril on the Suhabe benchmark suite is a file called benchdata/Suhabe/Mythril.yaml.
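The naming convention for the two input files and the output file can be summarised in a few lines of Python. This is a sketch inferred from the example file names in this section; the three function names are hypothetical, and the assumption that the output file is named after the analyser rests only on the Mythril example.

```python
from pathlib import Path


def suite_config_path(suite):
    """Suite-level configuration, e.g. benchconf/Suhabe.yaml."""
    return Path("benchconf") / f"{suite}.yaml"


def analyser_config_path(suite, analyser):
    """Analyser-specific configuration, e.g. benchconf/Suhabe-Mythril.yaml."""
    return Path("benchconf") / f"{suite}-{analyser}.yaml"


def output_path(suite, analyser):
    """Result data written by the run step, e.g. benchdata/Suhabe/Mythril.yaml."""
    return Path("benchdata") / suite / f"{analyser}.yaml"


if __name__ == "__main__":
    print(suite_config_path("Suhabe"))
    print(analyser_config_path("Suhabe", "Mythril"))
    print(output_path("Suhabe", "Mythril"))
```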
runner/report.py takes the aforementioned YAML data files and creates an HTML report from them. Input arguments:
- -s, --suite: Benchmark suite name. Default: Suhabe
Here is an example of complete report generation using Mythril on the Suhabe benchmark, giving Mythril a maximum of 5 minutes to analyze a single benchmark:
$ python runner/run.py --timeout 300 --suite Suhabe --analyser Mythril
$ python runner/report.py --suite Suhabe
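The same two steps can also be driven from Python, which is handy when looping over several suites or analysers. The sketch below assumes it is run from the repository root and that both scripts exit with a non-zero status on failure, which has not been verified here; `build_commands` and `run_pipeline` are hypothetical names.

```python
import subprocess
import sys


def build_commands(suite, analyser, timeout):
    """Return the argv lists for the run step and the report step."""
    run_cmd = [sys.executable, "runner/run.py",
               "--timeout", str(timeout),
               "--suite", suite,
               "--analyser", analyser]
    report_cmd = [sys.executable, "runner/report.py", "--suite", suite]
    return [run_cmd, report_cmd]


def run_pipeline(suite="Suhabe", analyser="Mythril", timeout=300):
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_commands(suite, analyser, timeout):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` with its defaults corresponds to the two shell commands above.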
Source code related to analysers is located in the runner/analysers/ module. To add support for a new analyser:
- Implement a new class that inherits from BaseAnalyser
- Import the new class in analyser/__init__.py
- Create configuration files with the expected output in benchconf/
Please check the existing analysers for examples.
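The first step might look roughly like the sketch below. The real BaseAnalyser interface is not described in this README, so the base class here is a stand-in defined only to keep the example self-contained: its method names (`get_name`, `run_test`), their signatures, and the shape of the returned data are all assumptions. Consult the existing classes in runner/analysers/ for the actual methods to implement.

```python
from abc import ABC, abstractmethod


class BaseAnalyser(ABC):
    """HYPOTHETICAL stand-in for the repository's BaseAnalyser.

    The real class lives in runner/analysers/ and may define
    different methods than the two invented here.
    """

    @abstractmethod
    def get_name(self):
        """Return the analyser name used on the command line."""

    @abstractmethod
    def run_test(self, benchmark_path, timeout):
        """Analyse one benchmark file and return the findings."""


class ExampleAnalyser(BaseAnalyser):
    """Hypothetical analyser that reports no findings for any input."""

    def get_name(self):
        return "Example"

    def run_test(self, benchmark_path, timeout):
        # A real implementation would invoke the tool here and honour timeout.
        return {"benchmark": str(benchmark_path), "issues": []}


if __name__ == "__main__":
    analyser = ExampleAnalyser()
    print(analyser.get_name(), analyser.run_test("example.sol", timeout=7))
```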
Pull requests and suggestions are welcome.
Please create a new issue to discuss ideas.
The wiki has some commentary around the benchmarks.
See also Building an Ethereum security benchmark.