==================
fuzzing libtorrent
==================

.. include:: header.rst

.. contents:: Table of contents
  :depth: 1
  :backlinks: none

overview
========

Libtorrent comes with a set of fuzzers. They are not included in the distribution
tarball. To get them, download the `repository snapshot`_ or clone the repository_.

The fuzzers can be found in the `fuzzers` subdirectory and come with a `Jamfile`
to build them, and a `run.sh` bash script to run them.

.. _`repository snapshot`: https://github.com/arvidn/libtorrent/releases
.. _repository: https://github.com/arvidn/libtorrent

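Fetching the sources with git might look like the sketch below. The
`--recurse-submodules` flag assumes the checkout relies on git submodules;
drop it if yours does not::

  # clone the repository and enter the fuzzers subdirectory
  git clone --recurse-submodules https://github.com/arvidn/libtorrent.git
  cd libtorrent/fuzzers
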
building
--------

The fuzzers use clang's libFuzzer, which means they can only be built with clang.
Clang must be configured in your `user-config.jam`, for example::

  using clang : 7 : clang++-7 ;

When building, you most likely want to stage the resulting binaries into a
well known location. Invoke `b2` like this::

  b2 clang stage -j$(nproc)

This will build and stage all fuzzers into the `fuzzers/fuzzers` directory.

corpus
------

Fuzzers work best if they have a relevant seed corpus of example inputs. You
can either generate one using `fuzzers/tools/generate_initial_corpus.py` or download
the `corpus.zip` from the GitHub `releases page`_.

To generate the initial corpus, run the script with `fuzzers` as the
current working directory, like this::

  python tools/generate_initial_corpus.py

The corpus should be placed in the `fuzzers` directory, which should also be the
current working directory when invoking the fuzzer binaries.

.. _`releases page`: https://github.com/arvidn/libtorrent/releases

running fuzzers
---------------

The `run.sh` script will run all fuzzers in parallel for 48 hours. It can easily
be tweaked and mostly serves as an example of how to invoke them.

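To run a single fuzzer by hand, a sketch along these lines may help. The flags
are standard libFuzzer options. The `corpus/torrent_info` path and the one-hour
budget are assumptions for illustration, so adjust them to your checkout::

  # run from the fuzzers directory; stop after one hour (3600 seconds)
  ./fuzzers/torrent_info -max_total_time=3600 -timeout=10 \
    -artifact_prefix=./torrent_info- corpus/torrent_info

libFuzzer saves any crashing or timed-out input to a file whose name starts
with the artifact prefix, which keeps findings from different fuzzers apart.
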
large and small fuzzers
-----------------------

Since APIs differ in complexity, fuzz targets also explore
code of varying complexity. Some fuzzers cover a very small amount of code
(e.g. `parse_int`), whereas other fuzz targets cover a very large amount of code and
can potentially go very deep into call stacks (e.g. `torrent_info`).

Small fuzz targets can fairly quickly exhaust all possible code paths and have
quite limited utility after that, other than as regression tests. When putting
a lot of CPU into long-running fuzzing, it is better spent on large fuzz targets.

For this reason, there's another alias in the `Jamfile` to only build and stage
large fuzz targets. Call `b2` like this::

  b2 clang stage-large -j$(nproc)

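Using a small fuzz target as a regression test, as mentioned above, can be
sketched with libFuzzer's `-runs=0` option, which executes the inputs already
in the corpus and then exits. The `corpus/parse_int` path is an assumption
about how the corpus is laid out, and the binary must come from the full
`stage` build::

  # replay the existing corpus without generating new inputs
  ./fuzzers/parse_int -runs=0 corpus/parse_int
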
fast+slow
---------

When building an initial corpus, it can be useful to quickly build a corpus with
large code coverage. To speed up this process, you can build the fuzzers
without sanitizers, asserts and invariant checks. This won't find as many errors,
but it builds a good corpus which can then be run against a fully instrumented
fuzzer.

To build the fuzzers in this "fast" mode, there's a build variant `build_coverage`.
Invoke `b2` like this::

  b2 clang stage build_coverage -j$(nproc)

For more details on "fast + slow" see `Paul Dreik's talk`_.

.. _`Paul Dreik's talk`: https://youtu.be/e_Oc9SkCo5s?t=1679

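The two phases could be chained roughly as follows. Using `run.sh` between the
builds, and both builds staging to the same location, are assumptions about
the workflow. Any way of exercising the staged binaries against the same
`corpus` directory would do::

  # phase 1 (fast): no sanitizers, asserts or invariant checks
  b2 clang stage build_coverage -j$(nproc)
  ./run.sh

  # phase 2 (slow): fully instrumented build, reusing the corpus from phase 1
  b2 clang stage -j$(nproc)
  ./run.sh
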
sharing corpora
---------------

Before sharing your fuzz corpus, it should be minimized. There is a script
called `minimize.sh` which moves `corpus` to `prev-corpus` and copies over
a minimized set of inputs to a new `corpus` directory.

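For a single fuzz target, the same idea can be sketched by hand with
libFuzzer's `-merge=1` option, which copies into the first directory only
those inputs from the remaining directories that add coverage. The per-target
subdirectories and the `prev-corpus-torrent_info` name are assumptions, and
this is not necessarily what `minimize.sh` does internally::

  # keep the old corpus around and rebuild a minimized one
  mv corpus/torrent_info prev-corpus-torrent_info
  mkdir -p corpus/torrent_info
  ./fuzzers/torrent_info -merge=1 corpus/torrent_info prev-corpus-torrent_info
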