libafl should be a better option but due to my familarity with C++, this is a temporary option.
A quick guide to how to use the plugin:
Right now I only have tested on Ubuntu 20.04 and 22.04.
First, install dependencies:
apt-get update
apt-get install -y lsb-release wget software-properties-common gnupg
wget https://apt.llvm.org/llvm.sh && chmod +x llvm.sh && ./llvm.sh 12
apt-get install -y cmake libc++-12-dev libc++abi-12-dev libunwind-12 python3-minimal python-is-python3 zlib1g-dev git gdb joe
apt-get install -y libz3-dev libgoogle-perftools-dev
Next, download and build AFL++:
git clone --depth=1 https://github.com/AFLplusplus/AFLplusplus /workdir/aflpp
ENV LLVM_CONFIG=llvm-config-12
cd /workdir/aflpp && CC=clang-12 CXX=clang++-12 make source-only && make install
Next, download symsan and build
git clone https://github.com/R-Fuzz/symsan /workdir/symsan
cd symsan/ && mkdir -p build && \
cd build && CC=clang-12 CXX=clang++-12 cmake -DAFLPP_PATH=/workdir/aflpp ../ && \
make -j && make install
You need to compile two binaries: one for normal fuzzing and one for symbolic tracing
For the normal fuzzing binary, you can set AFL_LLVM_CMPLOG=0
to disable cmplog
as the plugin does a similar job.
Please refer to the AFL++ manual for building options.
For the symbolic tracing binary, set the following env options
KO_CC=clang-12
: use clang-12 as the C compiler, because SymSan is compiled as a LLVM-12 passKO_CXX=clang++-12
: using clang++-12 as the C++ compilerKO_USE_FASTGEN=1
: using the out-of-process solving mode (i.e., decoupled tracing and solving)KO_DONT_OPTIMIZE=1
(optional): keep the original optimization level, otherwise override with-O3
KO_NO_NATIVE_ZLIB
(optional): if you're using instrumented libzKO_USE_NATIVE_LIBCXX
(optional): if you want to use the native, uninstrumented standard C++ lib, the default option uses the instrumentedlibc++
andlibc++-abi
.
To build the target, set CC=/path/to/symsan/bin/ko-clang
and CXX=/path/to/symsan/bin/ko-clang++
.
If the configuration fails, you can set KO_CONFIG=1
, and unset it after configuration.
After the two binaries are built, use AFL++ to fuzz it, use the following env options to load the plugin and control its behavior:
AFL_CUSTOM_MUTATOR_LIBRARY=/path/to/symsan/bin/libSymSanMutator.so
: load the pluginSYMSAN_TARGET=/path/to/symsan-instrumented-binary
: symbolic tracing binaryAFL_DISABLE_TRIM=1
(optional): for some targets (e.g., themini
test case), you may want to disable trimAFL_CUSTOM_MUTATOR_ONLY=1
(optional): if you only want to test the pluginSYMSAN_OUTPUT_DIR=/none/default/dir
(optional): a different directory to store temporary outputs from SymSanSYMSAN_USE_JIGSAW=1
(optional): use JIGSAW as the solverSYMSAN_USE_Z3=1
(optional): use Z3 as the solverSYMSAN_USE_NESTED=1
(optional): consider nested branches when constructing a solving task
The custom mutator works in two main steps:
-
In the interface function
afl_custom_fuzz_count
, the plugin spawns a symsan-instrumented binary to collect the symbolic traces (i.e., the tracing stage in libafl). For each event it wants to handle, it constructs a solving task. -
In the interface function
afl_custom_fuzz
, the plugin fetches a solving task, solves it, and generate a new input (i.e., the mutation stage in libafl). The newly generated input is then evaluated with the main fuzzing binary. If the input is saved, the task is considered as successfully solved.
One main motivation to move to libafl and afl++ custom mutator is to make the concolic execution stage more extensible (than in fastgen). Following are some interfaces that can be customized:
-
rgd::CovManager
is in charge of determine whether an event from symsan should be used to construct a solving task. The default one uses branch coverage (similar to sancovtrace-pc-guard
) to filter events. -
rgd::TaskManager
is in charge of scheduling solving tasks. The default one is a simple FIFO queue. -
rgd::Solver
is in charge of solving a solving task. Right now there are three solvers, which works in a layered manner (i2s->jigsaw->z3): if a task is solved by an earlier solver, it will skip the next solver; otherwise the next solver is invoked.- The default one is a simple I2S solver, which uses tracing results to map input bytes to comparison operands and generate a solution based on potential input-to-state correspondence.
- JIGSAW, which is our JIT-based constraint solver.
- Z3