Getting Started
This section contains instructions to build CCF on your system.
Go to the next page to start by installing required dependencies.
Prerequisites
This page describes how to install the prerequisites required to build the project on Ubuntu (22.04 LTS) and Arch Linux.
Ubuntu (22.04 LTS)
The packages given here were tested under Ubuntu 22.04 LTS, but could also work on later versions of Ubuntu, possibly requiring adjustments.
Install the following required packages:
# apt install git pipx cmake ccache rsync ninja-build libgmp-dev autoconf python3.11
Pipenv
There appears to be a packaging error in Ubuntu 22.04 that renders the pipenv package unusable. Instead, install pipenv using pipx:
# pipx install pipenv
Please ensure that pipenv is in your PATH afterwards (you may get a warning to run pipx ensurepath).
Rustup
Rustup, a toolchain manager for the Rust programming language, is also required to build the project, but it is not available in the Ubuntu 22.04 repositories.
The official installation instructions for rustup can be found at https://rustup.rs/. The rest of this guide assumes that rustup is installed.
Please ensure that cargo is in your PATH afterwards (you may get a warning to run source "$HOME/.cargo/env").
Arch Linux
The following packages are required to build the project:
# pacman -S git openssh python python-pipenv base-devel cmake ninja ccache rsync rustup
Building CCF
Once the prerequisites have been installed, we can fetch and build the project. We assume that you have access to a git repository from which you can pull the workspace.
$ git clone <git repo link> workspace
$ cd workspace
$ ./ws build
Running ./ws build can take some time when first run, as it fetches all sources and builds them, which includes, e.g., building a version of LLVM. Subsequent runs of ./ws build are incremental, recompiling only what is affected by changes that were made (if any).
Also see the README.md file in the repository; it gives additional usage instructions for the ./ws command.
Hello World
In this section we will compile a simple program for concolic execution and run it.
Entering a workspace shell
First we have to make all the executables built by the workspace available in our shell. We can do so by entering our workspace directory and running:
$ ./ws shell
In this shell we can now use all the tools that the workspace built. To check that everything works, we can try to run one of the executables, for example:
$ musl-clang --help
should output a help message. musl-clang is the compiler that builds programs with concolic instrumentation. We will use it next to build a simple program.
Building a simple program
Create a new file hello-world.c and paste the following code into it:
#include <stdio.h>

int main(void) {
    printf("Hello World!\n");
    return 0;
}
Currently, compilation and linking for concolically instrumented binaries have to be performed in two steps. Therefore, we have to run:
$ musl-clang --instrument -c hello-world.c -o hello-world.o
$ musl-clang -o hello-world hello-world.o
This creates the binary hello-world, which is ready for concolic execution.
Running the example program
A binary compiled for concolic execution cannot be executed as-is, as it expects a specific environment during execution.
Therefore, simply running ./hello-world will output an error message instead of the text we expected.
Instead, we can run our binary in a basic concolic environment using env_tool:
$ env_tool run-concolic ./hello-world
This prints the expected "Hello World!" to our terminal. env_tool run-concolic is a basic way to execute a binary that was compiled for concolic execution without running the full concolic execution itself.
Conclusion
In this section we compiled and executed a simple program for concolic execution, but did not yet run the full concolic execution engine.
Testing a Program
In this section, we will exhaustively explore a simple program by using concolic execution and (distributed) hybrid concolic fuzzing.
Building a Target
Before we can start testing a program, we need a suitable candidate for exploration. Here, we create a new file crash_check.c and add the following code:
#include <unistd.h>
#include <sys/types.h>
#include <fcntl.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
    char inp[6];
    inp[5] = '\0';
    if (read(STDIN_FILENO, inp, 5) != 5) {
        return 1;
    }
    if (strcmp(inp, "crash") == 0) {
        abort();
    }
    return 0;
}
This program reads 5 bytes of input from stdin and writes them to a pre-allocated buffer. It then checks whether the given input is equal to "crash" and, if so, crashes by calling abort().
We then enter our workspace shell and compile the program as described in a prior section.
Configuring a Worker
Before we can start exploring our target program, we have to configure the worker that performs the exploration. The following command:
$ worker --create-config-file
creates a new config file worker_config.toml in the current directory. The file contains various settings for changing the behavior of the worker, but for now the only interesting setting is concex_target, which should point to the executable of the program under test. For our newly created target program crash_check:
concex_target = "./crash_check"
Running the Worker
Now that we have configured the worker, we can start the exploration of the program by running:
$ worker
The worker should now run for a few seconds, then inform us that it has finished exploring our program under test, and terminate. By default, the output of the worker is stored in worker_out in the same directory that the worker was started from. To find the inputs that the worker has generated, we use
$ cd worker_out/sync_dir/concolic/queue
Here, we see multiple files of the form id:XXXXX, each denoting an input that took a different path through the program. One of these inputs should have an additional crash in its name, denoting that this input crashed the program under test. If we inspect the contents of this file, we see that it starts with "crash", followed by random bytes; this is the input that crashes our test program.
Adding a Fuzzer
In addition to concolic execution, we can use fuzzing to explore the program under test. The engine supports AFL as its fuzzer. To use fuzzing, we first have to compile the program with the instrumentation of the fuzzer:
$ <PATH TO AFL>/afl-clang -o crash_check_fuzzer crash_check.c
Then we have to change three settings in worker_config.toml:
fuzzer = "<path to AFL>/afl-fuzz"
num_fuzzers = 1
fuzzer_target = "./crash_check_fuzzer"
This tells the worker where it can find the executable of the fuzzer, how many fuzzer instances should run in parallel, and which executable is instrumented for use with the fuzzer.
Afterward, we can start the worker:
$ worker
After the worker has terminated, worker_out/sync_dir contains an additional folder with the output of the fuzzer.
Distributed Hybrid Concolic Fuzzing
When testing larger software it might be advantageous to use multiple worker instances on multiple machines. For simplicity, we will start two worker instances on the same machine but the procedure is the same over a network.
⚠️ Depending on your network settings, the default addresses used by the workers and the master might have to be changed.
First, open worker_config.toml and set
local = false
to instruct the worker to connect to a master.
Then copy worker_config.toml to worker2_config.toml and set
output_dir = "worker2_out"
to create a configuration file for the second worker that does not conflict with the first worker.
Next, we have to create a configuration for the central coordinator that synchronizes the exploration of both workers. Inside a workspace shell, type
$ master --create-config-file
which creates a file named master_config.toml.
In the config file set
fuzzer_target = "./crash_check_fuzzer"
concex_target = "./crash_check"
update_interval = "1"
Changing fuzzer_target and concex_target is necessary because, in a distributed setting, the target binaries are sent from the master to all workers. Changing the update interval to one second is not necessary, but it allows us to see the workers communicating with the master more quickly.
Now open three terminals and enter the workspace shell in each one. In the first terminal, start the first worker
$ worker --config-file ./worker_config.toml
and the second worker in the second terminal
$ worker --config-file ./worker2_config.toml
Since we set local = false, both workers will not start exploring the program but will instead try to connect to the master.
In the third terminal, start the master by typing
$ master
The master should now output that two workers have connected, and the workers will begin exploring the program.
In addition to the terminal output, the master also provides a web interface to control the workers and to check the status of the exploration. By default, the web interface is reachable at the address 127.0.0.1:8888.
Once all workers report that 0 candidates are left for exploration, the master can be shut down, either via the web interface or by pressing CTRL-C inside the terminal. When the master shuts down, it automatically shuts down all connected workers.
After the exploration, master_out contains a folder with all crashes that were found and a JSON file that contains all status updates received from the workers.
Conclusion
In this section we explored a simple program and found a crashing input by using concolic execution and fuzzing. Furthermore, we ran multiple worker instances in parallel by using a central master instance.
Advanced Notes
This chapter contains notes for advanced users of ccf. It might make sense to read through these when (seriously) using ccf, but hopefully the information given here is not relevant for most users.
Subtle Interactions
This page contains information on subtle interactions that are part of ccf.
Environment Isolation
Environment isolation can include some subtle and surprising behavior.
Different dynamic loaders between AFL and concolic execution
When using AFL's forkserver (as we do), the forkserver is launched outside of the isolated environment, and thus, the dynamic loader of the host system will be used to load the binary. When using concolic execution, the binary is launched inside the isolated environment, and will thus look for the dynamic loader inside this environment.
This also means that running a binary that is not completely static with concolic execution and environment isolation requires a dynamic loader inside the isolated environment, as well as all dynamic libraries that the binary requires.
Otherwise, a "file not found"
error will occur, which can be confusing, given that the binary itself exists, but this is due to either the dynamic loader or a required dynamic library missing.
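One way to find out what has to be copied into the isolated environment is ldd, which lists the dynamic loader and every shared library a binary needs. The following is a sketch, not part of ccf: BINARY and ISOLATED_ROOT are placeholders (defaulting to /bin/ls and ./isolated_root purely for illustration), and the path extraction assumes glibc-style ldd output:

```shell
#!/bin/sh
# Substitute your concolically built binary and your isolated root here.
BINARY="${1:-/bin/ls}"
ISOLATED_ROOT="${2:-./isolated_root}"

# ldd prints lines such as "libc.so.6 => /lib/.../libc.so.6 (0x...)" and
# the loader itself, "/lib64/ld-linux-x86-64.so.2 (0x...)". Extract every
# absolute path and copy it into the isolated root, preserving directories.
ldd "$BINARY" | grep -o '/[^ ]*' | sort -u | while read -r lib; do
    mkdir -p "$ISOLATED_ROOT$(dirname "$lib")"
    cp "$lib" "$ISOLATED_ROOT$lib"
done

# Show what was copied.
ls "$ISOLATED_ROOT"
```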