Getting Started

This section contains instructions to build CCF on your system.

Go to the next page to start by installing required dependencies.

Prerequisites

This page describes how to install the prerequisites required to build the project on Ubuntu (22.04 LTS) and Arch Linux.

Ubuntu (22.04 LTS)

The packages given here were tested under Ubuntu 22.04 LTS; later versions of Ubuntu should also work, possibly with minor adjustments.

Install the following required packages:

# apt install git pipx cmake ccache rsync ninja-build libgmp-dev autoconf python3.11

Pipenv

The pipenv package in Ubuntu 22.04 appears to be broken due to a packaging error. Install pipenv via pipx instead:

# pipx install pipenv

Please ensure that pipenv is in your PATH afterwards (you may get a warning to run pipx ensurepath).

Rustup

Rustup, a toolchain manager for the Rust programming language, is also required to build the project, but is not available in the Ubuntu 22.04 repositories.

The official installation instructions for rustup can be found at https://rustup.rs/. The rest of this guide assumes that rustup is installed.

Please ensure that cargo is in your PATH afterwards (you may get a warning to run source "$HOME/.cargo/env").

Arch Linux

The following packages are required to build the project:

# pacman -S git openssh python python-pipenv base-devel cmake ninja ccache rsync rustup

Building CCF

Once the prerequisites have been installed, we can fetch and build the project. We assume that you have access to a git repository from which you can pull the workspace.

$ git clone <git repo link> workspace
$ cd workspace
$ ./ws build

Running ./ws build can take some time on the first run, as it fetches all sources and builds them, which includes, for example, building a version of LLVM. Subsequent invocations of ./ws build run incrementally, recompiling only what needs to be recompiled based on the changes that were made (if any). See also the README.md file in the repository; it gives additional usage instructions for the ./ws command.

Hello World

In this section we will compile a simple program for concolic execution and run it.

Entering a workspace shell

First we have to make all the executables built by the workspace available in our shell. We can do so by entering our workspace directory and running:

$ ./ws shell

In this shell we can now use all the tools that the workspace built. To check that everything works, we can try to run one of the executables, for example:

$ musl-clang --help

should output a help message. musl-clang is the compiler that can build programs with concolic instrumentation. We will next use it to build a simple program.

Building a simple program

Create a new file hello-world.c and paste the following code into it:

#include <stdio.h>

int main(void) {
  printf("Hello World!\n");
  return 0;
}

Currently, compilation and linking of concolically instrumented binaries have to be performed in two separate steps:

$ musl-clang --instrument -c hello-world.c -o hello-world.o
$ musl-clang -o hello-world hello-world.o

This creates the binary hello-world, which is ready for concolic execution.
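If you rebuild often, the two-step build can be wrapped in a small Makefile. This is only a convenience sketch: it assumes musl-clang is on your PATH (i.e., you are inside a workspace shell) and uses exactly the flags shown above.

```make
# Hypothetical Makefile wrapping the two-step instrumented build.
# Assumes musl-clang is on PATH (run make inside a workspace shell).
# Note: recipe lines must be indented with tabs.

hello-world: hello-world.o
	musl-clang -o hello-world hello-world.o

hello-world.o: hello-world.c
	musl-clang --instrument -c hello-world.c -o hello-world.o
```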

Running the example program

A binary compiled for concolic execution cannot be executed as-is, as it expects a specific environment during execution. Therefore, simply running ./hello-world will output an error message instead of the text we expected.

Instead, we can run our binary in a basic concolic environment using env_tool:

$ env_tool run-concolic ./hello-world

This prints the expected "Hello World!" to our terminal. env_tool run-concolic is a basic way to execute a binary that was compiled for concolic execution without running the full concolic execution itself.

Conclusion

In this section we compiled and executed a simple program for concolic execution, but did not yet run the full concolic execution engine.

Testing a Program

In this section, we will exhaustively explore a simple program by using concolic execution and (distributed) hybrid concolic fuzzing.

Building a Target

Before we can start testing a program, we need a suitable candidate for exploration. Here, we create a new file crash_check.c and add the following code:

#include <unistd.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
    char inp[6];
    inp[5] = '\0';

    if (read(STDIN_FILENO, inp, 5) != 5) {
        return 1;
    }

    if (strcmp(inp, "crash") == 0) {
        abort();
    }

    return 0;
}

This program reads 5 bytes of input from stdin and writes them to a pre-allocated buffer. It then checks whether the given input is equal to "crash" and, if true, crashes by calling abort().
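Before building the instrumented version, you can sanity-check the program logic with an uninstrumented build. The compiler name cc is an assumption here; any standard C compiler works:

```shell
# Build an uninstrumented version with the regular system compiler,
# just to check the program logic (this binary is NOT for concolic execution):
cc crash_check.c -o crash_check_plain

printf 'hello' | ./crash_check_plain; echo $?   # prints 0 (no crash)
printf 'crash' | ./crash_check_plain; echo $?   # prints a non-zero status (abort)
```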

We then enter our workspace shell and compile the program as described in a prior section.

Configuring a Worker

Before we can start exploring our target program, we have to configure the worker that performs the exploration. The following command:

$ worker --create-config-file

creates a new config file worker_config.toml in the current directory. The file contains various settings for changing the behavior of the worker, but for now the only relevant setting is concex_target, which should point to the executable of the program under test. For our newly created target program crash_check:

concex_target = "./crash_check"

Running the Worker

Now that we have configured the worker, we can start the exploration of the program by running:

$ worker

The worker should now run for a few seconds, inform us that it has finished exploring the program under test, and terminate. By default, the output of the worker is stored in worker_out in the directory the worker was started from. To find the inputs the worker has generated, we change into the queue directory:

$ cd worker_out/sync_dir/concolic/queue

Here, we see multiple files of the form id:XXXXX, each denoting an input that took a different path through the program. One of these inputs should have an additional crash in its name, denoting that this input crashed the program under test. If we inspect the contents of this file, we see that it starts with "crash" followed by random bytes: exactly the input that crashes our test program.

Adding a Fuzzer

In addition to concolic execution, we can use fuzzing to explore the program under test. The engine supports AFL as its fuzzer. To use fuzzing, we first have to compile the program with the instrumentation of the fuzzer:

$ <PATH TO AFL>/afl-clang -o crash_check_fuzzer crash_check.c

Then we have to change three settings in the worker_config.toml:

fuzzer = "<path to AFL>/afl-fuzz"
num_fuzzers = 1
fuzzer_target = "./crash_check_fuzzer"

This tells the worker where it can find the executable of the fuzzer, how many fuzzer instances should run in parallel, and which executable is instrumented for use with the fuzzer.
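Taken together with the earlier concex_target setting, the relevant part of worker_config.toml might now look like this (the AFL path is a placeholder for your setup):

```toml
# Sketch of the worker_config.toml settings described above.
concex_target = "./crash_check"           # binary built with musl-clang --instrument
fuzzer = "<path to AFL>/afl-fuzz"         # placeholder: path to your afl-fuzz binary
num_fuzzers = 1                           # number of parallel fuzzer instances
fuzzer_target = "./crash_check_fuzzer"    # binary built with afl-clang
```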

Afterward, we can start the worker:

$ worker

After the worker has terminated, worker_out/sync_dir contains an additional folder which contains the output of the fuzzer.

Distributed Hybrid Concolic Fuzzing

When testing larger software, it can be advantageous to use multiple worker instances on multiple machines. For simplicity, we will start two worker instances on the same machine, but the procedure is the same over a network.


⚠️ Depending on your network settings, the default addresses used by the workers and the master might have to be changed.


First, open worker_config.toml and set

local = false

to instruct the worker to connect to a master.

Then copy worker_config.toml to worker2_config.toml and set

output_dir = "worker2_out"

to create a configuration file for the second worker that does not conflict with the first worker.

Next, we have to create a configuration for the central coordinator that synchronizes the exploration of both workers. Inside a workspace shell type

master --create-config-file

which creates a file named master_config.toml.

In the config file set

fuzzer_target = "./crash_check_fuzzer"
concex_target = "./crash_check"
update_interval = "1"

Changing fuzzer_target and concex_target is necessary because, in a distributed setting, the target binaries are sent from the master to all workers. Changing the update interval to one second is not necessary, but it allows us to see the workers communicating with the master more quickly.

Now open three terminals and enter the workspace shell in each one. In the first terminal start the first worker

$ worker --config-file ./worker_config.toml

and the second worker in the second terminal

$ worker --config-file ./worker2_config.toml

Since we set local = false, both workers will not start exploring the program but will instead try to connect to the master. In the third terminal, start the master by typing

$ master

The master should now report that two workers have connected, and the workers will begin exploring the program.

In addition to the terminal output, the master also provides a web interface to control the workers and to check the status of the exploration. By default, the web interface is reachable under the address 127.0.0.1:8888.

Once all workers report that 0 candidates are left for exploration, the master can be shut down, either via the web interface, or by pressing CTRL-C inside the terminal. When the master shuts down, it automatically shuts down all connected workers.

After the exploration, master_out contains a folder with all crashes that were found and a JSON file that contains all received status updates from the workers.

Conclusion

In this section we explored a simple program and found a crashing input by using concolic execution and fuzzing. Furthermore, we ran multiple worker instances in parallel by using a central master instance.

Advanced Notes

This chapter contains notes for advanced users of CCF. It might make sense to read through these when (seriously) using CCF, but hopefully the information given here is not relevant for most users.

Subtle Interactions

This page contains information on subtle interactions that are part of CCF.

Environment Isolation

Environment isolation can include some subtle and surprising behavior.

Different dynamic loaders between AFL and concolic execution

When using AFL's forkserver (as we do), the forkserver is launched outside of the isolated environment, and thus, the dynamic loader of the host system will be used to load the binary. When using concolic execution, the binary is launched inside the isolated environment, and will thus look for the dynamic loader inside this environment.

This also means that running a binary that is not completely static with concolic execution and environment isolation requires a dynamic loader inside the isolated environment, as well as all dynamic libraries that the binary requires. Otherwise, a "file not found" error will occur, which can be confusing, given that the binary itself exists, but this is due to either the dynamic loader or a required dynamic library missing.