Fuzz Testing
CFGPack includes six libFuzzer harnesses that exercise the parsers and decode paths with randomized input. All harnesses are compiled with AddressSanitizer (ASan) and UndefinedBehaviorSanitizer (UBSan) to catch memory errors and undefined behavior at runtime.
Why Fuzz the Parsers?
CFGPack’s schema parsers (.map, JSON, and MessagePack binary) accept external input that may be corrupted, truncated, or adversarial — especially when loading configuration from flash storage or receiving it over a network. The cfgpack_pagein_buf() path also deserializes untrusted MessagePack data. Fuzzing these entry points provides confidence that malformed input is rejected cleanly rather than triggering buffer overflows, out-of-bounds reads, or undefined behavior.
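As a rough illustration of what "rejected cleanly" means for a decode path, the sketch below reads a MessagePack uint16 (marker 0xCD followed by two big-endian bytes) and checks bounds before touching the buffer. The `toy_reader_t` type and `toy_read_u16` function are hypothetical and are not part of the CFGPack API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read cursor over an untrusted buffer (not a CFGPack type). */
typedef struct {
    const uint8_t *p; /* start of the buffer */
    size_t len;       /* total bytes available */
    size_t pos;       /* next byte to read */
} toy_reader_t;

/* Decode a msgpack uint16: marker 0xCD plus two big-endian bytes.
 * Returns 0 on success, -1 on truncated input or a wrong marker.
 * On failure the cursor is left unchanged. */
static int toy_read_u16(toy_reader_t *r, uint16_t *out) {
    if (r->pos > r->len || r->len - r->pos < 3) return -1; /* truncated */
    if (r->p[r->pos] != 0xCD) return -1;                   /* wrong type */
    *out = (uint16_t)((r->p[r->pos + 1] << 8) | r->p[r->pos + 2]);
    r->pos += 3;
    return 0;
}
```

A fuzzer with ASan enabled would flag any variant of this function that skipped the length check, because truncated inputs would then read past the end of the buffer.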
Fuzz Targets
Six harness files live in tests/fuzz/:
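| Harness | Source | What it exercises |
|---|---|---|
| fuzz_parse_map | … | … |
| fuzz_parse_json | … | JSON schema parser (…) |
| fuzz_parse_msgpack | … | MessagePack binary schema parser (…) |
| fuzz_parse_msgpack_mutator | … | Structure-aware msgpack schema fuzzer using … |
| fuzz_pagein | … | … |
| fuzz_msgpack_decode | … | All low-level msgpack decode functions (…) |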
Each harness implements libFuzzer’s LLVMFuzzerTestOneInput entry point, allocates a stack-based cfgpack_ctx_t, and feeds the fuzzer-provided data directly to the target function. All harnesses are self-contained and do not use the heap.

The fuzz_parse_msgpack_mutator harness additionally implements LLVMFuzzerCustomMutator. The mutator generates structurally valid msgpack schema blobs and applies one of 16 corruption modes (truncation, bitflips, wrong counts, type mismatches, duplicate names/indices, etc.), which lets the fuzzer cover parser code paths that random bytes alone are unlikely to reach. When parsing succeeds, this harness also initializes a runtime context and performs a cfgpack_pageout/cfgpack_pagein_buf roundtrip, exercising the encode and decode I/O paths with fuzzer-derived schema data.
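The overall shape of such a harness can be sketched as follows. This is a minimal, self-contained illustration and not the CFGPack source: `toy_ctx_t`, `toy_parse`, and `toy_encode` are hypothetical stand-ins for the real context and parser, the toy format is a msgpack fixmap of one-character keys and small integers, and only three corruption modes are shown. Driving the mutator purely from the libFuzzer seed is also an assumption. The two `LLVMFuzzer*` signatures are the ones libFuzzer expects.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parser context. This is NOT the cfgpack API. */
typedef struct {
    uint8_t count;      /* number of map entries parsed */
    uint8_t values[15]; /* one positive-fixint value per entry */
} toy_ctx_t;

/* Toy parser: accepts a msgpack fixmap whose entries are
 * (1-byte fixstr key, positive fixint value). Returns 0 or -1. */
static int toy_parse(toy_ctx_t *ctx, const uint8_t *data, size_t size) {
    if (size < 1 || (data[0] & 0xF0) != 0x80) return -1; /* not a fixmap */
    size_t n = data[0] & 0x0F;
    if (size != 1 + n * 3) return -1; /* truncated or trailing bytes */
    for (size_t i = 0; i < n; i++) {
        const uint8_t *e = data + 1 + i * 3;
        if (e[0] != 0xA1 || e[2] > 0x7F) return -1; /* wrong key/value type */
        ctx->values[i] = e[2];
    }
    ctx->count = (uint8_t)n;
    return 0;
}

/* Toy encoder: writes ctx back as a fixmap with keys 'a', 'b', ... */
static size_t toy_encode(const toy_ctx_t *ctx, uint8_t *out) {
    size_t len = 0;
    out[len++] = (uint8_t)(0x80 | ctx->count);
    for (size_t i = 0; i < ctx->count; i++) {
        out[len++] = 0xA1;
        out[len++] = (uint8_t)('a' + i);
        out[len++] = ctx->values[i];
    }
    return len;
}

/* libFuzzer entry point: all state lives on the stack. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    toy_ctx_t ctx, back;
    uint8_t out[1 + 15 * 3];
    memset(&ctx, 0, sizeof ctx);
    memset(&back, 0, sizeof back);
    if (toy_parse(&ctx, data, size) != 0) return 0; /* clean rejection */
    /* Parse succeeded: encode, decode again, and require identical values. */
    size_t out_len = toy_encode(&ctx, out);
    if (toy_parse(&back, out, out_len) != 0 || back.count != ctx.count ||
        memcmp(back.values, ctx.values, ctx.count) != 0) {
        abort(); /* a roundtrip mismatch is reported as a crash */
    }
    return 0;
}

/* Custom mutator: build a valid blob from the seed, then corrupt it. */
size_t LLVMFuzzerCustomMutator(uint8_t *data, size_t size, size_t max_size,
                               unsigned int seed) {
    (void)size; /* this sketch regenerates the input instead of editing it */
    if (max_size < 1) return 0;
    size_t n = seed % 16;
    if (n > (max_size - 1) / 3) n = (max_size - 1) / 3;
    size_t len = 0;
    data[len++] = (uint8_t)(0x80 | n);
    for (size_t i = 0; i < n; i++) {
        data[len++] = 0xA1;
        data[len++] = (uint8_t)('a' + i);
        data[len++] = (uint8_t)i;
    }
    switch ((seed >> 4) % 4) {
    case 1: /* truncation: keep between 1 and len - 1 bytes */
        if (len > 1) len = 1 + (seed >> 8) % (len - 1);
        break;
    case 2: /* bitflip at a seed-chosen position */
        data[(seed >> 8) % len] ^= (uint8_t)(1u << ((seed >> 16) % 8));
        break;
    case 3: /* wrong count: header no longer matches the body */
        data[0] = (uint8_t)(0x80 | ((n + 1) & 0x0F));
        break;
    default: /* mode 0: leave the blob valid */
        break;
    }
    return len;
}
```

Starting from a well-formed blob means most mutated inputs get past the outer header checks, so the corruption lands in the deeper validation code instead of being rejected on the first byte.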
Prerequisites
Linux
Any recent Clang (11+) ships libFuzzer. No extra setup needed.
sudo apt install clang # Debian/Ubuntu
macOS
Apple Clang does not ship libFuzzer. Install the full LLVM toolchain via Homebrew:
brew install llvm
The build system auto-detects Homebrew LLVM when the system clang lacks libFuzzer support. You do not need to manually set CC.
Building
From the project root:
make fuzz
This delegates to the sub-makefile at tests/fuzz/Makefile, which:
1. Detects whether CC has libFuzzer support. On macOS, if Apple Clang is detected, it automatically switches to Homebrew LLVM.
2. Builds the seed corpus generator (gen_seeds) and runs it to populate the corpus directories.
3. Compiles all six fuzz harnesses with -fsanitize=fuzzer,address,undefined.
Binaries are placed in build/out/:
build/out/fuzz_parse_map
build/out/fuzz_parse_json
build/out/fuzz_parse_msgpack
build/out/fuzz_parse_msgpack_mutator
build/out/fuzz_pagein
build/out/fuzz_msgpack_decode
build/out/gen_seeds
Why harnesses compile sources directly
Fuzz harnesses compile the library source files directly ($(LIBSRC)) rather than linking against libcfgpack.a. This is required because AddressSanitizer and UBSan instrument code at compile time — both the harness and the library code must be compiled with -fsanitize=... flags for the sanitizers to detect issues in library code.
Seed Corpus
The gen_seeds.c program generates valid seed files across six corpus directories:
| Directory | Seeds | Description |
|---|---|---|
| … | 1 | A valid … |
| … | 3 | Valid JSON schemas (minimal, typical, all types) |
| … | 1 | A valid msgpack binary schema |
| … | 4 | Small random byte sequences that parameterize the custom mutator |
| … | 2 | Valid serialized config blobs (empty + populated) |
| … | 10 | Individual msgpack-encoded values (uint, int, float, string, map, etc.) |
Seeds are regenerated automatically every time make fuzz runs (the fuzz target depends on gen-seeds). Starting from valid inputs helps the fuzzer reach deeper code paths faster.
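A seed generator of this kind can be quite small. The sketch below is not gen_seeds.c itself; it only shows one way to write a few single-value MessagePack seeds to disk. The function names and file names are hypothetical, and the target directory is assumed to exist already. The byte sequences follow the MessagePack format (0xCC uint8, 0xCA float32, 0xA0-0xBF fixstr, 0x80-0x8F fixmap).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write one seed file into dir. Returns 0 on success, -1 on failure. */
static int write_seed(const char *dir, const char *name,
                      const uint8_t *buf, size_t len) {
    char path[512];
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path) return -1; /* path too long */
    FILE *f = fopen(path, "wb");
    if (f == NULL) return -1;
    size_t written = fwrite(buf, 1, len, f);
    if (fclose(f) != 0 || written != len) return -1;
    return 0;
}

/* Emit a few single-value msgpack seeds. File names are hypothetical. */
static int gen_value_seeds(const char *dir) {
    static const uint8_t seed_uint[]  = {0xCC, 0xFF};                   /* uint8 255 */
    static const uint8_t seed_int[]   = {0xFF};                         /* negative fixint -1 */
    static const uint8_t seed_float[] = {0xCA, 0x3F, 0x80, 0x00, 0x00}; /* float32 1.0 */
    static const uint8_t seed_str[]   = {0xA2, 'h', 'i'};               /* fixstr "hi" */
    static const uint8_t seed_map[]   = {0x81, 0xA1, 'k', 0x01};        /* {"k": 1} */
    int rc = 0;
    rc |= write_seed(dir, "seed_uint.bin", seed_uint, sizeof seed_uint);
    rc |= write_seed(dir, "seed_int.bin", seed_int, sizeof seed_int);
    rc |= write_seed(dir, "seed_float.bin", seed_float, sizeof seed_float);
    rc |= write_seed(dir, "seed_str.bin", seed_str, sizeof seed_str);
    rc |= write_seed(dir, "seed_map.bin", seed_map, sizeof seed_map);
    return rc == 0 ? 0 : -1;
}
```

Keeping each seed to a single encoded value gives the fuzzer one small, valid starting point per type, which it can then splice and mutate.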
Running
Using the runner script
The scripts/run-fuzz.sh script runs all six targets sequentially with colored output:
scripts/run-fuzz.sh # 60s per target (default)
scripts/run-fuzz.sh 300 # 300s per target
scripts/run-fuzz.sh 0 # run indefinitely (Ctrl-C to stop)
The script sets -max_len=4096 and -print_final_stats=1 for each target. The exit code is non-zero if any target crashes.
Running a single target directly
You can run any harness directly with libFuzzer flags:
build/out/fuzz_parse_msgpack tests/fuzz/corpus_msgpack/ \
-max_total_time=120 \
-max_len=4096 \
-print_final_stats=1
Useful libFuzzer flags:
| Flag | Description |
|---|---|
| -max_total_time=N | Stop after N seconds (0 = indefinite) |
| -max_len=N | Maximum input size in bytes |
| -jobs=N | Run N fuzzing jobs in parallel |
| -workers=N | Number of parallel worker processes |
| -print_final_stats=1 | Print coverage and execution stats at exit |
| -artifact_prefix=DIR/ | Save crash files to a directory |
See libFuzzer documentation for the full list.
Investigating Crashes
When libFuzzer finds a crash, it writes a reproducer file (e.g., crash-<hash>) to the current directory (or the path set by -artifact_prefix).
Reproducing a crash
Run the harness with the crash file as an argument (not a directory):
build/out/fuzz_parse_msgpack crash-435c5524aa57e3619dd857148000af58d295e4f4
ASan will print a detailed report showing the crash type (heap-buffer-overflow, stack-buffer-overflow, use-after-free, etc.), the exact source location, and a stack trace.
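The libFuzzer-built binaries already replay a file passed on the command line. If a reproducer ever needs to be replayed in a build without the libFuzzer runtime (for example, under a different compiler), one possible approach is a small driver that reads the file and calls the harness entry point directly. The sketch below assumes this setup; `replay_file` and `REPLAY_MAX_LEN` are hypothetical names, and the stub `LLVMFuzzerTestOneInput` exists only so the example compiles alone. In practice the driver would be linked against a real harness instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define REPLAY_MAX_LEN 4096 /* mirrors the -max_len=4096 used by the runner script */

/* Placeholder target so the sketch builds standalone.
 * Replace by linking a real harness object file. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    (void)data;
    (void)size;
    return 0;
}

/* Read a reproducer file and feed it to the harness once.
 * Returns -1 if the file cannot be read, otherwise the harness result. */
static int replay_file(const char *path) {
    static uint8_t buf[REPLAY_MAX_LEN]; /* static buffer, no heap */
    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;
    size_t len = fread(buf, 1, sizeof buf, f);
    int err = ferror(f);
    fclose(f);
    if (err) return -1;
    return LLVMFuzzerTestOneInput(buf, len);
}
```

A `main` that calls `replay_file(argv[1])` is enough to run the reproducer under a debugger or a sanitizer-instrumented build.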
Debugging with lldb
lldb -- build/out/fuzz_parse_msgpack crash-435c5524aa57e3619dd857148000af58d295e4f4
(lldb) run
ASan stops at the exact point of the memory error. Use bt for a backtrace.
Minimizing a crash input
libFuzzer can shrink a crash reproducer to its minimal triggering input:
build/out/fuzz_parse_msgpack -minimize_crash=1 -max_total_time=60 crash-<hash>
Architecture
Sub-makefile design
Fuzz build logic lives in tests/fuzz/Makefile, invoked by the root makefile via:
fuzz:
	@$(MAKE) -C tests/fuzz fuzz ROOT=$(CURDIR) BUILD=$(CURDIR)/$(BUILD) OUT=$(CURDIR)/$(OUT) CC=$(CC)
This keeps the default make / make tests path free of fuzz-related overhead (no Homebrew detection, no LLVM checks). The sub-makefile receives absolute paths for ROOT, BUILD, and OUT so all paths resolve correctly from its working directory (tests/fuzz/).
On macOS, the sub-makefile uses override CC to replace Apple Clang with Homebrew LLVM. The override is necessary because the parent passes CC=clang on the command line, which takes precedence over regular variable assignments in the sub-makefile.