# Stack Usage Analysis

This document provides per-function stack frame sizes for every cfgpack library function, measured with `clang -fstack-usage` on an arm64 target. Use these numbers to budget stack space when deploying cfgpack on embedded systems.

## Recommendations for Embedded Targets

1. **Runtime stack budget**: With `-Os`, runtime operations (get/set, pageout, pagein) peak at **~512 B** of stack (`cfgpack_pagein_lfs`); every other runtime call stays at or below ~416 B. Budget at least 512 B for the cfgpack runtime call chain.

2. **Setup stack budget**: Schema parsing needs up to **~1,296 B** at `-Os` for the `.map` format. If parsing schemas on the device, budget 1.5 KB. The JSON and msgpack parsers are cheaper (~1,120 B and ~752 B respectively). If schemas are parsed on a host and only binary config is loaded, the parser is not linked.

3. **Always compile with `-Os` or `-O2`**: Stack usage drops significantly versus `-O0` due to inlining and register allocation. Many small functions (getters, setters, init) collapse to 0 B frames at `-Os`.

4. **Reduce `CFGPACK_SKIP_MAX_DEPTH`** on very constrained targets. A value of 8 saves 96 B versus the default 32 and is sufficient for typical config maps with 1-2 levels of nesting.

5. **Use `make stack-usage-Os`** to verify stack sizes any time the code changes or compiler flags are adjusted.

## Measurement Method

Stack frame sizes were obtained by compiling with:

```
clang -fstack-usage -std=c99 -Wall -Wextra [-O0 | -Os]
```

The `-fstack-usage` flag emits `.su` files alongside each `.o` with the exact stack frame size for every function. Two optimization levels are shown:

- **-O0**: Worst case (no inlining, no register reuse). Useful for debug builds.
- **-Os**: Optimized for size. Representative of production embedded builds.

Run `make stack-usage-O0` or `make stack-usage-Os` to reproduce these measurements.

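
To post-process the `.su` output yourself, a small parser is enough. The sketch below is not part of cfgpack; the names `su_entry_t`, `su_parse_line`, and `su_max_frame` are hypothetical, and it assumes each `.su` line has the form `<location>:<function><TAB><size><TAB><qualifier>`. Check that layout against the files your toolchain actually emits before relying on it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One parsed record from a .su file (hypothetical helper type). */
typedef struct {
    char name[64];       /* function name, truncated if longer */
    unsigned long bytes; /* reported stack frame size in bytes */
} su_entry_t;

/*
 * Parse one .su line. Assumed layout (an assumption, not taken from cfgpack):
 *   <location>:<function>\t<size>\t<qualifier>
 * Returns 0 on success, -1 if the line does not match that layout.
 */
int su_parse_line(const char *line, su_entry_t *out)
{
    const char *tab = strchr(line, '\t');
    if (tab == NULL) {
        return -1;
    }

    /* The function name is whatever follows the last ':' before the tab. */
    const char *name = line;
    for (const char *p = line; p < tab; p++) {
        if (*p == ':') {
            name = p + 1;
        }
    }

    size_t len = (size_t)(tab - name);
    if (len == 0) {
        return -1;
    }
    if (len >= sizeof(out->name)) {
        len = sizeof(out->name) - 1;
    }
    memcpy(out->name, name, len);
    out->name[len] = '\0';

    /* The size field is a decimal number right after the first tab. */
    char *end = NULL;
    out->bytes = strtoul(tab + 1, &end, 10);
    return (end == tab + 1) ? -1 : 0;
}

/* Scan a .su stream and return the largest frame; *worst receives that entry. */
unsigned long su_max_frame(FILE *fp, su_entry_t *worst)
{
    char line[512];
    unsigned long max = 0;
    su_entry_t e;

    while (fgets(line, sizeof(line), fp) != NULL) {
        if (su_parse_line(line, &e) == 0 && e.bytes >= max) {
            max = e.bytes;
            if (worst != NULL) {
                *worst = e;
            }
        }
    }
    return max;
}
```

Calling `su_max_frame` on each `.su` file gives the largest single frame per translation unit, which is a quick way to spot regressions between builds. It reports individual frames only, not call-chain totals.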
## Per-Function Stack Frame Sizes

### core.c - Runtime get/set operations

| Function | -O0 | -Os | Notes |
|---|---:|---:|---|
| `cfgpack_init` | 128 | 0 | Inlined at -Os |
| `cfgpack_free` | 16 | 0 | Trivial cleanup |
| `cfgpack_set` | 64 | 0 | Set value by index |
| `cfgpack_get` | 64 | 0 | Get value by index |
| `cfgpack_set_by_name` | 64 | 80 | Set by name (calls find_entry_by_name) |
| `cfgpack_get_by_name` | 64 | 80 | Get by name |
| `cfgpack_set_str` | 96 | 80 | Set string value |
| `cfgpack_get_str` | 80 | 0 | Get string value |
| `cfgpack_set_fstr` | 96 | 80 | Set fixed-string value |
| `cfgpack_get_fstr` | 80 | 0 | Get fixed-string value |
| `cfgpack_set_str_by_name` | 64 | 64 | |
| `cfgpack_get_str_by_name` | 64 | 64 | |
| `cfgpack_set_fstr_by_name` | 64 | 64 | |
| `cfgpack_get_fstr_by_name` | 64 | 64 | |
| `cfgpack_get_version` | 16 | 0 | |
| `cfgpack_get_size` | 48 | 0 | |
| `cfgpack_print` | 16 | 0 | No-op in embedded mode |
| `cfgpack_print_all` | 16 | 0 | No-op in embedded mode |

### io.c - MessagePack serialization/deserialization

| Function | -O0 | -Os | Notes |
|---|---:|---:|---|
| `cfgpack_pageout` | 128 | 112 | Serialize context to buffer (incl. CRC) |
| `cfgpack_pageout_measure` | 128 | 112 | Measure serialized size without encoding |
| `cfgpack_peek_name` | 128 | 112 | Read schema name from buffer |
| `cfgpack_pagein_buf` | 48 | 0 | Deserialize from buffer |
| `cfgpack_pagein_remap` | 176 | 160 | Deserialize with index remapping |
| `decode_value` | 128 | 64 | Internal |
| `decode_value_with_coercion` | 144 | n/a | Inlined at -Os |
| `encode_value` | 80 | n/a | Inlined at -Os |

### io_littlefs.c - LittleFS file I/O

| Function | -O0 | -Os | Notes |
|---|---:|---:|---|
| `cfgpack_pageout_lfs` | 112 | 208 | Serialize to LittleFS file |
| `cfgpack_pagein_lfs` | 112 | 192 | Deserialize from LittleFS file |
| `write_lfs_file` | 224 | n/a | Internal; inlined at -Os |
| `read_lfs_file` | 240 | n/a | Internal; inlined at -Os |

### msgpack.c - Low-level MessagePack primitives

| Function | -O0 | -Os | Notes |
|---|---:|---:|---|
| `cfgpack_msgpack_skip_value` | 240 | 160 | **Iterative**; bounded at all depths |
| `cfgpack_msgpack_encode_uint64` | 80 | 48 | |
| `cfgpack_msgpack_encode_int64` | 80 | 48 | |
| `cfgpack_msgpack_encode_f32` | 48 | 32 | |
| `cfgpack_msgpack_encode_f64` | 80 | 48 | |
| `cfgpack_msgpack_encode_str` | 64 | 64 | |
| `cfgpack_msgpack_encode_map_header` | 48 | 32 | |
| `cfgpack_msgpack_decode_uint64` | 80 | 32 | |
| `cfgpack_msgpack_decode_int64` | 96 | 32 | |
| `cfgpack_msgpack_decode_f32` | 64 | 0 | |
| `cfgpack_msgpack_decode_f64` | 80 | 32 | |
| `cfgpack_msgpack_decode_str` | 64 | 0 | |
| `cfgpack_msgpack_decode_map_header` | 48 | 0 | |

### schema_parser.c - Schema parsing (setup-time only)

Public API functions are thin wrappers that delegate to shared `_impl` functions. At `-Os`, most wrappers are inlined to 0 B. The `_impl` functions hold the real stack cost and are included below.


| Function | -O0 | -Os | Notes |
|---|---:|---:|---|
| `cfgpack_parse_schema` | 48 | 0 | Wrapper → `parse_schema_map_impl` |
| `parse_schema_map_impl` | 592 | 1200 | `map_phase2` inlined at -Os |
| `cfgpack_schema_measure` | 64 | 0 | Wrapper → `parse_schema_map_impl` |
| `cfgpack_schema_parse_json` | 48 | 0 | Wrapper → `parse_schema_json_impl` |
| `parse_schema_json_impl` | 288 | 544 | JSON orchestrator |
| `json_phase2` | 592 | 416 | JSON string default extraction |
| `cfgpack_schema_measure_json` | 64 | 0 | Wrapper → `parse_schema_json_impl` |
| `cfgpack_schema_parse_msgpack` | 48 | 0 | Wrapper → `parse_schema_msgpack_impl` |
| `parse_schema_msgpack_impl` | 240 | 336 | Msgpack orchestrator |
| `mp_phase2` | 320 | 352 | Msgpack default extraction |
| `cfgpack_schema_measure_msgpack` | 64 | 0 | Wrapper → `parse_schema_msgpack_impl` |
| `cfgpack_schema_write_json` | 144 | 144 | JSON schema writer |
| `cfgpack_schema_write_msgpack` | 176 | 128 | Msgpack schema writer |
| `cfgpack_schema_get_sizing` | 64 | 0 | |
| `cfgpack_schema_free` | 16 | 0 | |

### decompress.c - Decompression wrappers

| Function | -O0 | -Os | Notes |
|---|---:|---:|---|
| `cfgpack_pagein_lz4` | 80 | 48 | LZ4 decompress + pagein |
| `cfgpack_pagein_heatshrink` | 112 | 96 | Heatshrink decompress + pagein |

### tokens.c - String tokenizer (internal)

| Function | -O0 | -Os | Notes |
|---|---:|---:|---|
| `tokens_create` | 48 | 32 | |
| `tokens_find` | 96 | 96 | |
| `tokens_destroy` | 32 | 32 | |

## Worst-Case Call Chain Depths

These are the maximum total stack usage for public API entry points, computed by summing frame sizes along the deepest call chain.

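
The summation can be sketched in a few lines of C. This is an illustration only: `chain_stack_bytes`, `fits_budget`, and the two `*_worst_case_Os` functions are hypothetical names, and the example assumes the decompression wrappers call into the `cfgpack_pagein_buf` path, so their worst case is the wrapper's own frame plus the documented ~320 B worst case of that path at `-Os`. Frames of third-party decompressor functions are not modeled.

```c
#include <stddef.h>

/* Sum per-function frame sizes (in bytes) along one call chain. */
unsigned chain_stack_bytes(const unsigned *frames, size_t count)
{
    unsigned total = 0;
    for (size_t i = 0; i < count; i++) {
        total += frames[i];
    }
    return total;
}

/* Returns 1 if the chain total fits in the given stack budget, else 0. */
int fits_budget(unsigned total_bytes, unsigned budget_bytes)
{
    return total_bytes <= budget_bytes;
}

/* -Os figures from this document; the call relationship is an assumption. */
unsigned pagein_lz4_worst_case_Os(void)
{
    const unsigned chain[] = {
        48,  /* cfgpack_pagein_lz4 frame */
        320, /* documented worst case of the cfgpack_pagein_buf path */
    };
    return chain_stack_bytes(chain, sizeof chain / sizeof chain[0]);
}

unsigned pagein_heatshrink_worst_case_Os(void)
{
    const unsigned chain[] = {
        96,  /* cfgpack_pagein_heatshrink frame */
        320, /* documented worst case of the cfgpack_pagein_buf path */
    };
    return chain_stack_bytes(chain, sizeof chain / sizeof chain[0]);
}
```

Under those assumptions the two totals come to 368 B and 416 B, matching the `-Os` figures listed for these entry points, and both fit in a 512 B runtime budget.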
### Runtime operations (called frequently)

| API Call | -O0 worst case | -Os worst case |
|---|---:|---:|
| `cfgpack_set` / `cfgpack_get` | ~128 B | ~0 B |
| `cfgpack_set_by_name` / `cfgpack_get_by_name` | ~128 B | ~80 B |
| `cfgpack_set_str` / `cfgpack_set_fstr` | ~160 B | ~80 B |
| `cfgpack_pageout` | ~272 B | ~176 B |
| `cfgpack_pagein_buf` | ~544 B | ~320 B |
| `cfgpack_pagein_remap` | ~544 B | ~320 B |
| `cfgpack_pagein_lz4` | ~624 B | ~368 B |
| `cfgpack_pagein_heatshrink` | ~656 B | ~416 B |
| `cfgpack_pageout_lfs` | ~464 B | ~384 B |
| `cfgpack_pagein_lfs` | ~656 B | ~512 B |

### Setup operations (called once at startup)

| API Call | -O0 worst case | -Os worst case |
|---|---:|---:|
| `cfgpack_init` | ~128 B | ~0 B |
| `cfgpack_parse_schema` | ~1,456 B | ~1,296 B |
| `cfgpack_schema_parse_json` | ~1,136 B | ~1,120 B |
| `cfgpack_schema_parse_msgpack` | ~688 B | ~752 B |
| `cfgpack_schema_measure` | ~736 B | ~1,296 B |
| `cfgpack_schema_measure_json` | ~560 B | ~704 B |
| `cfgpack_schema_measure_msgpack` | ~464 B | ~496 B |

**Note on measure functions at -Os**: The `_impl` functions are shared between parse and measure paths. At `-Os`, `map_phase2` is inlined into `parse_schema_map_impl`, inflating the frame to 1,200 B even though the measure path never executes the phase 2 code. The compiler reserves stack for all locals regardless of which branch is taken.

## Configuration Knobs Affecting Stack Usage

### `CFGPACK_MAX_ENTRIES` (default: 128)

Defined in `include/cfgpack/config.h`. Controls the size of the inline presence bitmap in `cfgpack_ctx_t`:

```
CFGPACK_PRESENCE_BYTES = ceil(CFGPACK_MAX_ENTRIES / 8)
```

At default 128 entries: 16 bytes in the context struct. This does not directly affect stack usage (the bitmap is in the context, not on the stack), but reducing it shrinks the context structure.

### `CFGPACK_SKIP_MAX_DEPTH` (default: 32)

Defined in `include/cfgpack/config.h`.
Controls the maximum nesting depth that `cfgpack_msgpack_skip_value()` can handle. Each level costs 4 bytes (one `uint32_t` counter), so:

| CFGPACK_SKIP_MAX_DEPTH | Stack cost |
|---:|---:|
| 8 | 32 B |
| 16 | 64 B |
| 32 (default) | 128 B |
| 64 | 256 B |

For typical configuration data with 1-2 levels of nesting, a depth of 8 is sufficient. Reduce this on very constrained targets. Override at compile time:

```
-DCFGPACK_SKIP_MAX_DEPTH=8
```

### `MAX_LINE_LEN` (default: 256, internal to schema_parser.c)

Controls the maximum line length for `.map` schema files. A buffer of this size exists on the stack in `parse_schema_map_impl()`. Reducing to 128 would save ~128 B, but this is setup-time code and not called at runtime.
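
The arithmetic behind the knobs in this section can be sketched in C as follows. The helper macros `PRESENCE_BYTES` and `SKIP_STACK_BYTES` and the functions below are hypothetical names for illustration, not cfgpack definitions; the fallback defaults are the ones stated in this document, and the skip-depth cost covers only the per-level counters, not the rest of the `cfgpack_msgpack_skip_value()` frame.

```c
#include <stddef.h>
#include <stdint.h>

/* Fall back to the documented defaults when the real config.h is not included. */
#ifndef CFGPACK_MAX_ENTRIES
#define CFGPACK_MAX_ENTRIES 128
#endif
#ifndef CFGPACK_SKIP_MAX_DEPTH
#define CFGPACK_SKIP_MAX_DEPTH 32
#endif

/* Integer form of ceil(entries / 8): one presence bit per entry. */
#define PRESENCE_BYTES(entries) (((entries) + 7u) / 8u)

/* One uint32_t counter per nesting level, as described above. */
#define SKIP_STACK_BYTES(depth) ((size_t)(depth) * sizeof(uint32_t))

/* Bitmap bytes in the context struct for the configured entry count. */
size_t presence_bytes_configured(void)
{
    return PRESENCE_BYTES(CFGPACK_MAX_ENTRIES);
}

/* Counter-array stack cost for the configured skip depth. */
size_t skip_stack_bytes_configured(void)
{
    return SKIP_STACK_BYTES(CFGPACK_SKIP_MAX_DEPTH);
}

/* Stack bytes saved by lowering the skip depth; 0 if the depth is not lowered. */
size_t skip_stack_savings(unsigned from_depth, unsigned to_depth)
{
    if (to_depth >= from_depth) {
        return 0;
    }
    return SKIP_STACK_BYTES(from_depth) - SKIP_STACK_BYTES(to_depth);
}
```

With the defaults, the bitmap takes 16 bytes and the skip counters take 128 B; lowering the depth from 32 to 8 recovers 96 B, in line with the recommendation at the top of this document.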