Stack Usage Analysis

This document provides per-function stack frame sizes for every cfgpack library function, measured with clang -fstack-usage on an arm64 target. Use these numbers to budget stack space when deploying cfgpack on embedded systems.

Recommendations for Embedded Targets

  1. Runtime stack budget: With -Os, all runtime operations (get/set, pageout, pagein) stay under ~500 B of stack. Budget 512 B for the cfgpack runtime call chain.

  2. Setup stack budget: Schema parsing needs up to ~1,296 B at -Os for .map format. If parsing schemas on the device, budget 1.5 KB. JSON and msgpack parsers are cheaper (~1,120 B and ~752 B respectively). If schemas are parsed on a host and only binary config is loaded, the parser is not linked.

  3. Always compile with -Os or -O2: Stack usage drops significantly versus -O0 due to inlining and register allocation. Many small functions (getters, setters, init) collapse to 0 B frames at -Os.

  4. Reduce CFGPACK_SKIP_MAX_DEPTH on very constrained targets. A value of 8 saves 96 B versus the default 32 and is sufficient for typical 1-2 level config maps.

  5. Use make stack-usage-Os to verify stack sizes any time the code changes or compiler flags are adjusted.

Measurement Method

Stack frame sizes were obtained by compiling with:

clang -fstack-usage -std=c99 -Wall -Wextra [-O0 | -Os]

The -fstack-usage flag emits .su files alongside each .o with the exact stack frame size for every function. Two optimization levels are shown:

  • -O0: Worst case (no inlining, no register reuse). Useful for debug builds.

  • -Os: Optimized for size. Representative of production embedded builds.

Run make stack-usage-O0 or make stack-usage-Os to reproduce these measurements.

Per-Function Stack Frame Sizes

core.c — Runtime get/set operations

Function

-O0

-Os

Notes

cfgpack_init

128

0

Inlined at -Os

cfgpack_free

16

0

Trivial cleanup

cfgpack_set

64

0

Set value by index

cfgpack_get

64

0

Get value by index

cfgpack_set_by_name

64

80

Set by name (calls find_entry_by_name)

cfgpack_get_by_name

64

80

Get by name

cfgpack_set_str

96

80

Set string value

cfgpack_get_str

80

0

Get string value

cfgpack_set_fstr

96

80

Set fixed-string value

cfgpack_get_fstr

80

0

Get fixed-string value

cfgpack_set_str_by_name

64

64

cfgpack_get_str_by_name

64

64

cfgpack_set_fstr_by_name

64

64

cfgpack_get_fstr_by_name

64

64

cfgpack_get_version

16

0

cfgpack_get_size

48

0

cfgpack_print

16

0

No-op in embedded mode

cfgpack_print_all

16

0

No-op in embedded mode

io.c — MessagePack serialization/deserialization

Function

-O0

-Os

Notes

cfgpack_pageout

128

112

Serialize context to buffer (incl. CRC)

cfgpack_pageout_measure

128

112

Measure serialized size without encoding

cfgpack_peek_name

128

112

Read schema name from buffer

cfgpack_pagein_buf

48

0

Deserialize from buffer

cfgpack_pagein_remap

176

160

Deserialize with index remapping

decode_value

128

64

Internal

decode_value_with_coercion

144

Inlined at -Os

encode_value

80

Inlined at -Os

io_littlefs.c — LittleFS file I/O

Function

-O0

-Os

Notes

cfgpack_pageout_lfs

112

208

Serialize to LittleFS file

cfgpack_pagein_lfs

112

192

Deserialize from LittleFS file

write_lfs_file

224

Internal; inlined at -Os

read_lfs_file

240

Internal; inlined at -Os

msgpack.c — Low-level MessagePack primitives

Function

-O0

-Os

Notes

cfgpack_msgpack_skip_value

240

160

Iterative — bounded at all depths

cfgpack_msgpack_encode_uint64

80

48

cfgpack_msgpack_encode_int64

80

48

cfgpack_msgpack_encode_f32

48

32

cfgpack_msgpack_encode_f64

80

48

cfgpack_msgpack_encode_str

64

64

cfgpack_msgpack_encode_map_header

48

32

cfgpack_msgpack_decode_uint64

80

32

cfgpack_msgpack_decode_int64

96

32

cfgpack_msgpack_decode_f32

64

0

cfgpack_msgpack_decode_f64

80

32

cfgpack_msgpack_decode_str

64

0

cfgpack_msgpack_decode_map_header

48

0

schema_parser.c — Schema parsing (setup-time only)

Public API functions are thin wrappers that delegate to shared _impl functions. At -Os, most wrappers are inlined to 0 B. The _impl functions hold the real stack cost and are included below.

Function

-O0

-Os

Notes

cfgpack_parse_schema

48

0

Wrapper → parse_schema_map_impl

parse_schema_map_impl

592

1200

map_phase2 inlined at -Os

cfgpack_schema_measure

64

0

Wrapper → parse_schema_map_impl

cfgpack_schema_parse_json

48

0

Wrapper → parse_schema_json_impl

parse_schema_json_impl

288

544

JSON orchestrator

json_phase2

592

416

JSON string default extraction

cfgpack_schema_measure_json

64

0

Wrapper → parse_schema_json_impl

cfgpack_schema_parse_msgpack

48

0

Wrapper → parse_schema_msgpack_impl

parse_schema_msgpack_impl

240

336

Msgpack orchestrator

mp_phase2

320

352

Msgpack default extraction

cfgpack_schema_measure_msgpack

64

0

Wrapper → parse_schema_msgpack_impl

cfgpack_schema_write_json

144

144

JSON schema writer

cfgpack_schema_write_msgpack

176

128

Msgpack schema writer

cfgpack_schema_get_sizing

64

0

cfgpack_schema_free

16

0

decompress.c — Decompression wrappers

Function

-O0

-Os

Notes

cfgpack_pagein_lz4

80

48

LZ4 decompress + pagein

cfgpack_pagein_heatshrink

112

96

Heatshrink decompress + pagein

tokens.c — String tokenizer (internal)

Function

-O0

-Os

Notes

tokens_create

48

32

tokens_find

96

96

tokens_destroy

32

32

Worst-Case Call Chain Depths

These are the maximum total stack usage for public API entry points, computed by summing frame sizes along the deepest call chain.

Runtime operations (called frequently)

API Call

-O0 worst case

-Os worst case

cfgpack_set / cfgpack_get

~128 B

~0 B

cfgpack_set_by_name / cfgpack_get_by_name

~128 B

~80 B

cfgpack_set_str / cfgpack_set_fstr

~160 B

~80 B

cfgpack_pageout

~272 B

~176 B

cfgpack_pagein_buf

~544 B

~320 B

cfgpack_pagein_remap

~544 B

~320 B

cfgpack_pagein_lz4

~624 B

~368 B

cfgpack_pagein_heatshrink

~656 B

~416 B

cfgpack_pageout_lfs

~464 B

~384 B

cfgpack_pagein_lfs

~656 B

~512 B

Setup operations (called once at startup)

API Call

-O0 worst case

-Os worst case

cfgpack_init

~128 B

~0 B

cfgpack_parse_schema

~1,456 B

~1,296 B

cfgpack_schema_parse_json

~1,136 B

~1,120 B

cfgpack_schema_parse_msgpack

~688 B

~752 B

cfgpack_schema_measure

~736 B

~1,296 B

cfgpack_schema_measure_json

~560 B

~704 B

cfgpack_schema_measure_msgpack

~464 B

~496 B

Note on measure functions at -Os: The _impl functions are shared between parse and measure paths. At -Os, map_phase2 is inlined into parse_schema_map_impl, inflating the frame to 1,200 B even though the measure path never executes the phase 2 code. The compiler reserves stack for all locals regardless of which branch is taken.

Configuration Knobs Affecting Stack Usage

CFGPACK_MAX_ENTRIES (default: 128)

Defined in include/cfgpack/config.h. Controls the size of the inline presence bitmap in cfgpack_ctx_t:

CFGPACK_PRESENCE_BYTES = ceil(CFGPACK_MAX_ENTRIES / 8)

At default 128 entries: 16 bytes in the context struct. This does not directly affect stack usage (the bitmap is in the context, not on the stack), but reducing it shrinks the context structure.

CFGPACK_SKIP_MAX_DEPTH (default: 32)

Defined in include/cfgpack/config.h. Controls the maximum nesting depth that cfgpack_msgpack_skip_value() can handle. Each level costs 4 bytes (one uint32_t counter), so:

CFGPACK_SKIP_MAX_DEPTH

Stack cost

8

32 B

16

64 B

32 (default)

128 B

64

256 B

For typical configuration data with 1-2 levels of nesting, a depth of 8 is sufficient. Reduce this on very constrained targets.

Override at compile time:

-DCFGPACK_SKIP_MAX_DEPTH=8

MAX_LINE_LEN (default: 256, internal to schema_parser.c)

Controls the maximum line length for .map schema files. A buffer of this size exists on the stack in parse_schema_map_impl(). Reducing to 128 would save ~128 B, but this is setup-time code and not called at runtime.