ecu-tests/docs/23_config_loader_internals.md
Hosam-Eldin Mostafa 032866bba0 refactor(config): convert config.py to package + detailed loader docs
- Replace ecu_framework/config.py with ecu_framework/config/ package
  (loader.py + __init__.py re-exports). Public surface unchanged — every
  call site already uses 'from ecu_framework.config import ...' which
  works identically for a module and a package. Brings config into the
  same shape as lin/, power/, flashing/.
- Enrich loader.py with module-level design notes (pipeline diagram,
  precedence rationale, "known wart" callout) and inline "why" comments:
  the EcuTestConfig forward-reference quirk, the int(k, 0) hex-key trick,
  _deep_update's mutate-in-place semantics, and the reason the in-memory
  overrides are applied last despite being precedence #1.
- Add docs/23_config_loader_internals.md covering the merge semantics,
  type-coercion philosophy, dataclass ordering quirks, PSU side-channel,
  and the test-surface checklist (four places to touch when adding a
  new config field).
- Fix the now-stale ecu_framework/config.py path in 01_run_sequence.md
  and DEVELOPER_COMMIT_GUIDE.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 19:42:35 +02:00

7.9 KiB

Configuration Loader Internals

This document explains how the configuration loader is implemented. For the user-facing "what can I configure and where does it come from" perspective, see 02_configuration_resolution.md. The two are companions: 02 answers "what do I write in YAML?", this file answers "what does the loader do with what I wrote?".

File: ecu_framework/config/loader.py

Pipeline at a glance

    defaults (dict)
        └─▶ merge YAML at $ECU_TESTS_CONFIG         (if env set & file exists)
                └─▶ merge YAML at <workspace>/config/test_config.yaml (if exists)
                        └─▶ merge PSU side-channel (env OWON_PSU_CONFIG or
                            <workspace>/config/owon_psu.yaml)
                                └─▶ merge in-memory overrides (caller-supplied)
                                        └─▶ coerce types & build EcuTestConfig

Two layers run sequentially:

  1. Dict layer — every source contributes a plain dict. They are merged with _deep_update so nested sections combine key-by-key.
  2. Dataclass layer — once merged, _to_dataclass casts the values to their declared types and constructs EcuTestConfig. This is the boundary at which YAML's type fuzziness stops.

Keeping the merge in the dict layer (rather than merging dataclasses) makes the precedence story trivial: it's just a sequence of writes into one dict, and the last writer wins.

Precedence — and why it reads "backwards"

The load_config docstring lists precedence highest-to-lowest:

Rank Source Where in code
1 (highest) overrides dict passed to load_config Applied last
2 YAML at $ECU_TESTS_CONFIG Applied if env points at an existing file
3 YAML at <workspace>/config/test_config.yaml Fallback when env unset
4 (lowest) Built-in defaults The starting base dict

In the implementation, sources are applied in reverse order of that table (lowest → highest). That's exactly what "highest precedence" means here: each merge step overwrites earlier values for the same key, so the last writer wins. The "1) ... 4)" comments inside load_config annotate by precedence rank, not by call order.

_deep_update — the merge semantics

def _deep_update(base, updates):
    for k, v in updates.items():
        if isinstance(v, dict) and isinstance(base.get(k), dict):
            base[k] = _deep_update(base[k], v)
        else:
            base[k] = v
    return base

Rules:

  • Dict-on-both-sides → recurse, so nested overlays don't clobber siblings. This is what lets a YAML file override just interface.bitrate without re-stating the rest of the interface block.
  • Anything else (scalar, list, mismatched types) → replace wholesale.
  • Lists are replaced, not concatenated. This is deliberate: list-concat semantics surprise users who expect "set this list to X" to mean exactly that. If concatenation is ever needed for a specific field, do it explicitly at the call site, not in the merge primitive.
  • Mutation happens in place; the return value is the same base object, returned for chaining convenience (used when merging the PSU side-channel).

_to_dataclass — defensive type coercion

YAML's type inference is generous: "19200" (quoted) comes through as a string, "true" is not a bool, and hex-keyed mappings may arrive as either int or string keys depending on the YAML writer. Rather than propagate that fuzziness, the loader casts at the dataclass boundary:

type=str(iface.get("type", "mock")).lower(),
channel=int(iface.get("channel", 1)),
bitrate=int(iface.get("bitrate", 19200)),
...

Casts that fail raise — and that's the right behavior. A config value that can't be interpreted is a bug to surface early, not silently fall back from.

Special-case: frame_lengths keys

frame_lengths maps a LIN frame ID (int) to a payload length (int). YAML can write the key as a hex int (0x0A), a decimal int (10), or a quoted string ("0x0A"). Coercion handles all three:

key = int(k, 0) if isinstance(k, str) else int(k)

int(k, 0) with base 0 means "infer from prefix" — "0x0A" parses as hex, "10" as decimal. Entries that fail to parse are skipped silently rather than aborting the whole load, because one typo in a frame-length map shouldn't prevent the rest of the configuration from coming up.

PSU side-channel

Power-supply settings (COM port, baudrate, IDN substring) are typically bench-specific and shouldn't be committed alongside test config. The loader honors a dedicated overlay file just for the power_supply section:

  • $OWON_PSU_CONFIG (env var → path) wins, else
  • <workspace>/config/owon_psu.yaml if it exists.

This file is deep-merged into the existing power_supply block, so the main YAML can still provide defaults (e.g. idn_substr: OWON) while the bench file overrides only the parts that vary by machine. Recommended workflow:

config/test_config.yaml   # committed; common defaults
config/owon_psu.yaml      # gitignored; per-bench serial settings

Dataclass schema quirks

Forward reference: EcuTestConfig.power_supply

@dataclass
class EcuTestConfig:
    ...
    power_supply: "PowerSupplyConfig" = field(default_factory=lambda: PowerSupplyConfig())

@dataclass
class PowerSupplyConfig:
    ...

PowerSupplyConfig is referenced before it is defined. This works because:

  1. from __future__ import annotations (PEP 563) turns all type annotations into strings at module load time, so "PowerSupplyConfig" as an annotation never triggers a name lookup.
  2. The default_factory is a lambda, which defers evaluation of the bare name PowerSupplyConfig until EcuTestConfig() is actually instantiated — by which point the module body has finished executing and the name is bound.

The ordering is intentional: EcuTestConfig is the most-used type, so it lives near the top of the file where readers find it first. If you ever drop the from __future__ import annotations line, this ordering breaks; the lambda default would still work, but the string annotation would need updating.

Mutable defaults must use default_factory

field(default_factory=dict) (and default_factory=InterfaceConfig, default_factory=lambda: PowerSupplyConfig()) is required because Python shares default values across instances by default. Using field(default={}) on a dataclass field is a ValueError at class-creation time — the default_factory form is the only correct way.

Known wart: defaults live in two places

The defaults for every field exist twice:

  1. As dataclass field defaults — e.g. type: str = "mock" on InterfaceConfig.
  2. As entries in the base dict inside load_config.

Both must agree, and a drift between them would be silently wrong (the loader's defaults would win for the YAML path, while the dataclass defaults would win for callers that construct InterfaceConfig() directly).

Why it's still this way: the dict is needed because _deep_update operates on dicts; the dataclass defaults are needed because callers may construct configs directly without going through load_config. If a third construction path appears, extract defaults to a single DEFAULTS mapping that both layers read from.

Test surface

Unit tests live in tests/unit/test_config_loader.py. They cover the override precedence chain and the dataclass-construction defaults. When adding a new field, add at minimum:

  1. The dataclass field with a default.
  2. The matching default in the base dict in load_config.
  3. The matching cast line in _to_dataclass.
  4. A unit test asserting it round-trips through load_config(overrides=...).

Skipping (3) is the most common bug — the field will appear to work because the dataclass default carries it, but YAML/env overlays for that field will be silently dropped.