ecu-tests/docs/23_config_loader_internals.md
Hosam-Eldin Mostafa 032866bba0 refactor(config): convert config.py to package + detailed loader docs
- Replace ecu_framework/config.py with ecu_framework/config/ package
  (loader.py + __init__.py re-exports). Public surface unchanged — every
  call site already uses 'from ecu_framework.config import ...' which
  works identically for a module and a package. Brings config into the
  same shape as lin/, power/, flashing/.
- Enrich loader.py with module-level design notes (pipeline diagram,
  precedence rationale, "known wart" callout) and inline "why" comments:
  the EcuTestConfig forward-reference quirk, the int(k, 0) hex-key trick,
  _deep_update's mutate-in-place semantics, and the reason the in-memory
  overrides are applied last despite being precedence #1.
- Add docs/23_config_loader_internals.md covering the merge semantics,
  type-coercion philosophy, dataclass ordering quirks, PSU side-channel,
  and the test-surface checklist (four places to touch when adding a
  new config field).
- Fix the now-stale ecu_framework/config.py path in 01_run_sequence.md
  and DEVELOPER_COMMIT_GUIDE.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 19:42:35 +02:00

195 lines
7.9 KiB
Markdown

# Configuration Loader Internals
This document explains *how* the configuration loader is implemented. For the
user-facing "what can I configure and where does it come from" perspective, see
[`02_configuration_resolution.md`](02_configuration_resolution.md). The two are
companions: `02` answers "what do I write in YAML?", this file answers "what
does the loader do with what I wrote?".
File: `ecu_framework/config/loader.py`
## Pipeline at a glance
```text
defaults (dict)
└─▶ merge YAML at $ECU_TESTS_CONFIG (if env set & file exists)
└─▶ merge YAML at <workspace>/config/test_config.yaml (if exists)
└─▶ merge PSU side-channel (env OWON_PSU_CONFIG or
<workspace>/config/owon_psu.yaml)
└─▶ merge in-memory overrides (caller-supplied)
└─▶ coerce types & build EcuTestConfig
```
Two layers run sequentially:
1. **Dict layer** — every source contributes a plain `dict`. They are merged
with `_deep_update` so nested sections combine key-by-key.
2. **Dataclass layer** — once merged, `_to_dataclass` casts the values to their
declared types and constructs `EcuTestConfig`. This is the boundary at which
YAML's type fuzziness stops.
Keeping the merge in the dict layer (rather than merging dataclasses) makes the
precedence story trivial: it's just a sequence of writes into one dict, and the
last writer wins.
## Precedence — and why it reads "backwards"
The `load_config` docstring lists precedence highest-to-lowest:
| Rank | Source | Where in code |
|---|---|---|
| 1 (highest) | `overrides` dict passed to `load_config` | Applied **last** |
| 2 | YAML at `$ECU_TESTS_CONFIG` | Applied if env points at an existing file |
| 3 | YAML at `<workspace>/config/test_config.yaml` | Fallback when env unset |
| 4 (lowest) | Built-in defaults | The starting `base` dict |
In the implementation, sources are *applied* in reverse order of that table
(lowest → highest). That's exactly what "highest precedence" means here:
each merge step overwrites earlier values for the same key, so the **last**
writer wins. The "1) ... 4)" comments inside `load_config` annotate by
precedence rank, not by call order.
## `_deep_update` — the merge semantics
```python
def _deep_update(base, updates):
for k, v in updates.items():
if isinstance(v, dict) and isinstance(base.get(k), dict):
base[k] = _deep_update(base[k], v)
else:
base[k] = v
return base
```
**Rules:**
- Dict-on-both-sides → recurse, so nested overlays don't clobber siblings.
This is what lets a YAML file override just `interface.bitrate` without
re-stating the rest of the `interface` block.
- Anything else (scalar, list, mismatched types) → replace wholesale.
- **Lists are replaced, not concatenated.** This is deliberate: list-concat
semantics surprise users who expect "set this list to X" to mean exactly that.
If concatenation is ever needed for a specific field, do it explicitly at the
call site, not in the merge primitive.
- Mutation happens in place; the return value is the same `base` object,
returned for chaining convenience (used when merging the PSU side-channel).
## `_to_dataclass` — defensive type coercion
YAML's type inference is generous: `"19200"` (quoted) comes through as a string,
`"true"` is not a bool, and hex-keyed mappings may arrive as either int or
string keys depending on the YAML writer. Rather than propagate that fuzziness,
the loader casts at the dataclass boundary:
```python
type=str(iface.get("type", "mock")).lower(),
channel=int(iface.get("channel", 1)),
bitrate=int(iface.get("bitrate", 19200)),
...
```
Casts that fail raise — and that's the right behavior. A config value that
can't be interpreted is a bug to surface early, not silently fall back from.
### Special-case: `frame_lengths` keys
`frame_lengths` maps a LIN frame ID (int) to a payload length (int). YAML can
write the key as a hex int (`0x0A`), a decimal int (`10`), or a quoted string
(`"0x0A"`). Coercion handles all three:
```python
key = int(k, 0) if isinstance(k, str) else int(k)
```
`int(k, 0)` with base `0` means "infer from prefix" — `"0x0A"` parses as hex,
`"10"` as decimal. Entries that fail to parse are skipped silently rather than
aborting the whole load, because one typo in a frame-length map shouldn't
prevent the rest of the configuration from coming up.
## PSU side-channel
Power-supply settings (COM port, baudrate, IDN substring) are typically
**bench-specific** and shouldn't be committed alongside test config. The loader
honors a dedicated overlay file just for the `power_supply` section:
- `$OWON_PSU_CONFIG` (env var → path) wins, else
- `<workspace>/config/owon_psu.yaml` if it exists.
This file is deep-merged into the existing `power_supply` block, so the main
YAML can still provide defaults (e.g. `idn_substr: OWON`) while the bench file
overrides only the parts that vary by machine. Recommended workflow:
```
config/test_config.yaml # committed; common defaults
config/owon_psu.yaml # gitignored; per-bench serial settings
```
## Dataclass schema quirks
### Forward reference: `EcuTestConfig.power_supply`
```python
@dataclass
class EcuTestConfig:
...
power_supply: "PowerSupplyConfig" = field(default_factory=lambda: PowerSupplyConfig())
@dataclass
class PowerSupplyConfig:
...
```
`PowerSupplyConfig` is referenced *before* it is defined. This works because:
1. `from __future__ import annotations` (PEP 563) turns *all* type annotations
into strings at module load time, so `"PowerSupplyConfig"` as an annotation
never triggers a name lookup.
2. The `default_factory` is a lambda, which defers evaluation of the bare name
`PowerSupplyConfig` until `EcuTestConfig()` is actually instantiated — by
which point the module body has finished executing and the name is bound.
The ordering is intentional: `EcuTestConfig` is the most-used type, so it lives
near the top of the file where readers find it first. If you ever drop the
`from __future__ import annotations` line, this ordering breaks; the lambda
default would still work, but the string annotation would need updating.
### Mutable defaults must use `default_factory`
`field(default_factory=dict)` (and `default_factory=InterfaceConfig`,
`default_factory=lambda: PowerSupplyConfig()`) is required because Python
shares default values across instances by default. Using `field(default={})`
on a dataclass field is a `ValueError` at class-creation time — the
`default_factory` form is the only correct way.
## Known wart: defaults live in two places
The defaults for every field exist twice:
1. As dataclass field defaults — e.g. `type: str = "mock"` on `InterfaceConfig`.
2. As entries in the `base` dict inside `load_config`.
Both must agree, and a drift between them would be silently wrong (the
loader's defaults would win for the YAML path, while the dataclass defaults
would win for callers that construct `InterfaceConfig()` directly).
Why it's still this way: the dict is needed because `_deep_update` operates on
dicts; the dataclass defaults are needed because callers may construct configs
directly without going through `load_config`. If a third construction path
appears, extract defaults to a single `DEFAULTS` mapping that both layers read
from.
## Test surface
Unit tests live in `tests/unit/test_config_loader.py`. They cover the
override precedence chain and the dataclass-construction defaults. When
adding a new field, add at minimum:
1. The dataclass field with a default.
2. The matching default in the `base` dict in `load_config`.
3. The matching cast line in `_to_dataclass`.
4. A unit test asserting it round-trips through `load_config(overrides=...)`.
Skipping (3) is the most common bug — the field will appear to work because
the dataclass default carries it, but YAML/env overlays for that field will
be silently dropped.