- Replace ecu_framework/config.py with ecu_framework/config/ package (loader.py + __init__.py re-exports). Public surface unchanged — every call site already uses 'from ecu_framework.config import ...' which works identically for a module and a package. Brings config into the same shape as lin/, power/, flashing/. - Enrich loader.py with module-level design notes (pipeline diagram, precedence rationale, "known wart" callout) and inline "why" comments: the EcuTestConfig forward-reference quirk, the int(k, 0) hex-key trick, _deep_update's mutate-in-place semantics, and the reason the in-memory overrides are applied last despite being precedence #1. - Add docs/23_config_loader_internals.md covering the merge semantics, type-coercion philosophy, dataclass ordering quirks, PSU side-channel, and the test-surface checklist (four places to touch when adding a new config field). - Fix the now-stale ecu_framework/config.py path in 01_run_sequence.md and DEVELOPER_COMMIT_GUIDE.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
195 lines
7.9 KiB
Markdown
195 lines
7.9 KiB
Markdown
# Configuration Loader Internals
|
|
|
|
This document explains *how* the configuration loader is implemented. For the
|
|
user-facing "what can I configure and where does it come from" perspective, see
|
|
[`02_configuration_resolution.md`](02_configuration_resolution.md). The two are
|
|
companions: `02` answers "what do I write in YAML?", this file answers "what
|
|
does the loader do with what I wrote?".
|
|
|
|
File: `ecu_framework/config/loader.py`
|
|
|
|
## Pipeline at a glance
|
|
|
|
```text
|
|
defaults (dict)
|
|
└─▶ merge YAML at $ECU_TESTS_CONFIG (if env set & file exists)
|
|
└─▶ merge YAML at <workspace>/config/test_config.yaml (if exists)
|
|
└─▶ merge PSU side-channel (env OWON_PSU_CONFIG or
|
|
<workspace>/config/owon_psu.yaml)
|
|
└─▶ merge in-memory overrides (caller-supplied)
|
|
└─▶ coerce types & build EcuTestConfig
|
|
```
|
|
|
|
Two layers run sequentially:
|
|
|
|
1. **Dict layer** — every source contributes a plain `dict`. They are merged
|
|
with `_deep_update` so nested sections combine key-by-key.
|
|
2. **Dataclass layer** — once merged, `_to_dataclass` casts the values to their
|
|
declared types and constructs `EcuTestConfig`. This is the boundary at which
|
|
YAML's type fuzziness stops.
|
|
|
|
Keeping the merge in the dict layer (rather than merging dataclasses) makes the
|
|
precedence story trivial: it's just a sequence of writes into one dict, and the
|
|
last writer wins.
|
|
|
|
## Precedence — and why it reads "backwards"
|
|
|
|
The `load_config` docstring lists precedence highest-to-lowest:
|
|
|
|
| Rank | Source | Where in code |
|
|
|---|---|---|
|
|
| 1 (highest) | `overrides` dict passed to `load_config` | Applied **last** |
|
|
| 2 | YAML at `$ECU_TESTS_CONFIG` | Applied if env points at an existing file |
|
|
| 3 | YAML at `<workspace>/config/test_config.yaml` | Fallback when env unset |
|
|
| 4 (lowest) | Built-in defaults | The starting `base` dict |
|
|
|
|
In the implementation, sources are *applied* in reverse order of that table
|
|
(lowest → highest). That's exactly what "highest precedence" means here:
|
|
each merge step overwrites earlier values for the same key, so the **last**
|
|
writer wins. The "1) ... 4)" comments inside `load_config` annotate by
|
|
precedence rank, not by call order.
|
|
|
|
## `_deep_update` — the merge semantics
|
|
|
|
```python
|
|
def _deep_update(base, updates):
|
|
for k, v in updates.items():
|
|
if isinstance(v, dict) and isinstance(base.get(k), dict):
|
|
base[k] = _deep_update(base[k], v)
|
|
else:
|
|
base[k] = v
|
|
return base
|
|
```
|
|
|
|
**Rules:**
|
|
|
|
- Dict-on-both-sides → recurse, so nested overlays don't clobber siblings.
|
|
This is what lets a YAML file override just `interface.bitrate` without
|
|
re-stating the rest of the `interface` block.
|
|
- Anything else (scalar, list, mismatched types) → replace wholesale.
|
|
- **Lists are replaced, not concatenated.** This is deliberate: list-concat
|
|
semantics surprise users who expect "set this list to X" to mean exactly that.
|
|
If concatenation is ever needed for a specific field, do it explicitly at the
|
|
call site, not in the merge primitive.
|
|
- Mutation happens in place; the return value is the same `base` object,
|
|
returned for chaining convenience (used when merging the PSU side-channel).
|
|
|
|
## `_to_dataclass` — defensive type coercion
|
|
|
|
YAML's type inference is generous: `"19200"` (quoted) comes through as a string,
|
|
`"true"` is not a bool, and hex-keyed mappings may arrive as either int or
|
|
string keys depending on the YAML writer. Rather than propagate that fuzziness,
|
|
the loader casts at the dataclass boundary:
|
|
|
|
```python
|
|
type=str(iface.get("type", "mock")).lower(),
|
|
channel=int(iface.get("channel", 1)),
|
|
bitrate=int(iface.get("bitrate", 19200)),
|
|
...
|
|
```
|
|
|
|
Casts that fail raise — and that's the right behavior. A config value that
|
|
can't be interpreted is a bug to surface early, not silently fall back from.
|
|
|
|
### Special-case: `frame_lengths` keys
|
|
|
|
`frame_lengths` maps a LIN frame ID (int) to a payload length (int). YAML can
|
|
write the key as a hex int (`0x0A`), a decimal int (`10`), or a quoted string
|
|
(`"0x0A"`). Coercion handles all three:
|
|
|
|
```python
|
|
key = int(k, 0) if isinstance(k, str) else int(k)
|
|
```
|
|
|
|
`int(k, 0)` with base `0` means "infer from prefix" — `"0x0A"` parses as hex,
|
|
`"10"` as decimal. Entries that fail to parse are skipped silently rather than
|
|
aborting the whole load, because one typo in a frame-length map shouldn't
|
|
prevent the rest of the configuration from coming up.
|
|
|
|
## PSU side-channel
|
|
|
|
Power-supply settings (COM port, baudrate, IDN substring) are typically
|
|
**bench-specific** and shouldn't be committed alongside test config. The loader
|
|
honors a dedicated overlay file just for the `power_supply` section:
|
|
|
|
- `$OWON_PSU_CONFIG` (env var → path) wins, else
|
|
- `<workspace>/config/owon_psu.yaml` if it exists.
|
|
|
|
This file is deep-merged into the existing `power_supply` block, so the main
|
|
YAML can still provide defaults (e.g. `idn_substr: OWON`) while the bench file
|
|
overrides only the parts that vary by machine. Recommended workflow:
|
|
|
|
```
|
|
config/test_config.yaml # committed; common defaults
|
|
config/owon_psu.yaml # gitignored; per-bench serial settings
|
|
```
|
|
|
|
## Dataclass schema quirks
|
|
|
|
### Forward reference: `EcuTestConfig.power_supply`
|
|
|
|
```python
|
|
@dataclass
|
|
class EcuTestConfig:
|
|
...
|
|
power_supply: "PowerSupplyConfig" = field(default_factory=lambda: PowerSupplyConfig())
|
|
|
|
@dataclass
|
|
class PowerSupplyConfig:
|
|
...
|
|
```
|
|
|
|
`PowerSupplyConfig` is referenced *before* it is defined. This works because:
|
|
|
|
1. `from __future__ import annotations` (PEP 563) turns *all* type annotations
|
|
into strings at module load time, so `"PowerSupplyConfig"` as an annotation
|
|
never triggers a name lookup.
|
|
2. The `default_factory` is a lambda, which defers evaluation of the bare name
|
|
`PowerSupplyConfig` until `EcuTestConfig()` is actually instantiated — by
|
|
which point the module body has finished executing and the name is bound.
|
|
|
|
The ordering is intentional: `EcuTestConfig` is the most-used type, so it lives
|
|
near the top of the file where readers find it first. If you ever drop the
|
|
`from __future__ import annotations` line, this ordering breaks; the lambda
|
|
default would still work, but the string annotation would need updating.
|
|
|
|
### Mutable defaults must use `default_factory`
|
|
|
|
`field(default_factory=dict)` (and `default_factory=InterfaceConfig`,
|
|
`default_factory=lambda: PowerSupplyConfig()`) is required because Python
|
|
shares default values across instances by default. Using `field(default={})`
|
|
on a dataclass field is a `ValueError` at class-creation time — the
|
|
`default_factory` form is the only correct way.
|
|
|
|
## Known wart: defaults live in two places
|
|
|
|
The defaults for every field exist twice:
|
|
|
|
1. As dataclass field defaults — e.g. `type: str = "mock"` on `InterfaceConfig`.
|
|
2. As entries in the `base` dict inside `load_config`.
|
|
|
|
Both must agree, and a drift between them would be silently wrong (the
|
|
loader's defaults would win for the YAML path, while the dataclass defaults
|
|
would win for callers that construct `InterfaceConfig()` directly).
|
|
|
|
Why it's still this way: the dict is needed because `_deep_update` operates on
|
|
dicts; the dataclass defaults are needed because callers may construct configs
|
|
directly without going through `load_config`. If a third construction path
|
|
appears, extract defaults to a single `DEFAULTS` mapping that both layers read
|
|
from.
|
|
|
|
## Test surface
|
|
|
|
Unit tests live in `tests/unit/test_config_loader.py`. They cover the
|
|
override precedence chain and the dataclass-construction defaults. When
|
|
adding a new field, add at minimum:
|
|
|
|
1. The dataclass field with a default.
|
|
2. The matching default in the `base` dict in `load_config`.
|
|
3. The matching cast line in `_to_dataclass`.
|
|
4. A unit test asserting it round-trips through `load_config(overrides=...)`.
|
|
|
|
Skipping (3) is the most common bug — the field will appear to work because
|
|
the dataclass default carries it, but YAML/env overlays for that field will
|
|
be silently dropped.
|