Concepts

Background material on architecture, design decisions, and Meridian integration boundaries.

Why meridian-tools exists

Google Meridian is the modelling engine. meridian-tools is the workflow layer around that engine. It makes a Meridian modelling project easier to review, rerun, compare, hand over, and refresh without forking or modifying Meridian.

The package exists for agency marketing mix modelling (MMM) work where a fitted model is not the only deliverable. A team also needs to defend why one specification was selected, show exactly what was run, preserve the artefacts needed for review, and rerun or compare the work after the original notebook state has disappeared.

meridian-tools is therefore not a replacement for Meridian. It is the operating standard around Meridian.

The problem it solves

A bare modelling workflow can fit a Meridian model, but an agency workflow has more obligations:

defend a model choice on out-of-sample grounds, not only visual fit;
declare validation before fitting, rather than after seeing results;
keep the authored configuration and the resolved execution configuration;
record which input data, library versions, and artefacts produced the result;
hand a stable run directory to another analyst, reviewer, or client team;
refresh or compare stored runs without relying on notebook memory.

Those obligations are not theoretical. They are the difference between “this model ran on my machine” and “this model can be reviewed six months later”.

The evaluation gap

In-sample fit metrics such as R² or MAPE are useful diagnostic summaries, but they are not sufficient model-selection criteria. They are computed on data the model has already seen. A more flexible specification can improve in-sample fit while degrading on future weeks.

Expected log predictive density (ELPD) measures expected predictive performance on data the model has not seen, and PSIS-LOO and WAIC estimate it from the fitted posterior. That addresses the relevant question when choosing between candidate MMM specifications: which specification is most likely to generalise?

meridian-tools adds a compatibility-aware model-selection layer on top of Meridian and ArviZ:

Need	`meridian-tools` surface
Compute PSIS-LOO	`compute_loo(...)`
Compute WAIC	`compute_waic(...)`
Compare candidate models	`compare_models(...)`
Inspect pointwise reliability	`loo_pointwise.csv` with `pareto_k`
Record incompatibility	`model_selection_status.json` with a reason code
Record ArviZ warnings	`model_selection_warnings.json`

The important integration point is log-likelihood reconstruction. ArviZ needs a pointwise log_likelihood group before it can compute LOO or WAIC. A fitted Meridian model does not expose that group as a ready-to-use workflow artefact. meridian-tools reconstructs it for supported Meridian versions and passes a temporary InferenceData copy to ArviZ. The original Meridian model is not mutated.

Model comparison is still a statistical judgement. In the comparison table, elpd_diff must be read against dse, the standard error of the difference. If the ELPD difference is small relative to its uncertainty, prefer the simpler or more interpretable specification rather than treating rank as a mechanical decision rule.
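
As an illustration of reading elpd_diff against dse, the sketch below flags which candidates are distinguishable from the top-ranked model. The ComparisonRow container, the is_distinguishable helper, the example numbers, and the threshold of two standard errors are all assumptions made for this example. None of them is part of meridian-tools or ArviZ, and the flag is a screening aid rather than a decision rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComparisonRow:
    """One row of a comparison table (hypothetical container for this sketch)."""

    name: str
    elpd_diff: float  # ELPD difference from the top-ranked model
    dse: float        # standard error of that difference


def is_distinguishable(row: ComparisonRow, threshold: float = 2.0) -> bool:
    """Return True when |elpd_diff| exceeds `threshold` standard errors.

    The default threshold is an illustrative assumption, not a package rule.
    """
    if row.dse == 0.0:
        # The top-ranked model compared with itself has nothing to test.
        return False
    return abs(row.elpd_diff) > threshold * row.dse


# Made-up numbers, used only to exercise the helper.
rows = [
    ComparisonRow("spec_a", elpd_diff=0.0, dse=0.0),
    ComparisonRow("spec_b", elpd_diff=3.1, dse=4.0),
    ComparisonRow("spec_c", elpd_diff=25.0, dse=6.0),
]
flags = {row.name: is_distinguishable(row) for row in rows}
```

Under these assumptions spec_b is not separable from spec_a, so the simpler or more interpretable of the two would be preferred on other grounds.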

A principled Bayesian workflow

meridian-tools keeps ownership explicit. Meridian remains responsible for the model. The wrapper owns the workflow controls around the model.

Workflow stage	Where it happens	Owner
Prior specification and contract validation	YAML config and `priors.py`	`meridian-tools`
Holdout planning before fitting	`validation` config and `cv.py`	`meridian-tools`
Posterior sampling	Meridian	Meridian
Meridian diagnostics	Meridian outputs staged by the runner	Meridian, packaged by `meridian-tools`
Out-of-sample information criteria	`compute_loo(...)`, `compute_waic(...)`	`meridian-tools`
Candidate model comparison	`compare_models(...)`	`meridian-tools`
Stored run refresh and comparison	`lifecycle.py`	`meridian-tools`

Declaring validation in configuration is itself a control: it prevents the holdout window from being chosen after results are known. The blocked_tail strategy executes one contiguous tail holdout. The rolling_origin strategy materialises multiple expanding-window splits through the Python API. In both cases, validation fits are kept separate from the final full-sample production fit.

Reproducibility and governance

Every completed run is a directory that can be inspected without the original notebook. The key files are:

Artefact	What it answers
`00_run_metadata/config.source.yaml`	What did the analyst author?
`00_run_metadata/config.resolved.yaml`	What did the system actually run?
`00_run_metadata/input_data_provenance.json`	Which data snapshot was loaded?
`run_manifest.json`	Which stages ran, with which versions and artefacts?
`30_model_assessment/*`	What diagnostics and model-selection evidence were produced?

The provenance record includes the authored path, resolved path, SHA-256 hash, row count, column count, ordered columns, file size, modification time, and provenance schema version. The manifest records meridian-tools and Meridian versions, timestamps, run status, stage status, and a validated inventory of top-level artefacts.
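
A minimal sketch of how a record with those provenance fields could be assembled for a CSV input, using only the standard library. The function name, the dictionary keys, and the schema version value are assumptions for illustration. The actual layout of input_data_provenance.json is not shown in this document.

```python
import csv
import hashlib
import tempfile
from pathlib import Path

PROVENANCE_SCHEMA_VERSION = 1  # hypothetical value


def build_provenance(authored_path: str, base_dir: Path) -> dict:
    """Describe one CSV input: where it came from and what it contained."""
    resolved = (base_dir / authored_path).resolve()
    digest = hashlib.sha256(resolved.read_bytes()).hexdigest()
    with resolved.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        columns = next(reader, [])          # header row, in file order
        row_count = sum(1 for _ in reader)  # data rows only
    stat = resolved.stat()
    return {
        "schema_version": PROVENANCE_SCHEMA_VERSION,
        "authored_path": authored_path,
        "resolved_path": str(resolved),
        "sha256": digest,
        "row_count": row_count,
        "column_count": len(columns),
        "columns": columns,
        "file_size_bytes": stat.st_size,
        "modified_time": stat.st_mtime,
    }


# Exercise the helper on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    base_dir = Path(tmp)
    payload = b"geo,time,spend\nA,2024-01-01,10\nA,2024-01-08,12\n"
    (base_dir / "data.csv").write_bytes(payload)
    record = build_provenance("data.csv", base_dir)
```

Hashing the raw bytes rather than the parsed rows means any change to the file, including whitespace or column order, produces a different digest.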

That gives a reviewer three concrete answers:

What was modelled? The archived source config and data provenance.
What did the system execute? The resolved config and manifest.
Can it be rerun or compared later? The run record, lifecycle helpers, seeds, pinned dependency boundary, and stored artefact paths.

New runs, which write manifest version 4, also validate completed artefact paths before the final manifest is written. Validation requires every recorded artefact path to be a relative path under the run directory that resolves to an existing regular file. A completed manifest should not point to missing files or paths outside the run directory.

Why not modify Meridian directly?

Meridian remains an unmodified upstream dependency. The current supported boundary pins google-meridian[schema]==1.5.3 and constrains related runtime dependencies where needed for that version.

A private fork or local patch set would increase upgrade risk, complicate support, and blur the boundary between Google’s modelling library and agency-specific workflow requirements. Keeping the split explicit means a Meridian upgrade is a version change, a compatibility review, and a release gate, not a merge from a private modelling fork.

This is also the right separation of concerns. Meridian should focus on the MMM model. meridian-tools should focus on the repeatable agency workflow around that model.

Capability map

This table describes the repository’s supported Meridian 1.5.3 workflow boundary. It is not a claim about every possible Meridian usage pattern or future Meridian release.

Capability	Meridian role	`meridian-tools` role
Core MMM fitting	Owns the model and posterior sampling	Delegates to Meridian
Model serialisation	Provides schema serialisation	Stages `meridian_model.binpb` in the run directory
Project configuration	Accepts model inputs through Meridian APIs	Provides a typed YAML project surface
Validation planning	Accepts holdout masks	Builds blocked-tail, rolling-origin, and authored-holdout run specs
Diagnostics	Provides diagnostic outputs and plots	Packages diagnostics into stable artefacts
LOO and WAIC	Supplies fitted model state	Reconstructs log likelihood and calls ArviZ
Model comparison	Not the workflow owner	Provides `compare_models(...)` and run artefacts
Run provenance	Not the workflow owner	Writes config archives, data provenance, and manifests
Stored run lifecycle	Not the workflow owner	Loads, compares, and refreshes stored run records
Client handoff	Analyst-owned without a wrapper	Provides a predictable staged output directory

When should a team use this?

Use meridian-tools when at least one of these is true:

you need to compare candidate Meridian specifications;
the model will be refreshed later;
another analyst must review or rerun the work;
the result will be handed to a client or internal governance process;
you need a repeatable CLI or YAML workflow rather than notebook-only state.

Do not add the wrapper for its own sake. A single exploratory Meridian fit with no comparison, no refresh cadence, and no handoff requirement may not need this layer. The value appears when the model becomes part of an operating process. For a shorter decision note, see the adoption brief.

What this does not do

The boundary is deliberately narrow.

It does not replace Meridian’s model or sampler.
It does not compute convergence diagnostics such as R-hat, ESS, or divergences itself. It stages Meridian diagnostic outputs and adds workflow artefacts around them.
It does not provide a full prior-predictive checking policy. It can request Meridian prior sampling through fit.sample_prior_draws, validates configured prior contracts, and records resolved prior distributions, but it does not add a wrapper-owned pass/fail rule for prior predictive checks.
It does not make LOO or WAIC available for holdout-fitted models. Those runs record model_selection_status.json because comparing holdout-fit ELPD to full-fit ELPD would be statistically ambiguous.
It does not make a run perfectly reproducible across all hardware, dependency versions, random seeds, or future Meridian releases. It records the conditions needed for a bounded, reviewable rerun.

These limits are intentional. A wrapper that hides Meridian’s responsibilities would be harder to trust. meridian-tools is valuable because it makes the boundary visible.

The short version

Meridian gives the agency a modelling engine. meridian-tools gives the agency a repeatable scientific workflow around that engine: validation plans, model-selection evidence, provenance, manifests, stable artefacts, and lifecycle operations.

Adoption brief for agency teams

This page is the short version for teams deciding whether to use meridian-tools around Google Meridian.

Recommendation

Use meridian-tools when a Meridian model must be reviewed, compared, refreshed, or handed to another team. Do not use it merely because it exists. A one-off exploratory Meridian fit with no model comparison, no refresh cadence, and no handoff requirement may not need this layer.

What breaks without the wrapper?

Agency requirement	Risk without `meridian-tools`
Defensible model choice	Candidate specifications can be selected on in-sample fit or visual judgement alone.
Reviewable execution	The authored config, resolved config, data provenance, and tool versions can be scattered across notebooks and local state.
Client or internal handoff	The output set can depend on analyst habits rather than a stable directory contract.
Refresh cadence	A later run may not know exactly which config, data snapshot, validation window, or package versions created the earlier result.
Compatibility management	Meridian upgrades can silently affect wrapper-owned seams such as schema serialisation or log-likelihood reconstruction unless they are gated.

What does adoption cost?

Cost	Practical meaning
Dependency boundary	Use the supported Meridian-compatible environment rather than arbitrary package versions.
YAML config	Each project needs one reviewed config that defines data mapping, model settings, fit settings, validation, and exports.
Run directory convention	Analysts inspect staged artefacts under a predictable output directory rather than ad hoc notebook outputs.
Validation discipline	Holdout choices are declared before fitting and validation fits are not reused as final production fits.

What does the team get?

Need	Evidence produced
What was modelled?	`config.source.yaml` and input-data provenance.
What actually ran?	`config.resolved.yaml` and `run_manifest.json`.
Which model should we prefer?	LOO/WAIC summaries, pointwise diagnostics, and `model_comparison.csv` for compatible final fits.
Can another analyst inspect it?	Staged assessment, decomposition, response-curve, optimisation, and manifest artefacts.
Can it be refreshed later?	Lifecycle helpers that load, compare, and refresh stored run records.

What does it not do?

meridian-tools does not replace Meridian’s model, sampler, or diagnostics. It does not make every run perfectly reproducible across all hardware and future dependency versions. It records the conditions needed for a bounded, reviewable rerun.

It also does not make LOO or WAIC meaningful for holdout-fitted models. Those runs deliberately record model_selection_status.json instead.

Decision rule

Adopt the wrapper when the model is part of an operating process. Skip it when the work is a disposable exploration.

If the model is going to a client, an internal reviewer, or a future refresh, the wrapper is not overhead. It is the audit trail.

Architecture

meridian-tools is a companion package designed for agency teams that use Google Meridian as their client-facing MMM (Marketing Mix Modelling) engine. It provides a stricter, more reproducible workflow around Meridian without forking the upstream library.

Core philosophy

No forking — meridian-tools strictly wraps Meridian. It does not modify Meridian’s internal code or model implementations.
Bounded reproducibility — Runs are driven by typed YAML configurations, archived source/resolved configs, manifest metadata, and input-data provenance. These records support repeatable execution in the documented dependency environment, but they do not guarantee identical posterior draws across all hardware, dependency versions, random seeds, or Meridian changes.
Structured workflow — The package enforces a staged execution pipeline (validation, model fit, assessment, decomposition, response curves, optimisation).
Lifecycle management — Runs are treated as immutable artefacts with rich metadata, allowing for easy comparison, refreshing, and storage.

Module map

meridian_tools/
├── __init__.py          Lazy-loading package exports
├── artifacts.py         Manifest and JSON helpers
├── cli.py               CLI entry point (argparse)
├── config.py            Pydantic YAML models
├── cv.py                Validation split logic
├── demo.py              Bundled demo discovery
├── diagnostics.py       Diagnostics export
├── exports.py           Meridian analysis surface wrappers
├── launcher.py          Run execution wrapper
├── lifecycle.py         Post-run record management
├── log_likelihood.py    Log-likelihood reconstruction adapter
├── model_selection.py   ArviZ LOO/WAIC wrappers
├── terminal.py          CLI presentation and warning grouping
└── version.py           Static version

Layered import design

Meridian and TensorFlow are never imported at module level in the configuration, validation, or CLI layers. This means lightweight operations do not pay the cost of importing the heavy modelling stack:

Operation	Imports loaded
`meridian-tools --help`	`pydantic`, `yaml`
`load_yaml_config(path)`	`pydantic`, `yaml`
`build_validation_plan(...)`	`numpy`
`run_pipeline(...)`	Everything (Meridian, TF, ArviZ, etc.)

The __init__.py uses __getattr__-based lazy loading so that import meridian_tools does not trigger heavy dependency imports.
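
The lazy-loading pattern can be sketched with a module-level __getattr__ hook (PEP 562). To keep the example runnable as a single file, it builds a throwaway module object instead of a real package, and the export mapping points at standard-library modules. The helper name and the mapping are assumptions, not the contents of the real meridian_tools/__init__.py.

```python
import importlib
import types


def make_lazy_package(name: str, exports: dict[str, str]) -> types.ModuleType:
    """Build a module whose exports are imported on first attribute access.

    `exports` maps an exported name to the module that defines it.
    """
    package = types.ModuleType(name)

    def __getattr__(attr: str):
        try:
            module_name = exports[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(module_name), attr)
        setattr(package, attr, value)  # cache so the hook runs once per name
        return value

    # Python consults a module's own __getattr__ when normal lookup fails.
    package.__getattr__ = __getattr__
    return package


# Hypothetical export table; nothing is imported until a name is accessed.
demo = make_lazy_package("demo_pkg", {"dumps": "json", "sqrt": "math"})
```

In a real package the same hook is written as a plain function named __getattr__ at the top level of __init__.py, so that importing the package alone loads nothing heavy.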

Pipeline execution model

The runner executes stages sequentially. For each stage, the runner:

Creates a StageRecord and appends it to the in-memory manifest.
Calls the stage function, which returns a dict[str, Path] of artefacts.
Normalises artefact paths to be relative to the run directory.
Writes the updated manifest to disk.

This design means a crash mid-pipeline leaves a readable partial manifest on disk. Because the manifest is written only after a stage function returns, the last entry in the on-disk stages array is the last successfully completed stage.
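
The per-stage loop described above can be sketched as follows. This is a simplified illustration, not the real runner: the run_stages function, the dictionary-based stage record, the "artefacts" key, and the "running" status value are assumptions. Only the file name run_manifest.json and the "completed" status come from this documentation.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

StageFn = Callable[[Path], dict[str, Path]]


def run_stages(run_dir: Path, stages: list[tuple[str, StageFn]]) -> None:
    """Run stages in order, persisting the manifest after each one."""
    manifest = {"status": "running", "stages": []}
    manifest_path = run_dir / "run_manifest.json"
    for name, stage_fn in stages:
        record = {"name": name, "artefacts": {}}
        manifest["stages"].append(record)  # in-memory only at this point
        artefacts = stage_fn(run_dir)      # may raise and abort the run
        record["artefacts"] = {
            key: path.relative_to(run_dir).as_posix()
            for key, path in artefacts.items()
        }
        manifest_path.write_text(json.dumps(manifest, indent=2))
    manifest["status"] = "completed"
    manifest_path.write_text(json.dumps(manifest, indent=2))


def _write_metadata(run_dir: Path) -> dict[str, Path]:
    target = run_dir / "00_run_metadata" / "config.source.yaml"
    target.parent.mkdir(parents=True)
    target.write_text("project: demo\n")
    return {"config_source": target}


def _crash(run_dir: Path) -> dict[str, Path]:
    raise RuntimeError("simulated crash during sampling")


with tempfile.TemporaryDirectory() as tmp:
    demo_run_dir = Path(tmp)
    try:
        run_stages(
            demo_run_dir,
            [("00_run_metadata", _write_metadata), ("20_model_fit", _crash)],
        )
    except RuntimeError:
        pass
    on_disk = json.loads((demo_run_dir / "run_manifest.json").read_text())
```

After the simulated crash, the manifest on disk lists only the first stage, which is the behaviour the paragraph above relies on.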

┌─────────────────────┐
│  00_run_metadata    │  Archive source + resolved configs
├─────────────────────┤
│  10_validation      │  Write validation spec (if applicable)
├─────────────────────┤
│  20_model_fit       │  Build data → build model → sample posterior
├─────────────────────┤
│  30_model_assessment│  Diagnostics + model selection + summary
├─────────────────────┤
│  40_decomposition   │  Summary metrics (NetCDF + CSV)
├─────────────────────┤
│  60_response_curves │  Response curves (if configured)
├─────────────────────┤
│  70_optimisation    │  Budget optimisation (if configured)
└─────────────────────┘

The numbering gap at 50 reserves space for future stages without renumbering.

Configuration architecture

The separation between authored YAML and runtime-only config is strict:

MeridianToolsConfig — Pydantic model for the YAML file. Owns project metadata, data paths, model spec, fit settings, validation strategy, and export switches.
PipelineRunConfig — Frozen dataclass for runtime options. Owns output directory, run name, and concrete validation spec.

The runner writes two config copies to each run directory:

config.source.yaml — Verbatim copy of the input YAML.
config.resolved.yaml — After relative path resolution. Never includes runtime-only fields.

Artefact path normalisation

All artefact paths in manifests are stored relative to the run directory and validated as regular files beneath that directory. New runs, which write manifest version 4, reject absolute paths, lexical .. components, paths that resolve outside the run directory, directories, missing paths, and special files. This makes run directories portable while keeping manifest consumers fail-closed.
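
A minimal sketch of the checks listed above, using pathlib. The function name and error messages are assumptions. The real validator may order or word the checks differently.

```python
import tempfile
from pathlib import Path


def validate_artefact_path(run_dir: Path, recorded: str) -> Path:
    """Return the resolved file for `recorded`, or raise ValueError."""
    path = Path(recorded)
    if path.is_absolute():
        raise ValueError(f"absolute path not allowed: {recorded}")
    if ".." in path.parts:
        raise ValueError(f"parent reference not allowed: {recorded}")
    root = run_dir.resolve()
    resolved = (root / path).resolve()  # follows symlinks
    if not resolved.is_relative_to(root):  # Python 3.9+
        raise ValueError(f"path escapes the run directory: {recorded}")
    if not resolved.is_file():
        # Covers missing paths, directories, and special files.
        raise ValueError(f"not an existing regular file: {recorded}")
    return resolved


def _outcome(run_dir: Path, recorded: str) -> str:
    try:
        validate_artefact_path(run_dir, recorded)
    except ValueError:
        return "rejected"
    return "accepted"


with tempfile.TemporaryDirectory() as tmp:
    demo_run = Path(tmp) / "run"
    (demo_run / "30_model_assessment").mkdir(parents=True)
    good = demo_run / "30_model_assessment" / "loo_pointwise.csv"
    good.write_text("pareto_k\n0.1\n")
    (Path(tmp) / "outside.txt").write_text("x")
    outcomes = {
        "relative_file": _outcome(demo_run, "30_model_assessment/loo_pointwise.csv"),
        "parent_reference": _outcome(demo_run, "../outside.txt"),
        "directory": _outcome(demo_run, "30_model_assessment"),
        "missing": _outcome(demo_run, "missing.json"),
        "absolute": _outcome(demo_run, str(good)),
    }
```

Resolving before the containment check is what catches a symlink inside the run directory that points outside it.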

The lifecycle layer resolves accepted paths back to absolute paths at load time.

Meridian coupling boundaries

Coupling level	Modules	Surface used
Public API	`runner.py`, `exports.py`, `diagnostics.py`	`Meridian`, `ModelSpec`, `CsvDataLoader`, `Analyzer`, `Summarizer`, `BudgetOptimizer`
Semi-public	`log_likelihood.py`, `model_selection.py`, `exports.py`	`model_context`, `inference_data`, `input_data`, `posterior_sampler_callable`
Private	`log_likelihood.py`	`_get_joint_dist_unpinned`, `_prepare_latents_for_reconstruction`, `_reconstruct_posteriors`

The private-API coupling is confined to log_likelihood.py and guarded by a compatibility check that raises a structured ModelSelectionError instead of failing the run. See Meridian integration for details.

Data flow

Input — A typed YAML file defines the entire run scope.
Initialisation — The runner resolves the config and creates a timestamped run directory.
Execution — The pipeline steps through stages, maintaining a central state dictionary with the fitted model and intermediate results.
Export — Each stage writes specific artefacts to disk within the run directory.
Finalisation — The manifest is completed with status: "completed" and finished_at, locking the run state.
Lifecycle — Downstream processes or analysts consume artefacts or use lifecycle tools to compare, refresh, or audit runs.

Design decisions

This document records the key design decisions in meridian-tools and the reasoning behind them. It is intended for maintainers and contributors who need to understand why things are built the way they are.

No IID cross-validation

Decision: meridian-tools does not implement random-shuffle or naive k-fold cross-validation.

Reasoning: MMM data is time series. Random IID splits break temporal structure, leading to data leakage where future observations inform training and past observations appear in the test set. This produces optimistic accuracy estimates that do not reflect real-world forecasting performance.

The package provides two time-respecting alternatives:

Blocked tail — reserves the most recent observations as a single test block.
Rolling origin — expanding-window forward-chaining that respects temporal ordering at every split.
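
As a small illustration of the blocked-tail idea, the sketch below builds a boolean holdout mask over time periods. The function name and signature are assumptions for this example and are not the cv.py API.

```python
def blocked_tail_holdout(n_times: int, test_size: int) -> list[bool]:
    """Return a mask that is True for the last `test_size` time periods."""
    if not 0 < test_size < n_times:
        raise ValueError("test_size must be positive and smaller than n_times")
    first_test_index = n_times - test_size
    return [index >= first_test_index for index in range(n_times)]


# Eight weekly observations with the final two held out.
mask = blocked_tail_holdout(n_times=8, test_size=2)
```

Every training period precedes every test period, so no future observation can inform the fit that is being evaluated.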

Non-overlapping rolling-origin test windows

Decision: step_size must equal test_size for rolling-origin splits.

Reasoning: Overlapping test windows would mean the same observation appears in multiple test sets. This violates the independence assumption needed for comparing validation scores across splits and complicates the interpretation of aggregate metrics. Non-overlapping windows ensure each observation is evaluated at most once across the split plan.

Minimum two splits for rolling origin

Decision: build_rolling_origin_splits requires at least two splits.

Reasoning: A single rolling-origin split is functionally identical to a blocked-tail holdout and provides no comparative signal. If your data only supports one split, use blocked_tail instead — it communicates the intent more clearly.
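
The two rolling-origin constraints can be sketched together. This is not build_rolling_origin_splits itself: the function name, the initial_train_size parameter, and the use of range objects for the index windows are assumptions for illustration.

```python
def sketch_rolling_origin_splits(
    n_times: int,
    initial_train_size: int,
    test_size: int,
    step_size: int,
) -> list[tuple[range, range]]:
    """Return (train, test) index windows with an expanding training set."""
    if step_size != test_size:
        raise ValueError("step_size must equal test_size (no overlapping tests)")
    splits = []
    train_end = initial_train_size
    while train_end + test_size <= n_times:
        # Training always starts at 0, so the window expands at every split.
        splits.append((range(0, train_end), range(train_end, train_end + test_size)))
        train_end += step_size
    if len(splits) < 2:
        raise ValueError("rolling origin needs at least two splits")
    return splits


splits = sketch_rolling_origin_splits(
    n_times=10, initial_train_size=4, test_size=2, step_size=2
)

# Data that only supports one split is rejected rather than silently accepted.
try:
    sketch_rolling_origin_splits(
        n_times=7, initial_train_size=4, test_size=2, step_size=2
    )
    single_split_rejected = False
except ValueError:
    single_split_rejected = True
```

With ten periods, the test windows are [4, 5], [6, 7], and [8, 9], so no period is scored twice.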

Holdout restriction for model selection

Decision: LOO and WAIC are only available for models where holdout_id is None.

Reasoning: LOO and WAIC estimate expected log predictive density (ELPD) using the full observed likelihood surface. A model fitted with a holdout mask has a modified likelihood that excludes held-out observations. Computing LOO on this truncated likelihood would produce ELPD estimates that are not comparable to those from full-sample fits.

The correct workflow is:

Use validation splits for candidate evaluation.
Select the best specification based on holdout performance.
Refit the chosen specification on the full dataset.
Compute LOO/WAIC on the full-sample fit for model quality reporting.

Separation of validation fits and final fits

Decision: Validation runs and final production fits are separate pipeline executions that produce separate run directories.

Reasoning: A validation fit is trained on a subset of the data. Its posterior reflects that subset and should not be used as the production artefact. Keeping them as separate runs prevents accidental use of a validation fit for downstream analysis or reporting.

Lazy imports for CLI responsiveness

Decision: Heavy dependencies (TensorFlow, NumPy, Meridian, ArviZ) are not imported at module level in the config, CLI, or validation layers.

Reasoning: TensorFlow alone takes several seconds to import. The CLI must respond instantly for --help and --list operations. The __init__.py uses __getattr__-based lazy loading, and the test suite verifies that build_parser() only loads pydantic and yaml.

Pydantic `extra="forbid"` everywhere

Decision: All configuration models reject unexpected keys.

Reasoning: Silent acceptance of unknown keys is a common source of misconfiguration in YAML-driven tools. A typo like export_pridictive_accuracy would be silently ignored without extra="forbid", leading to unexpected default behaviour. Strict rejection catches these errors at config load time with clear error messages.
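
The effect of strict key rejection can be shown without Pydantic. The sketch below is a standard-library analogue built on a dataclass, not the package's configuration code, and the ExportSwitches class and its field names are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ExportSwitches:
    """Hypothetical export switches used only for this illustration."""

    export_predictive_accuracy: bool = False
    export_response_curves: bool = False


def load_strict(cls, raw: dict):
    """Build `cls` from `raw`, refusing any key the class does not declare."""
    allowed = {field.name for field in fields(cls)}
    unknown = sorted(set(raw) - allowed)
    if unknown:
        raise ValueError(f"unexpected keys for {cls.__name__}: {unknown}")
    return cls(**raw)


valid = load_strict(ExportSwitches, {"export_predictive_accuracy": True})

# A misspelt key fails at load time instead of falling back to the default.
try:
    load_strict(ExportSwitches, {"export_pridictive_accuracy": True})
    typo_message = None
except ValueError as error:
    typo_message = str(error)
```

Without the check, the misspelt key would be dropped and export_predictive_accuracy would silently stay False.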

Relative artefact paths in manifests

Decision: All artefact paths in run_manifest.json are stored relative to the run directory.

Reasoning: Absolute paths would tie run directories to a specific machine or filesystem layout. Relative paths make run directories portable — they can be copied, archived, or moved between machines without breaking the manifest contract.

Non-destructive lifecycle operations

Decision: refresh_run creates a new sibling directory rather than overwriting the source.

Reasoning: Overwriting a validated production run would destroy the audit trail. Creating a sibling preserves the original for comparison and rollback. The lifecycle layer explicitly validates that source directories are not mutated by refresh operations.

Manifest-per-stage persistence

Decision: The manifest is written to disk after each stage completes, not only at the end of the pipeline.

Reasoning: MCMC sampling can run for minutes to hours. If the process crashes or is killed during a later stage, the partial manifest on disk reflects what completed successfully. This aids debugging and allows partial runs to be inspected without special tooling.

Stage numbering with gaps

Decision: Pipeline stages use numbers 00, 10, 20, 30, 40, 60, 70 with a gap at 50.

Reasoning: The gaps allow future stages to be inserted at natural positions (e.g. a stage 50 for custom analysis) without renumbering existing stages. Renumbering would break backward compatibility with stored manifests and any downstream tooling that references stage names.

Config source vs. resolved archival

Decision: Both the verbatim source YAML and the resolved YAML are archived in every run directory.

Reasoning: The source YAML shows what the analyst authored (including relative paths). The resolved YAML shows the runtime interpretation (absolute paths, defaults applied). Both are needed for reproducibility:

The source is needed to understand intent.
The resolved config is needed to reproduce the exact execution.

Runtime-only fields (output_dir, run_name, validation_spec) are deliberately excluded from the resolved config because they are not part of the reproducible model specification.

Structured model selection errors

Decision: Model selection failures produce ModelSelectionError with a machine-readable reason_code rather than generic exceptions.

Reasoning: The pipeline needs to distinguish between “model selection is not possible for this run type” (expected) and “something is broken” (unexpected). Structured reason codes allow:

The runner to write model_selection_status.json without failing the run.
The lifecycle layer to compare model selection availability across runs.
Downstream consumers to programmatically handle different failure modes.
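
A minimal sketch of the pattern, assuming a simplified exception class and runner step. Only the names ModelSelectionError, reason_code, and model_selection_status.json come from this documentation. The function names, the JSON keys, and the holdout reason code are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


class ModelSelectionError(Exception):
    """Expected failure of model selection, tagged with a reason code."""

    def __init__(self, message: str, *, reason_code: str) -> None:
        super().__init__(message)
        self.reason_code = reason_code


def compute_model_selection(holdout_id: Optional[object]) -> dict:
    """Stand-in for the LOO/WAIC step; refuses holdout-fitted models."""
    if holdout_id is not None:
        raise ModelSelectionError(
            "LOO/WAIC are not computed for holdout-fitted models",
            reason_code="holdout_fit_unsupported",  # hypothetical code
        )
    return {"available": True}


def run_model_selection_step(run_dir: Path, holdout_id: Optional[object]) -> dict:
    """Record an expected failure as a status file; let other errors propagate."""
    try:
        return compute_model_selection(holdout_id)
    except ModelSelectionError as error:
        status = {"available": False, "reason_code": error.reason_code}
        (run_dir / "model_selection_status.json").write_text(json.dumps(status))
        return status


with tempfile.TemporaryDirectory() as tmp:
    status = run_model_selection_step(Path(tmp), holdout_id=[False, True])
    written = json.loads((Path(tmp) / "model_selection_status.json").read_text())
```

Catching only ModelSelectionError keeps genuine bugs visible: any other exception still fails the run.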

Meridian integration

This document describes how meridian-tools integrates with Google Meridian, the boundaries of that integration, and the risks associated with different coupling levels.

Integration philosophy

meridian-tools wraps Meridian without forking it. Meridian remains the modelling engine; meridian-tools adds workflow orchestration, validation, diagnostics bundling, model selection, and lifecycle management on top.

This approach means:

Meridian upgrades can be adopted without merging fork changes.
The upstream project’s API stability directly affects meridian-tools.
Any use of Meridian-internal APIs must be explicitly managed.

Coupling levels

Public API (low risk)

These are documented, versioned Meridian surfaces:

Surface	Used by
`Meridian` (model class)	`runner.py`
`ModelSpec`	`runner.py`
`CsvDataLoader`, `CoordToColumns`	`runner.py`
`Analyzer`	`exports.py`, `diagnostics.py`
`Summarizer`	`exports.py`
`BudgetOptimizer`	`exports.py`
`ModelReviewer`	`diagnostics.py`
`MediaEffects`, `MediaSummary`, `ModelDiagnostics`, `ModelFit`	`exports.py`
`save_meridian` (schema serde)	`exports.py`

These are unlikely to break without a Meridian major version bump. The exact google-meridian[schema]==1.5.3 pin keeps these assumptions aligned with the validated release baseline.

Semi-public API (medium risk)

These are accessible attributes on Meridian model objects that are used but not formally documented as stable:

Surface	Used by	Purpose
`model.inference_data`	`log_likelihood.py`, `model_selection.py`	Access ArviZ InferenceData
`model.model_context`	`log_likelihood.py`, `exports.py`	Access model structure
`model.input_data`	`exports.py`	Access input data for spend computation
`model.posterior_sampler_callable`	`log_likelihood.py`	Access posterior sampler

These are stable in practice (they are used by Meridian’s own analysis surfaces) but are not guaranteed to be stable across versions.

Private API (high risk)

These are _-prefixed methods on Meridian’s posterior_sampler_callable, used exclusively in log_likelihood.py for log-likelihood reconstruction:

_get_joint_dist_unpinned
_prepare_latents_for_reconstruction
_reconstruct_posteriors

These methods are Meridian-internal and may change or be removed in any Meridian release, including patch versions. They are necessary because Meridian does not provide a public API for pointwise log-likelihood computation.

Risk mitigation

Compatibility guard

log_likelihood.py checks for the presence of all three private methods before attempting reconstruction:

required_sampler_methods = (
    "_get_joint_dist_unpinned",
    "_prepare_latents_for_reconstruction",
    "_reconstruct_posteriors",
)
if any(not hasattr(posterior_sampler, method) for method in required_sampler_methods):
    raise ModelSelectionError(
        "...",
        reason_code="meridian_internal_seam_incompatible",
    )

If any method is missing, the error is caught and recorded as a model_selection_status.json artefact with reason_code: meridian_internal_seam_incompatible. The rest of the pipeline continues normally.

Graceful degradation

Model selection incompatibility is non-fatal at every level:

log_likelihood.py raises ModelSelectionError with a structured code.
model_selection.py propagates the error.
runner.py catches it, writes model_selection_status.json, and continues.
The manifest records the assessment stage as completed.
The lifecycle layer can inspect model_selection_status to understand why model selection was unavailable.

Version pinning

The pyproject.toml pins Meridian to google-meridian[schema]==1.5.3 and constrains protobuf to >=5.28.0,<7 for Meridian schema serialisation. Any Meridian or protobuf-bound upgrade must refresh the private log-likelihood reconstruction and schema-save baselines before the guards are relaxed.

Integration testing

The test suite includes a gated live Meridian verification command:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v

This command exercises two different real seams:

one reduced real pipeline run over bundled demo data, including stored-run refresh after the original YAML is removed
the lower-level live log-likelihood reconstruction path

It is excluded from the default test suite because it requires real MCMC sampling, but it should be run after every Meridian version upgrade.

Constants dependency

log_likelihood.py uses Meridian constants for posterior parameter names:

from meridian import constants
# constants.BETA_GM, constants.TAU_G, constants.ETA_M, etc.

These are stable string constants but are not versioned. A Meridian release that renames these constants would cause import-time failures.

Unsaved posterior parameter recovery

Meridian does not persist all posterior parameters to InferenceData. The _recover_unsaved_state function in log_likelihood.py reconstructs:

tau_g_excl_baseline — Recovered from the posterior’s tau_g variable by slicing out the baseline geo index (concatenating the elements before and after baseline_geo_idx).
Geo deviations — Recovered from the posterior by solving deviation = (target - base) / scale for normal effects, or deviation = (log(target) - base) / scale for log-normal effects, with a scale == 0 guard that maps to zero.

This recovery is mathematically correct for the supported model families (log-normal and normal media effects). It is tested against both geo-panel and national models in test_log_likelihood.py.
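
The two recovery rules can be sketched on plain Python values. This illustrates the arithmetic only and is not the _recover_unsaved_state implementation, which operates on posterior arrays. The function names are assumptions, and target, base, and scale follow the wording of the formulas above.

```python
import math


def drop_baseline(tau_g: list[float], baseline_geo_idx: int) -> list[float]:
    """Concatenate the elements before and after the baseline geo index."""
    return tau_g[:baseline_geo_idx] + tau_g[baseline_geo_idx + 1:]


def recover_deviation(
    target: float, base: float, scale: float, *, log_normal: bool
) -> float:
    """Solve for the deviation, mapping a zero scale to zero."""
    if scale == 0:
        return 0.0  # guard: avoid dividing by zero
    value = math.log(target) if log_normal else target
    return (value - base) / scale


tau_g_excl_baseline = drop_baseline([0.1, 0.0, 0.3], baseline_geo_idx=1)
normal_deviation = recover_deviation(5.0, 3.0, 2.0, log_normal=False)
log_normal_deviation = recover_deviation(math.exp(1.5), 0.5, 2.0, log_normal=True)
zero_scale_deviation = recover_deviation(5.0, 3.0, 0.0, log_normal=False)
```

For the normal case, (5.0 - 3.0) / 2.0 gives a deviation of 1.0. For the log-normal case, the target is logged first, so (1.5 - 0.5) / 2.0 gives 0.5.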

What breaks on a Meridian upgrade

Change type	Impact	Detection
Public API signature change	`runner.py`, `exports.py` break	Default test suite
Semi-public attribute rename	`log_likelihood.py`, `exports.py` break	Default test suite
Private method removal/rename	Model selection disabled	Live smoke test or `model_selection_status.json`
Constant rename	Import-time failure	Default test suite
New posterior parameter	Log-likelihood may be incorrect	Manual review + live smoke test
Changed likelihood formula	Log-likelihood may be incorrect	Live smoke test

Recommended upgrade procedure

Pin the new Meridian version in a branch.
Run the canonical verification gate: python scripts/verify_release.py.
Run the live Meridian verification command: MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v.
If model selection breaks, check model_selection_status.json for the reason code.
If private methods changed, update log_likelihood.py to match the new Meridian internals or accept graceful degradation.
Update docs/project/release-baseline.md with the new verified state.