Meridian Tools documentation

Companion tooling for Google Meridian MMM workflows. This documentation covers installation, configuration, validation strategies, model selection, lifecycle management, and the full API reference.

Getting started

Installation — prerequisites, install, and verification
Quickstart — first run in under five minutes

Guides

Task-oriented how-to guides for common workflows.

Configuration — authoring YAML configs
Validation — blocked-tail, rolling-origin, and authored-holdout strategies
Model selection — LOO, WAIC, and model comparison
Lifecycle — loading, comparing, and refreshing runs
Workflow — end-to-end YAML-to-artefact workflow
Demos — bundled demo workflows and output schema
Troubleshooting — common issues and solutions

Reference

Lookup-oriented documentation for precise details.

CLI reference — meridian-tools run and meridian-tools demo
YAML schema — complete field-level config reference
Manifest schema — run_manifest.json structure
Output schema — run directory layout and artefact inventory
Validation spec schema — validation_spec.json structure

Python API

meridian_tools.config — configuration models
meridian_tools.runner — pipeline orchestration
meridian_tools.cv — validation splits
meridian_tools.exports — artefact export
meridian_tools.diagnostics — diagnostics extraction
meridian_tools.model_selection — LOO and WAIC
meridian_tools.log_likelihood — log-likelihood reconstruction
meridian_tools.lifecycle — run record management
meridian_tools.artifacts — manifest and JSON helpers

Concepts

Background explanations for architecture and design choices.

Why meridian-tools exists — motivation, agency workflow rationale, and Meridian boundary
Adoption brief — short decision note for agency teams
Architecture — module map, layered imports, pipeline model
Design decisions — why things are built this way
Meridian integration — coupling boundaries and upgrade risks

Project

Contributor and governance documentation.

Contributing — dev setup, conventions, workflow
Acceptance checklist — local acceptance gate
Release baseline — current milestone state
Changelog — version history

Getting started

Install meridian-tools, run a demo, and get to the first staged output quickly.

Installation

Prerequisites

meridian-tools requires Python 3.11 or later and a working installation of Google Meridian with schema support.

Install Meridian first

Meridian is the upstream modelling engine. Install it before meridian-tools:

pip install "google-meridian[schema]==1.5.3"

If you are working from a local Meridian checkout:

pip install -e "/path/to/meridian[schema]"

Verify the install:

from meridian import version
print(version.__version__)

Install meridian-tools

From the source tree (recommended for development)

cd /path/to/meridian-tools
python -m pip install -c constraints/dev.txt -e ".[dev]"

The constraints file pins the supported development environment, including the Meridian-compatible ArviZ and Matplotlib release lines.

Editable install without dev extras

pip install -e .

Verify the install

meridian-tools --help

You should see the CLI help output listing the run and demo subcommands. This command is deliberately lightweight — it does not import TensorFlow, NumPy, or Meridian.

You can also verify in Python:

import meridian_tools
print(meridian_tools.__version__)

Runtime dependencies

The following are declared in pyproject.toml and installed automatically:

Package	Version bound
`google-meridian[schema]`	`==1.5.3`
`arviz`	`>=0.18.0, <0.20.0`
`pandas`	`>=2.2.0, <3`
`pydantic`	`>=2.8.0, <3`
`protobuf`	`>=5.28.0, <7`
`PyYAML`	`>=6.0.0, <7`
`vl-convert-python`	`>=1.7.0, <2`

TensorFlow is not a direct dependency of meridian-tools. It comes transitively through google-meridian.

Development extras

python -m pip install -c constraints/dev.txt -e ".[dev]"

This adds:

Package	Purpose
`pytest`	Test runner
`pytest-cov`	Coverage reporting
`ruff`	Linter and formatter
`mypy`	Static type checker

Troubleshooting

If meridian-tools --help fails with an import error, check that:

You are in the correct virtual environment.
Meridian is installed with the [schema] extra.
Your Python version is 3.11 or later (check with python --version).

If pip install -e . fails, ensure setuptools>=68.0.0 is available:

pip install --upgrade setuptools

See the troubleshooting guide for more common issues.
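The checklist above can also be run from Python. The following is a minimal sketch using only the standard library; the helper name check_environment is hypothetical and is not part of meridian-tools. It checks the interpreter version and whether the two top-level modules can be located, without importing them.

```python
import importlib.util
import sys


def check_environment(modules=("meridian", "meridian_tools"), min_python=(3, 11)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {found} is older than the required {wanted}")
    for name in modules:
        # find_spec locates a top-level module without importing it,
        # so heavy dependencies are not loaded by this check.
        if importlib.util.find_spec(name) is None:
            problems.append(f"module {name!r} is not importable in this environment")
    return problems
```

Calling check_environment() inside the virtual environment you intend to use and printing the returned list gives a quick first diagnosis. It does not verify that Meridian was installed with the [schema] extra.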

Quickstart

This guide takes you from a fresh install to your first completed run in under five minutes using the bundled demo data.

1. Run a bundled demo

List the available demos:

meridian-tools demo --list

Output:

timeseries
geo_panel

Run the timeseries demo:

meridian-tools demo timeseries

When run from the source checkout, this creates a dated run directory under runs/demos/. When run from an installed package, the default output root is ./runs/demos/ relative to your current working directory. Each demo produces a full staged output layout.

2. Inspect the run directory

After the demo completes, find the created run directory:

ls runs/demos/

You will see a directory like demo-timeseries_20260402_073500/. The name comes from the demo’s project.name (demo-timeseries) plus a timestamp. The bundled demos now default to full-sample fits, so LOO and WAIC outputs are available in the assessment stage by default. Inside:

demo-timeseries_20260402_073500/
  run_manifest.json
  00_run_metadata/
    config.source.yaml
    config.resolved.yaml
  20_model_fit/
    meridian_model.binpb
    fit_metadata.json
  30_model_assessment/
    diagnostics_bundle.json
    model_results_summary.html
    loo_summary.json
    waic_summary.json
  40_decomposition/
    summary_metrics.csv
    summary_metrics.nc
  60_response_curves/
    response_curves.csv
    response_curves.nc
  70_optimisation/
    optimisation_summary.html
    optimised_data.csv

3. Read the key outputs

Start with the manifest:

cat runs/demos/demo-timeseries_*/run_manifest.json | python -m json.tool | head -20

Check the diagnostics bundle:

cat runs/demos/demo-timeseries_*/30_model_assessment/diagnostics_bundle.json | python -m json.tool

View the model results summary by opening the HTML file in your browser:

# Linux
xdg-open runs/demos/demo-timeseries_*/30_model_assessment/model_results_summary.html

# macOS
open runs/demos/demo-timeseries_*/30_model_assessment/model_results_summary.html

Inspect the decomposition CSV:

head runs/demos/demo-timeseries_*/40_decomposition/summary_metrics.csv
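If you prefer Python to shell globbing, the sketch below finds the newest matching run directory and parses its manifest. The helper names are hypothetical, and the sketch assumes that run directory names end in a sortable timestamp like the one in the example above. It makes no assumption about the keys inside run_manifest.json.

```python
import json
from pathlib import Path


def latest_run_dir(root, prefix):
    """Return the newest run directory under root whose name starts with prefix, or None."""
    candidates = sorted(
        path for path in Path(root).iterdir()
        if path.is_dir() and path.name.startswith(prefix)
    )
    # Timestamps of the form YYYYMMDD_HHMMSS sort chronologically as plain strings.
    return candidates[-1] if candidates else None


def load_manifest(run_dir):
    """Parse run_manifest.json from a run directory into a dictionary."""
    with open(Path(run_dir) / "run_manifest.json", encoding="utf-8") as handle:
        return json.load(handle)
```

For example, latest_run_dir("runs/demos", "demo-timeseries_") returns the most recent timeseries demo run, which avoids concatenating several manifests when more than one directory matches a glob.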

4. Run your own config

Start from one of the repository templates if you are working from the source checkout:

cp templates/standard.yml project.yml

Then update data.path and the column names for your dataset. The repository also includes minimal.yml, media-priors.yml, and validation-blocked-tail.yml examples under templates/.

Or create a YAML config file manually (e.g. project.yml):

project:
  name: my-first-run

data:
  path: ./my_data.csv
  kpi_type: revenue
  coord_to_columns:
    time: week
    geo: market
    kpi: revenue
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]

fit:
  n_chains: 4
  n_adapt: 500
  n_burnin: 500
  n_keep: 1000
  seed: 42

validation:
  strategy: blocked_tail
  holdout_size: 8

exports:
  export_predictive_accuracy: true
  export_review_summary: true
  export_model_selection: true

Run it:

meridian-tools run --config project.yml --output-dir runs

5. Next steps

Read the configuration guide to understand every YAML section.
Read the validation guide to choose the right validation strategy.
Read the workflow guide for the full end-to-end agency workflow.
Read the demo guide for more detail on the bundled reference workflows.

Guides

Task-oriented workflow documentation for configuration, validation, demos, lifecycle, and troubleshooting.

Pages

Configuration guide — meridian-tools is driven by one YAML configuration file. This guide explains every section, its purpose, and its constraints. For a field-level schema reference, see yaml-schema.md.
Validation guide — This guide explains how to choose and configure validation strategies in meridian-tools. Validation is the process of evaluating a candidate model specification on held-out data before committing to a final production fit.
Model selection guide — This guide explains how meridian-tools supports Bayesian model selection using Leave-One-Out (LOO) cross-validation and the Watanabe-Akaike Information Criterion (WAIC). It covers when model selection is available, how to interpret the outputs, and how to compare multiple candidate models.
Lifecycle management guide — meridian-tools treats completed runs as immutable artefacts. The lifecycle module provides tools to load, compare, and refresh past runs without mutating them. This guide explains each lifecycle operation and when to use it.
Meridian Tools workflow guide — This guide shows the supported end-to-end agency workflow for meridian-tools. It starts with one YAML config, moves through candidate validation, separates the final full-sample fit from the validation runs, and ends with the artefacts you should hand over or inspect later. The examples in this guide stay inside the implemented package surface. They do not assume notebooks, dashboards, or unpublished helper scripts.
Meridian Tools demo guide — This is the canonical guide to the bundled meridian-tools demos. Use it when you want one safe, reproducible, end-to-end example without client data.
Troubleshooting — Common issues and solutions when working with meridian-tools.

Configuration guide

meridian-tools is driven by one YAML configuration file. This guide explains every section, its purpose, and its constraints. For a field-level schema reference, see yaml-schema.md.

Configuration philosophy

The YAML file owns the authored project definition: project metadata, data paths, model specification, fit settings, validation strategy, and export switches. Runtime-only values — output_dir, run_name, and concrete validation_spec — belong in PipelineRunConfig or CLI flags, not in the YAML file. This separation ensures that the same YAML file can drive multiple runs with different runtime options while remaining reproducible.

Minimal valid config

project:
  name: my-project

data:
  path: ./data.csv
  coord_to_columns:
    time: week

This is the smallest config that will pass validation. It uses defaults for everything else: no validation, all exports enabled, no response curves, no optimisation.

Templates

The repository includes commented starter configs under templates/. These templates all use the same canonical schema; they differ only in which optional sections are authored.

minimal.yml — smallest valid config.
standard.yml — typical model run with common data, model, fit, validation, export, and response-curve settings.
media-priors.yml — standard run with YAML-driven media priors.
validation-blocked-tail.yml — standard run with blocked-tail validation.

Copy the closest template into your project directory, update data.path and the authored column names, then run it with meridian-tools run --config.

Section reference

`project`

Top-level project metadata.

project:
  name: client-mmm        # Default: "meridian-project"

name — Human-readable project name. Used as the base for run directory names unless overridden by --run-name at runtime.

`data`

CSV data loader configuration. Maps directly to Meridian’s CsvDataLoader.

data:
  path: ./client_dataset.csv
  kpi_type: revenue                    # "revenue" (default) or "non-revenue"
  coord_to_columns:
    time: week
    geo: market                        # optional for national models
    kpi: revenue
    population: population
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]
    controls: [promo_flag, price_index]
  media_to_channel: null               # optional channel mapping overrides
  media_spend_to_channel: null
  reach_to_channel: null
  frequency_to_channel: null
  rf_spend_to_channel: null
  organic_reach_to_channel: null
  organic_frequency_to_channel: null

path — Path to the CSV data file. Relative paths are resolved against the directory containing the YAML config file, not the current working directory.
kpi_type — Either "revenue" or "non-revenue". Controls how Meridian interprets the KPI column.
coord_to_columns — Maps Meridian coordinate names to CSV column names. time is required. geo is optional (omit for national models).

`model_spec`

Raw keyword arguments forwarded to Meridian’s ModelSpec.

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi

kwargs — Dictionary passed through to ModelSpec(**kwargs). Supports any argument that Meridian’s ModelSpec accepts.
Special handling for holdout_id: if present in kwargs, the run is treated as an “authored holdout” validation run. See the validation guide for details.

Custom media priors

The optional priors subsection provides a focused YAML surface for media priors that would otherwise require Python code.

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi
  priors:
    roi_m:
      default:
        distribution: LogNormal
        loc: 0.2
        scale: 0.9
      channels:
        paid_search:
          distribution: TruncatedNormal
          loc: 3.0
          scale: 1.5
          low: 0.0
          high: 6.0
    alpha_m:
      distribution: Beta
      concentration0: 1.0
      concentration1: 2.0

Supported prior parameters are roi_m, mroi_m, and alpha_m. Each accepts either a scalar distribution or a default plus per-channel overrides. Channel override names must match media_to_channel values, not raw CSV column names.

Supported distributions are Normal, LogNormal, TruncatedNormal, and Beta. model_spec.priors and model_spec.kwargs.prior are mutually exclusive.
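To make the "scalar distribution or default plus per-channel overrides" rule concrete, here is a minimal sketch of the lookup using plain dictionaries shaped like the YAML above. The function resolve_channel_prior is hypothetical, and the sketch assumes that a channel override replaces the default entirely rather than being merged with it. The package's actual resolution logic may differ.

```python
def resolve_channel_prior(prior_config, channel):
    """Return the distribution settings that apply to one channel."""
    if "distribution" in prior_config:
        # Scalar form: one distribution shared by every channel.
        return dict(prior_config)
    overrides = prior_config.get("channels") or {}
    if channel in overrides:
        # Assumption: an override replaces the default instead of merging with it.
        return dict(overrides[channel])
    return dict(prior_config["default"])


roi_m = {
    "default": {"distribution": "LogNormal", "loc": 0.2, "scale": 0.9},
    "channels": {
        "paid_search": {
            "distribution": "TruncatedNormal",
            "loc": 3.0, "scale": 1.5, "low": 0.0, "high": 6.0,
        },
    },
}
alpha_m = {"distribution": "Beta", "concentration0": 1.0, "concentration1": 2.0}
```

With these inputs, paid_search resolves to the TruncatedNormal override, any other channel resolves to the LogNormal default, and alpha_m resolves to the same Beta distribution for every channel.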

`fit`

Sampling configuration for Meridian posterior fitting.

fit:
  sample_prior_draws: null     # Optional prior-only sampling
  n_chains: 4                  # Number of MCMC chains
  n_adapt: 500                 # Adaptation steps per chain
  n_burnin: 500                # Burn-in steps per chain
  n_keep: 1000                 # Posterior samples to keep per chain
  seed: 20260331               # Reproducibility seed (int, list[int], or null)
  max_tree_depth: 10           # NUTS max tree depth
  max_energy_diff: 500.0       # NUTS max energy difference
  unrolled_leapfrog_steps: 1   # NUTS leapfrog steps
  parallel_iterations: 10      # TF parallel iterations

All fields have sensible defaults. Override only what you need.

seed — Accepts a single integer, a list of integers (one per chain), or null for non-deterministic sampling.
sample_prior_draws — If set, prior predictive samples are drawn before posterior sampling. This is optional and primarily for model diagnostics.

`validation`

Validation and holdout orchestration settings. See the validation guide for strategy selection advice.

# Option 1: No validation (default)
validation:
  strategy: none

# Option 2: Blocked tail
validation:
  strategy: blocked_tail
  holdout_size: 8

# Option 3: Rolling origin
validation:
  strategy: rolling_origin
  initial_train_size: 52
  test_size: 4
  step_size: 4          # Must equal test_size
  max_splits: 3         # At least 2

strategy — One of "none", "blocked_tail", or "rolling_origin".
holdout_size — Required for blocked_tail. Number of time periods to hold out from the end of the series.
initial_train_size, test_size — Required for rolling_origin.
step_size — Optional for rolling_origin. Must equal test_size if set. Defaults to test_size.
max_splits — Optional for rolling_origin. Must be at least 2.

Validation rules:

blocked_tail rejects rolling-origin parameters.
rolling_origin rejects holdout_size.
none rejects all holdout and rolling-origin parameters.
Legacy holdout_size without explicit strategy is rejected.
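The rules above can be read as a small decision table. The sketch below restates them over a plain dictionary. It is an illustration of the documented rules under the stated defaults, not the package's Pydantic validator, and the function name check_validation_section is hypothetical.

```python
ROLLING_KEYS = ("initial_train_size", "test_size", "step_size", "max_splits")


def check_validation_section(section):
    """Return rule violations for a validation mapping; an empty list means it passes."""
    authored = {key for key, value in section.items() if value is not None}
    rolling = [key for key in ROLLING_KEYS if key in authored]
    strategy = section.get("strategy")
    if strategy is None:
        if "holdout_size" in authored:
            return ["legacy holdout_size without an explicit strategy is rejected"]
        strategy = "none"  # documented default
    if strategy not in ("none", "blocked_tail", "rolling_origin"):
        return [f"unknown strategy: {strategy!r}"]

    errors = []
    if strategy == "none":
        extras = [key for key in ("holdout_size", *ROLLING_KEYS) if key in authored]
        if extras:
            errors.append(f"strategy none rejects: {', '.join(extras)}")
    elif strategy == "blocked_tail":
        if "holdout_size" not in authored:
            errors.append("blocked_tail requires holdout_size")
        if rolling:
            errors.append(f"blocked_tail rejects: {', '.join(rolling)}")
    else:  # rolling_origin
        if "holdout_size" in authored:
            errors.append("rolling_origin rejects holdout_size")
        for key in ("initial_train_size", "test_size"):
            if key not in authored:
                errors.append(f"rolling_origin requires {key}")
        step, test = section.get("step_size"), section.get("test_size")
        if step is not None and test is not None and step != test:
            errors.append("step_size must equal test_size")
        max_splits = section.get("max_splits")
        if max_splits is not None and max_splits < 2:
            errors.append("max_splits must be at least 2")
    return errors
```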

`exports`

Output switches for diagnostics and model-selection artefacts.

exports:
  use_kpi: false                       # Use KPI-based metrics
  batch_size: 1000                     # Batch size for Meridian analysis
  export_predictive_accuracy: true     # Write predictive_accuracy.csv
  export_review_summary: true          # Write review_summary.json
  export_model_selection: true         # Write LOO/WAIC outputs
  export_plots: true                   # Write PNG plot artefacts

All fields have defaults. If the entire exports section is omitted, all exports are enabled with default settings.

`response_curves`

Optional. If omitted, the response curves stage is skipped.

response_curves:
  spend_multipliers: [0.0, 0.5, 1.0, 1.5, 2.0]
  use_posterior: true
  by_reach: true
  use_optimal_frequency: false
  confidence_level: 0.9

spend_multipliers — Required. Non-empty list of non-negative floats.
confidence_level — Must be strictly between 0 and 1.

`optimisation`

Optional. If omitted, the optimisation stage is skipped.

optimisation:
  start_date: "2025-01-01"
  end_date: "2025-12-31"
  budget:
    mode: fixed_total                  # or "relative_reference_window_total"
    value: 1000000.0
  use_posterior: true
  use_optimal_frequency: true
  confidence_level: 0.9

start_date, end_date — ISO format YYYY-MM-DD. end_date must be on or after start_date.
budget.mode — Either "fixed_total" (absolute budget) or "relative_reference_window_total" (multiplier against the reference window’s total spend).
budget.value — Positive float. For fixed_total, this is the absolute budget. For relative_reference_window_total, this is a multiplier (e.g. 1.1 means 110% of the reference window total).
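The two budget modes differ only in how budget.value is interpreted. As a minimal sketch, assuming the reference window's total spend is already known (the reference_window_spend argument and the function name are hypothetical), the arithmetic looks like this:

```python
def resolve_total_budget(mode, value, reference_window_spend=None):
    """Turn an authored budget block into an absolute total budget."""
    if value <= 0:
        raise ValueError("budget.value must be positive")
    if mode == "fixed_total":
        # The authored value is already the absolute budget.
        return float(value)
    if mode == "relative_reference_window_total":
        if reference_window_spend is None:
            raise ValueError("a reference window total is needed for relative budgets")
        # The authored value is a multiplier, e.g. 1.1 means 110% of the reference total.
        return float(value) * float(reference_window_spend)
    raise ValueError(f"unknown budget.mode: {mode!r}")
```

For example, a multiplier of 1.1 against a reference window total of 2,000,000 gives a budget of roughly 2,200,000.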

Validation strictness

All configuration models use Pydantic’s extra="forbid" mode. Any unexpected key in the YAML file will produce a clear validation error. This prevents silent misconfiguration from typos or outdated keys.

$ meridian-tools run --config bad.yml
# pydantic.ValidationError: 1 validation error for MeridianToolsConfig
# exports -> export_pridictive_accuracy
#   Extra inputs are not permitted

Path resolution

Relative paths in data.path are resolved against the directory containing the YAML config file, not the current working directory. This means:

# If config is at /workspace/configs/project.yml
data:
  path: ../inputs/weekly.csv
# Resolves to /workspace/inputs/weekly.csv

The resolved path is written to config.resolved.yaml in the run directory. The original authored path is preserved in config.source.yaml.
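The resolution rule can be sketched in a few lines. This is an illustration rather than the package's code: the function name is hypothetical, and it uses pure POSIX paths with lexical normalisation so that the example does not touch the filesystem. A real implementation may also resolve symlinks.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_data_path(config_path, data_path):
    """Resolve data.path against the directory that contains the YAML config."""
    data_path = PurePosixPath(data_path)
    if data_path.is_absolute():
        # Absolute paths are used as authored.
        return data_path
    base = PurePosixPath(config_path).parent
    # normpath collapses ".." segments lexically, without touching the filesystem.
    return PurePosixPath(posixpath.normpath(str(base / data_path)))
```

Applied to the example above, resolve_data_path("/workspace/configs/project.yml", "../inputs/weekly.csv") yields /workspace/inputs/weekly.csv regardless of the current working directory.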

Wrapper-owned preflight

Before meridian-tools creates a dated run directory, it performs one narrow wrapper-owned preflight check on the authored config and the resolved input CSV. This boundary is kept intentionally small so that the wrapper does not become a second Meridian schema layer.

The wrapper checks exactly:

the resolved data.path exists and is a regular file
the CSV header row can be read
the parsed header is non-empty
no parsed header cell is blank after trimming whitespace
every authored scalar entry in data.coord_to_columns exists in the header
every authored list member in data.coord_to_columns exists in the header
every authored key in data.media_to_channel exists in the header
every authored key in data.media_spend_to_channel exists in the header
every authored key in data.reach_to_channel exists in the header
every authored key in data.frequency_to_channel exists in the header
every authored key in data.rf_spend_to_channel exists in the header
every authored key in data.organic_reach_to_channel exists in the header
every authored key in data.organic_frequency_to_channel exists in the header
authored list-valued coord families are non-empty
authored mapping fields above are non-empty
coord_to_columns.media and media_to_channel are authored together
coord_to_columns.media_spend and media_spend_to_channel are authored together
coord_to_columns.reach, coord_to_columns.frequency, reach_to_channel, and frequency_to_channel are authored together
coord_to_columns.rf_spend and rf_spend_to_channel are authored together
coord_to_columns.organic_reach and organic_reach_to_channel are authored together
coord_to_columns.organic_frequency and organic_frequency_to_channel are authored together

Matching is exact and case-sensitive. The wrapper does not normalise headers, apply aliases, or use fuzzy matching.
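As an illustration of the header-matching part of this check, the sketch below reads only the first CSV row and reports authored column names that are missing. It covers coord_to_columns entries only, the function name is hypothetical, and it is not the wrapper's actual preflight code.

```python
import csv
import io


def missing_header_columns(csv_stream, coord_to_columns):
    """Return authored column names that are absent from the CSV header row."""
    header = next(csv.reader(csv_stream), [])
    if not header:
        raise ValueError("CSV header is empty")
    if any(not cell.strip() for cell in header):
        raise ValueError("CSV header contains a blank cell")
    authored = []
    for value in coord_to_columns.values():
        # Scalar entries name one column; list entries name several.
        authored.extend(value if isinstance(value, list) else [value])
    # Exact, case-sensitive comparison: no normalisation or aliases.
    present = set(header)
    return [name for name in authored if name not in present]
```

Because the comparison is exact, a config that authors Spend_Search against a header containing spend_search is reported as missing that column.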

What remains Meridian-owned:

deep ModelSpec semantics
fit-dependent tensor or shape constraints
statistical validity checks that depend on model construction or sampling

The preflight therefore surfaces obvious wrapper-detectable mistakes earlier, but it does not promise to catch everything Meridian may reject later.

Full example

project:
  name: client-mmm

data:
  path: ./client_dataset.csv
  kpi_type: revenue
  coord_to_columns:
    time: week
    geo: market
    kpi: revenue
    population: population
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]
    controls: [promo_flag, price_index]

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi

fit:
  n_chains: 4
  n_adapt: 500
  n_burnin: 500
  n_keep: 1000
  seed: 20260331

validation:
  strategy: blocked_tail
  holdout_size: 8

exports:
  export_predictive_accuracy: true
  export_review_summary: true
  export_model_selection: true

response_curves:
  spend_multipliers: [0.0, 0.5, 1.0, 1.5, 2.0]
  use_posterior: true
  by_reach: true
  use_optimal_frequency: false
  confidence_level: 0.9

optimisation:
  start_date: "2025-01-01"
  end_date: "2025-12-31"
  budget:
    mode: fixed_total
    value: 1000000.0
  use_posterior: true
  use_optimal_frequency: true
  confidence_level: 0.9

Validation guide

This guide explains how to choose and configure validation strategies in meridian-tools. Validation is the process of evaluating a candidate model specification on held-out data before committing to a final production fit.

Why validation matters for MMM

Marketing Mix Models are fitted to time series data. Unlike standard supervised learning, the temporal structure of the data means that naive IID cross-validation (random train/test splits) is statistically inappropriate. meridian-tools does not implement random shuffling or naive k-fold splits. Instead, it provides two time-respecting validation strategies and a clear separation between validation runs and the final production fit.

Validation strategies

`none` — No validation

validation:
  strategy: none

The model is fitted on the full dataset with no holdout. Use this when you do not need candidate evaluation — for example, when rerunning a previously validated specification.

`blocked_tail` — Single contiguous tail holdout

validation:
  strategy: blocked_tail
  holdout_size: 8

Reserves the last holdout_size time periods as a test block. The model is fitted on all preceding periods. This is the recommended default for short MMM time series where you want one simple candidate evaluation.

When to use: Most standard MMM projects with fewer than 150 weekly observations.

How it works:

Time axis: [t1, t2, t3, t4, t5, t6, t7, t8, t9, t10]
holdout_size: 3

Train: [t1, t2, t3, t4, t5, t6, t7]
Test:  [t8, t9, t10]

The holdout mask is generated automatically and injected into Meridian’s holdout_id parameter. For geo-panel models, the mask is broadcast across all geos.
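The mask construction described above can be sketched with plain lists. This is a simplified illustration, assuming True marks a held-out period and that a geo-panel mask is the same row repeated once per geo. The function name is hypothetical.

```python
def blocked_tail_mask(n_times, holdout_size, n_geos=None):
    """Build a boolean holdout mask that reserves the last holdout_size periods."""
    if not 0 < holdout_size < n_times:
        raise ValueError("holdout_size must be positive and smaller than the time axis")
    # False = training period, True = held-out period.
    row = [False] * (n_times - holdout_size) + [True] * holdout_size
    if n_geos is None:
        return row  # national model: one mask over the time axis
    # Geo-panel model: broadcast the same time mask across every geo.
    return [list(row) for _ in range(n_geos)]
```

For the ten-period example above with holdout_size 3, the mask is seven False values followed by three True values.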

`rolling_origin` — Expanding-window validation

validation:
  strategy: rolling_origin
  initial_train_size: 52
  test_size: 4
  step_size: 4
  max_splits: 3

Creates multiple expanding-window splits where each successive split adds more training data. This provides a more robust evaluation signal than a single blocked tail, but requires enough history to support multiple splits.

When to use: Projects with longer time series (typically 100+ weekly observations) where you want multiple evaluation windows.

How it works:

Time axis: [t1, t2, ..., t52, t53, ..., t56, t57, ..., t60]

Split 1: Train [t1..t52], Test [t53..t56]
Split 2: Train [t1..t56], Test [t57..t60]

Constraints:

step_size must equal test_size (non-overlapping test windows).
max_splits must be at least 2.
initial_train_size + test_size must not exceed the number of observations.
The plan must yield at least two splits.
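The expanding-window scheme and its constraints can be sketched as follows. This is an illustration of the documented behaviour, not the implementation behind build_validation_plan. The function name is hypothetical, and the sketch assumes that max_splits keeps the earliest splits.

```python
def rolling_origin_splits(n_times, initial_train_size, test_size,
                          step_size=None, max_splits=None):
    """Return (train_indices, test_indices) ranges for expanding-window splits."""
    step_size = test_size if step_size is None else step_size
    if step_size != test_size:
        raise ValueError("step_size must equal test_size")
    if max_splits is not None and max_splits < 2:
        raise ValueError("max_splits must be at least 2")
    splits = []
    train_end = initial_train_size
    # Keep adding splits while a full test window still fits on the time axis.
    while train_end + test_size <= n_times:
        if max_splits is not None and len(splits) == max_splits:
            break
        splits.append((range(0, train_end), range(train_end, train_end + test_size)))
        train_end += step_size  # the training window expands by one test block
    if len(splits) < 2:
        raise ValueError("the plan must yield at least two splits")
    return splits
```

With 60 periods, initial_train_size 52, and test_size 4, this produces the two splits shown above: training on indices 0 to 51 with a test window of 52 to 55, then training on 0 to 55 with a test window of 56 to 59.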

`authored_holdout` — User-provided holdout mask

This is not a YAML strategy setting. Instead, you provide holdout_id directly in model_spec.kwargs:

model_spec:
  kwargs:
    holdout_id: [false, false, false, true, true]

When the runner detects an authored holdout_id in the YAML, it treats the run as an authored_holdout validation run. The mask is passed through to Meridian verbatim and recorded in the validation spec artefact.

When to use: When you need a specific holdout pattern that does not follow blocked-tail or rolling-origin conventions.
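Writing a long boolean list by hand is error-prone. A small helper can generate the mask from the time index and the periods you want to hold out, and the result can then be pasted into model_spec.kwargs.holdout_id. The sketch below is a hypothetical convenience, not a meridian-tools function, and it covers the national (one-dimensional) case only.

```python
def holdout_mask_for_periods(time_index, held_out_periods):
    """Return one boolean per period in time_index; True marks a held-out period."""
    held_out = set(held_out_periods)
    unknown = held_out - set(time_index)
    if unknown:
        # Fail loudly instead of silently ignoring a mistyped period label.
        raise ValueError(f"periods not found in the time index: {sorted(unknown)}")
    return [period in held_out for period in time_index]
```

For a five-period index with the last two periods held out, this returns the same pattern as the YAML example above.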

CLI vs Python API

Blocked tail from the CLI

blocked_tail runs directly from the CLI because it produces a single run:

meridian-tools run --config project.yml --output-dir runs

Rolling origin requires the Python API

rolling_origin is a Python-first planning surface because it produces multiple runs — one per split plus a final fit. The CLI will reject direct rolling_origin execution:

# This will fail:
meridian-tools run --config project.yml  # with strategy: rolling_origin
# ValueError: cannot execute `rolling_origin` directly

Instead, use the Python API:

from pathlib import Path

import pandas as pd

from meridian_tools.config import PipelineRunConfig, load_yaml_config
from meridian_tools.cv import build_validation_plan
from meridian_tools.runner import run_pipeline

config_path = Path("project.yml")
config = load_yaml_config(config_path)

# Read the time index from your data
data_path = Path(config.data.path)
if not data_path.is_absolute():
    data_path = (config_path.parent / data_path).resolve()

frame = pd.read_csv(data_path)
time_column = config.data.coord_to_columns["time"]
geo_column = config.data.coord_to_columns.get("geo")

time_index = frame[time_column].drop_duplicates().tolist()
geo_index = None
if geo_column is not None:
    geo_index = frame[geo_column].drop_duplicates().tolist()

# Build the validation plan
validation_plan = build_validation_plan(
    config.validation,
    time_index=time_index,
    geo_index=geo_index,
)

# Execute each validation split. Each split fits only through the end of its
# own test window; future observations are excluded from that fit.
for run_spec in validation_plan.validation_runs:
    run_pipeline(
        PipelineRunConfig(
            config_path=config_path,
            output_dir=Path("runs"),
            validation_spec=run_spec,
        )
    )

Separating validation from the final fit

Validation runs and the final production fit are different jobs. First you evaluate candidate specifications on held-out splits. Then, once you have chosen the specification, you run a separate full-sample fit with no holdout.

Do not reuse a validation fit as the production artefact. The validation fit was trained on a subset of the data and its posterior reflects that subset.

Final fit after blocked tail

For blocked_tail, build_validation_plan provides a final_fit_run spec:

validation_plan = build_validation_plan(
    config.validation,
    time_index=time_index,
    geo_index=geo_index,
)

# Run the final fit on all data
final_result = run_pipeline(
    PipelineRunConfig(
        config_path=config_path,
        output_dir=Path("runs"),
        validation_spec=validation_plan.final_fit_run,
    )
)

Final fit after rolling origin

The same pattern works for rolling origin:

# After running all validation splits...
final_result = run_pipeline(
    PipelineRunConfig(
        config_path=config_path,
        output_dir=Path("runs"),
        validation_spec=validation_plan.final_fit_run,
    )
)

The final_fit_run spec has mode="final_fit", strategy="none", and holdout_id=None. It trains on the full time axis with no holdout.

Run directory naming

The runner automatically appends a validation-aware suffix to the run name:

Scenario	Run name pattern
No validation	`<project_name>_<timestamp>`
Blocked tail	`<project_name>_blocked_tail_<timestamp>`
Rolling origin split 1	`<project_name>_split_01_<timestamp>`
Final fit	`<project_name>_final_fit_<timestamp>`
Authored holdout	`<project_name>_authored_holdout_<timestamp>`

Override the name with --run-name or PipelineRunConfig(run_name=...).
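The patterns in the table can be reproduced with a short helper. This is a sketch only: the function name is hypothetical, and the timestamp format is inferred from the demo directory name shown in the quickstart (for example 20260402_073500) rather than taken from the runner's source.

```python
from datetime import datetime


def build_run_name(project_name, started_at, suffix=None):
    """Compose <project_name>[_<suffix>]_<timestamp> for a run directory."""
    # Assumed timestamp layout: YYYYMMDD_HHMMSS.
    stamp = started_at.strftime("%Y%m%d_%H%M%S")
    parts = [project_name, suffix, stamp] if suffix else [project_name, stamp]
    return "_".join(parts)
```

For instance, passing suffix="split_01" gives a name of the form <project_name>_split_01_<timestamp>, and omitting the suffix gives the plain no-validation pattern.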

Validation spec artefact

Every validation-aware run writes a validation_spec.json artefact in the 10_validation/ stage directory. This JSON records:

mode — "validation" or "final_fit"
strategy — the validation strategy used
split_label — human-readable split identifier
holdout_source — "generated_validation", "authored_model_spec", or "none"
generated_holdout — whether the holdout mask was auto-generated
holdout_shape — shape of the holdout mask (without the actual data)
train_indices / test_indices — integer indices into the time axis
train_dates / test_dates — corresponding date values
validation_spec_version — current value 2
data_binding — source/execution coordinate fingerprints used to refresh or reject the split safely

The actual holdout mask is not stored in the JSON artefact (it can be large). It is reconstructed and injected into the model at runtime from the stored indices and the bound execution geometry.

For rolling-origin validation, the execution geometry is intentionally bounded: split 1 fits only through split 1’s test window, split 2 fits only through split 2’s test window, and so on. Later source observations are invisible to earlier validation fits.
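To illustrate how a mask can be rebuilt from the stored fields, the sketch below uses test_indices and holdout_shape only. It assumes that holdout_shape is either (n_times,) for national models or (n_geos, n_times) for geo-panel models. That layout, and the function name, are assumptions for illustration, and the sketch ignores the data_binding checks that the real runner performs.

```python
def rebuild_holdout_mask(test_indices, holdout_shape):
    """Rebuild a boolean holdout mask from stored test indices and mask shape."""
    n_times = holdout_shape[-1]  # assumed: time is the last axis
    held_out = set(test_indices)
    if any(index < 0 or index >= n_times for index in held_out):
        raise ValueError("test index outside the time axis")
    row = [index in held_out for index in range(n_times)]
    if len(holdout_shape) == 1:
        return row
    # Assumed geo-panel layout: the same time mask repeated for each geo.
    return [list(row) for _ in range(holdout_shape[0])]
```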

Interaction with model selection

Bayesian model selection (LOO/WAIC) is only available for runs where holdout_id is None — meaning full-sample fitted models and final-fit runs. Validation fits and authored-holdout runs write a model_selection_status.json artefact instead of LOO/WAIC outputs. See the model selection guide for details.

The design intent is that validation and model selection answer different questions. Validation holds out declared time periods before fitting. LOO/WAIC compare compatible full-sample or final-fit candidates using reconstructed pointwise likelihood. Both controls support the same agency requirement: a model choice should be defensible after the notebook is gone. See why meridian-tools exists for the full rationale.

Model selection guide

This guide explains how meridian-tools supports Bayesian model selection using Leave-One-Out (LOO) cross-validation and the Watanabe-Akaike Information Criterion (WAIC). It covers when model selection is available, how to interpret the outputs, and how to compare multiple candidate models.

What model selection provides

Bayesian model selection uses information criteria computed from pointwise log-likelihood values to compare model specifications. Unlike predictive accuracy on a held-out set, LOO and WAIC evaluate the model’s expected predictive performance using the full posterior without requiring a separate validation split.

meridian-tools wraps ArviZ’s az.loo and az.waic with:

Automatic log-likelihood reconstruction for fitted Meridian models
Structured error handling when model selection is not possible
A compare_models surface for ranking multiple candidates
Artefact-level compatibility status in every run directory

For the broader rationale, see why meridian-tools exists. The short version is that model choice needs out-of-sample evidence, not only in-sample fit summaries.
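To show what "computed from pointwise log-likelihood values" means in practice, here is a standard-library sketch of the textbook WAIC estimate. It is not the code used by ArviZ or meridian-tools, it uses the population variance across draws (conventions differ between references), and real runs should rely on az.waic. The function name is hypothetical.

```python
import math
from statistics import pvariance


def waic_elpd(log_lik):
    """Return (elpd_waic, p_waic) where log_lik[s][i] is the log-likelihood
    of observation i under posterior draw s."""
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    elpd = 0.0
    p_waic = 0.0
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        peak = max(column)
        # Log of the mean likelihood across draws, computed stably (log-sum-exp).
        lppd_i = peak + math.log(sum(math.exp(v - peak) for v in column) / n_draws)
        # Penalty: variance of the log-likelihood across draws for this observation.
        penalty_i = pvariance(column)
        elpd += lppd_i - penalty_i
        p_waic += penalty_i
    return elpd, p_waic
```

When every draw assigns the same log-likelihood to each observation, the penalty is zero and the estimate is simply the sum of those log-likelihoods. Disagreement between draws increases p_waic and lowers the estimate.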

Compatibility boundary

Model selection is only available for models where holdout_id is None. This means:

Run type	Model selection available
Full-sample fit (no validation)	Yes
Final-fit run (`mode: final_fit`)	Yes
Blocked-tail validation run	No
Rolling-origin validation split	No
Authored-holdout run	No
Bare `InferenceData` without `log_likelihood`	No

This restriction exists because LOO and WAIC require the full observed likelihood surface. A holdout fit has a modified likelihood that does not represent the full data-generating process, so comparing a holdout fit’s ELPD against a full fit’s ELPD would be statistically meaningless.

How it works in the pipeline

When exports.export_model_selection: true in the YAML config, the runner’s 30_model_assessment stage attempts model selection after writing diagnostics.

Compatible runs

For compatible models, the stage writes:

loo_summary.json — LOO summary statistics (ELPD, p_loo, SE, etc.)
waic_summary.json — WAIC summary statistics
loo_pointwise.csv — Per-observation LOO values and Pareto k diagnostics
waic_pointwise.csv — Per-observation WAIC values
model_comparison.csv — Ranked comparison table (single-model for individual runs)
model_selection_warnings.json — Warning category/message/step and result flags when LOO, WAIC, or comparison emits warnings

Unavailable or degraded runs

For incompatible models or unexpected model-selection runtime/export failures, the stage writes a single status artefact:

model_selection_status.json

{
  "status": "unavailable",
  "reason_code": "holdout_fit_unsupported",
  "reason": "Model selection requires holdout_id is None ...",
  "warnings": []
}

Known reason codes:

Code	Meaning
`holdout_fit_unsupported`	The model was fitted with a holdout mask
`requires_fitted_meridian_model`	Missing posterior samples or ArviZ `InferenceData`
`missing_log_likelihood_group`	Bare `InferenceData` without reconstructable likelihood
`meridian_internal_seam_incompatible`	Meridian version lacks required internal reconstruction methods
`arviz_runtime_error`	ArviZ raised an unexpected runtime/value error
`export_runtime_error`	Writing model-selection JSON/CSV artefacts failed

Model-selection unavailability is non-fatal. The pipeline completes successfully and records the reason in the artefact.
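Because the run still succeeds, downstream scripts need to check for the status artefact themselves. The sketch below shows one way to do that with the standard library. The helper name `read_model_selection_status` is hypothetical, and the demonstration writes the example payload shown above into a temporary directory so that it does not depend on a real run.

```python
import json
import tempfile
from pathlib import Path


def read_model_selection_status(assessment_dir):
    """Return the parsed status artefact, or None when it is absent."""
    status_path = Path(assessment_dir) / "model_selection_status.json"
    if not status_path.exists():
        return None  # Compatible runs write LOO/WAIC outputs instead.
    return json.loads(status_path.read_text(encoding="utf-8"))


# Demonstration against a throwaway directory with the payload shown above.
with tempfile.TemporaryDirectory() as tmp:
    stage = Path(tmp) / "30_model_assessment"
    stage.mkdir()
    payload = {
        "status": "unavailable",
        "reason_code": "holdout_fit_unsupported",
        "reason": "Model selection requires holdout_id is None ...",
        "warnings": [],
    }
    status_file = stage / "model_selection_status.json"
    status_file.write_text(json.dumps(payload), encoding="utf-8")
    status = read_model_selection_status(stage)
    missing = read_model_selection_status(tmp)

print(status["reason_code"])
```

A `None` result here only means the status file is absent; it does not by itself prove that the LOO and WAIC outputs were written.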

Using the Python API directly

Compute LOO for a single model

from meridian_tools.model_selection import compute_loo

result = compute_loo(fitted_model, pointwise=True)

print(result.kind)          # "loo"
print(result.summary)       # {"kind": "loo", "elpd_loo": -123.4, ...}
print(result.pointwise)     # DataFrame with loo_i, pareto_k per observation

Compute WAIC for a single model

from meridian_tools.model_selection import compute_waic

result = compute_waic(fitted_model, pointwise=True)

print(result.kind)          # "waic"
print(result.summary)       # {"kind": "waic", "elpd_waic": -125.1, ...}

Compare multiple models

from meridian_tools.model_selection import compare_models

comparison = compare_models(
    {
        "model_a": fitted_model_a,
        "model_b": fitted_model_b,
    },
    ic="loo",   # or "waic"
)

print(comparison)
# DataFrame with columns: model, rank, elpd_loo, p_loo, elpd_diff, weight, se, dse, warning, scale

The comparison table is ranked by ELPD. The best model has rank 0 and elpd_diff == 0. The weight column gives stacking weights.

Worked comparison

The example below uses two small ArviZ InferenceData objects with log_likelihood groups. A fitted Meridian model follows the same path after meridian-tools reconstructs its pointwise log likelihood.

import arviz as az
import numpy as np

from meridian_tools.model_selection import compare_models


def idata_with_log_likelihood(seed: int) -> az.InferenceData:
    rng = np.random.default_rng(seed)
    return az.from_dict(
        posterior={"theta": rng.normal(size=(2, 200, 1))},
        log_likelihood={"y": rng.normal(loc=-1.0, scale=0.2, size=(2, 200, 8))},
    )


comparison = compare_models(
    {
        "baseline": idata_with_log_likelihood(1),
        "candidate": idata_with_log_likelihood(2),
    },
    ic="loo",
)
print(comparison)

Representative output:

model	rank	elpd_loo	p_loo	elpd_diff	weight	se	dse	warning	scale
baseline	0	-8.16	0.33	0.00	1.00	0.02	0.00	False	log
candidate	1	-8.18	0.32	0.02	0.00	0.03	0.04	False	log

The candidate has lower expected predictive performance in this example, but its elpd_diff of 0.02 is smaller than the standard error of that difference (dse of 0.04). That does not support a strong preference on predictive grounds. In that situation, prefer the simpler or more interpretable specification, or gather more evidence.
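The comparison of elpd_diff against dse can be written down as a small check. This is a sketch only: the helper name `preference_is_clear` and the default multiple of 2 standard errors are assumptions chosen for illustration, not a policy that meridian-tools or ArviZ applies.

```python
def preference_is_clear(elpd_diff, dse, multiple=2.0):
    """True when the ELPD gap exceeds `multiple` standard errors of the difference."""
    if dse == 0:
        # Single-model tables report elpd_diff == 0 and dse == 0: nothing to compare.
        return False
    return abs(elpd_diff) > multiple * dse


# Values from the candidate row in the representative output above.
print(preference_is_clear(elpd_diff=0.02, dse=0.04))
```

Treat the result as a prompt for judgement, not a verdict. A gap that clears the threshold still deserves a look at the Pareto k diagnostics before anyone relies on it.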

The stored geo-panel demo includes the same output schema in runs/demos/demo-geo-panel_20260424_172854/30_model_assessment/model_comparison.csv. That demo is a single-model run, so its elpd_diff and dse are both zero.

Check log-likelihood availability

from meridian_tools.model_selection import compute_loo, has_log_likelihood

if has_log_likelihood(fitted_model):
    result = compute_loo(fitted_model)

Log-likelihood reconstruction

Meridian does not store pointwise log-likelihood in its InferenceData by default. meridian-tools reconstructs it automatically when you pass a fitted Meridian model to compute_loo, compute_waic, or compare_models.

The reconstruction:

Recovers unsaved posterior parameters (e.g. geo deviations, tau_g)
Rebuilds the joint distribution from the posterior samples
Computes observation-level log-likelihood
Returns a new InferenceData with the log_likelihood group attached

Automatic reconstruction never mutates the original model. It produces a temporary copy used only for the ArviZ computation.

You can also control this explicitly:

from meridian_tools.log_likelihood import attach_log_likelihood

# Returns new InferenceData with log_likelihood group (original unchanged)
idata_with_ll = attach_log_likelihood(fitted_model, in_place=False)

# Mutates the model's inference_data in place
attach_log_likelihood(fitted_model, in_place=True)

Interpreting the outputs

LOO summary

Field	Meaning
`elpd_loo`	Expected log pointwise predictive density (higher is better)
`p_loo`	Effective number of parameters
`se`	Standard error of `elpd_loo`
`warning`	Whether Pareto k diagnostics indicate unreliable estimates

WAIC summary

Field	Meaning
`elpd_waic`	Expected log pointwise predictive density (WAIC estimate)
`p_waic`	Effective number of parameters (WAIC estimate)
`se`	Standard error of `elpd_waic`
`warning`	Whether posterior variance diagnostics indicate unreliable estimates

Pareto k diagnostics

The pointwise LOO output includes a pareto_k column. ArviZ uses Pareto k to diagnose whether the PSIS-LOO approximation is reliable for each observation and sets the summary warning flag when its reliability checks fail. meridian-tools surfaces those values and warnings; it does not currently add its own separate thresholding policy.
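Since the package leaves thresholding to the reader, a review script can apply its own. The sketch below scans pointwise rows for large pareto_k values using the standard library. The 0.7 cut-off is a commonly cited rule of thumb used here as an assumption, the sample rows are hypothetical, and the real loo_pointwise.csv may carry additional columns beyond loo_i and pareto_k.

```python
import csv
import io

# Hypothetical excerpt of loo_pointwise.csv; real files have one row per observation.
SAMPLE = """loo_i,pareto_k
-1.02,0.12
-0.98,0.81
-1.10,0.45
"""


def high_pareto_k_rows(csv_text, threshold=0.7):
    """Return (row_index, pareto_k) pairs whose pareto_k exceeds the threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for index, row in enumerate(reader):
        k = float(row["pareto_k"])
        if k > threshold:
            flagged.append((index, k))
    return flagged


print(high_pareto_k_rows(SAMPLE))
```

Flagged observations are the ones where the PSIS-LOO approximation is least trustworthy, so they are a sensible place to start when a summary carries a warning.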

Model comparison

When comparing two or more models:

elpd_diff — Difference in ELPD from the best model (0 for the best)
dse — Standard error of the ELPD difference
weight — Stacking weight (how much to trust each model)
Models are ranked by ELPD (rank 0 is best)

A single-model comparison returns a one-row table with rank=0, elpd_diff=0, and weight=1.0.

Error handling

All model-selection errors are raised as ModelSelectionError with a structured reason_code:

from meridian_tools.model_selection import ModelSelectionError, compute_loo

try:
    result = compute_loo(candidate)
except ModelSelectionError as exc:
    print(exc.reason_code)  # e.g. "holdout_fit_unsupported"
    print(str(exc))         # Human-readable explanation

In the pipeline, these errors are caught and written to model_selection_status.json rather than failing the run.

Lifecycle management guide

meridian-tools treats completed runs as immutable artefacts. The lifecycle module provides tools to load, compare, and refresh past runs without mutating them. This guide explains each lifecycle operation and when to use it.

Core concepts

Run records

A RunRecord encapsulates a run’s metadata and artefact paths. It is loaded from a run directory by reading run_manifest.json and resolving all artefact paths against the directory.

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

print(record.run_dir)                    # Path to the run directory
print(record.manifest)                   # RunManifest with stages, timestamps, versions
print(record.config_source_path)         # Path to config.source.yaml
print(record.config_resolved_path)       # Path to config.resolved.yaml
print(record.input_data_provenance_path) # Path to input_data_provenance.json (or None for older runs)
print(record.diagnostics_bundle_path)    # Path to diagnostics_bundle.json (or None)
print(record.validation_spec_path)       # Path to validation_spec.json (or None)
print(record.model_selection_status_path)  # Path to model_selection_status.json (or None)

All paths in the record are absolute. Required artefacts (config_source, config_resolved) are validated at load time and always present. input_data_provenance is also required for manifest version 3 and 4 runs. Optional artefacts (diagnostics_bundle, validation_spec, model_selection_status) are None if not present in the manifest.

Immutability

Lifecycle operations never modify a source run directory. When you refresh a run, the output goes to a new sibling directory. When you compare runs, both source directories remain untouched.

All lifecycle functions raise LifecycleError (a RuntimeError subclass) when they encounter invalid state.

Loading a run record

From a run directory

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

From a manifest path

record = load_run_record("runs/my-project_blocked_tail_20260402_073500/run_manifest.json")

Both forms are accepted. The function detects whether the argument is a directory or a manifest file.

Validation at load time

load_run_record validates:

The manifest JSON is well-formed and has a supported version (0, 1, 2, 3, or 4).
Required config artefact entries (config_source, config_resolved) exist in the manifest.
Manifest version 3 and 4 runs also include input_data_provenance.
Required artefact files actually exist on disk.
No artefact path escapes the run directory (path traversal protection).
Claimed optional artefacts exist on disk (a manifest that references a missing file is rejected).

If any check fails, a LifecycleError is raised with a descriptive message.
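The path traversal check in the list above can be pictured with a short pathlib sketch. This is not the loader's actual implementation; the helper name `stays_inside` is hypothetical, and the example only shows the general idea of resolving a manifest path against the run directory and rejecting anything that lands outside it.

```python
from pathlib import Path


def stays_inside(run_dir, artefact_path):
    """True when artefact_path, resolved against run_dir, remains inside run_dir."""
    root = Path(run_dir).resolve()
    candidate = (root / artefact_path).resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        return False
    return True


run_dir = Path("runs/my-project_blocked_tail_20260402_073500")
print(stays_inside(run_dir, "00_run_metadata/config.source.yaml"))
print(stays_inside(run_dir, "../other-run/run_manifest.json"))
```

Resolving both paths before comparing them is what defeats `..` segments; a plain string prefix check would not.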

Listing run records

from meridian_tools.lifecycle import list_run_records

records = list_run_records("runs/")
for record in records:
    print(record.manifest.started_at, record.run_dir.name)

list_run_records discovers all direct child directories that contain a run_manifest.json and returns them sorted by started_at timestamp (most recent first), with run directory name as a secondary sort key.

The function requires a directory path (not a file). It will raise an error if any discovered run directory contains an invalid manifest — it does not silently skip broken runs.

Refreshing a run

Refreshing re-executes a run using its stored configuration but writes the output to a new directory. The source run is never modified.

When to refresh

After a Meridian upgrade — to check whether the new version produces comparable results with the same specification.
After a code change — to verify that refactoring did not change model outputs.
After extending the dataset — to refit the model with additional observations using the same validated specification.

How to refresh

from meridian_tools.lifecycle import build_refresh_run_config
from meridian_tools.runner import run_pipeline

refresh_config = build_refresh_run_config("runs/my-project_blocked_tail_20260402_073500")
result = run_pipeline(refresh_config)

build_refresh_run_config reconstructs a PipelineRunConfig from the source run’s stored configuration:

The execution config path points to the source run’s config.resolved.yaml.
The source config path points to the source run’s config.source.yaml, so the refreshed run preserves the original authored YAML in its own metadata.
The output directory is set to the source run’s parent directory (creating a sibling).
The run name suffix is stripped to produce a clean refresh name.
For validation runs, the validation spec is reconstructed from the stored validation_spec.json.

Refresh with overrides

You can override specific settings:

from pathlib import Path

refresh_config = build_refresh_run_config(
    "runs/my-project_blocked_tail_20260402_073500",
    output_dir=Path("runs/refreshed"),
    run_name="my-project-refresh",
)

Validation-aware refresh

If the source run was a validation run (blocked tail or rolling origin), build_refresh_run_config reconstructs the validation spec from the stored artefact, including the holdout mask geometry. For authored-holdout runs, it reuses the YAML-owned holdout from the copied config.

For final-fit runs, the refresh produces another final-fit run with the same full-sample training specification.

Comparing runs

from meridian_tools.lifecycle import compare_run_records

comparison = compare_run_records(
    "runs/my-project_blocked_tail_20260402_073500",
    "runs/my-project_blocked_tail_20260415_090000",
)
print(comparison)

compare_run_records accepts run directory paths (not RunRecord objects) and returns a pandas DataFrame with columns field, left, right, status, and changed. The compared fields include:

run_name and status — basic identity.
meridian_tools_version and meridian_version — version drift.
has_validation_spec and has_diagnostics_bundle — artefact presence.
predictive_accuracy_status and review_summary_status — diagnostics.
has_model_selection_outputs and model_selection_reason_code — model selection.
input_authored_path, input_resolved_path, input_sha256, input_size_bytes, input_mtime_utc, input_row_count, input_column_count, and input_ordered_columns — dataset identity and shape.

This is useful for auditing whether a refresh or a specification change produced materially different results.

If either run predates manifest version 3, provenance rows are reported with status == "legacy_unknown" and changed == None. That distinguishes “no stored provenance exists” from “the dataset definitely changed”.
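When auditing a comparison, it helps to separate fields that changed from fields whose provenance is simply unknown. The sketch below works on plain dictionaries with the documented column names, which you could obtain from the DataFrame with `comparison.to_dict("records")`. The helper name `summarise_comparison` and the sample rows are hypothetical, and "compared" is a placeholder status value, not a documented one.

```python
def summarise_comparison(rows):
    """Group field names by whether they changed, stayed equal, or are unknown."""
    summary = {"changed": [], "unchanged": [], "legacy_unknown": []}
    for row in rows:
        # Check the status first so legacy rows never count as "unchanged".
        if row["status"] == "legacy_unknown" or row["changed"] is None:
            summary["legacy_unknown"].append(row["field"])
        elif row["changed"]:
            summary["changed"].append(row["field"])
        else:
            summary["unchanged"].append(row["field"])
    return summary


# Hypothetical rows; "compared" is a placeholder status, not a documented value.
rows = [
    {"field": "run_name", "left": "a", "right": "b",
     "status": "compared", "changed": True},
    {"field": "meridian_version", "left": "1.5.3", "right": "1.5.3",
     "status": "compared", "changed": False},
    {"field": "input_sha256", "left": None, "right": None,
     "status": "legacy_unknown", "changed": None},
]
print(summarise_comparison(rows))
```

Keeping the legacy rows in their own bucket preserves the distinction the comparison table is designed to make.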

Lifecycle workflow example

A typical lifecycle workflow for a quarterly model refresh:

from pathlib import Path
from meridian_tools.lifecycle import (
    build_refresh_run_config,
    compare_run_records,
    list_run_records,
)
from meridian_tools.runner import run_pipeline

# 1. Find the most recent production run
records = list_run_records("runs/")
production_run = records[0]  # Most recent by started_at

# 2. Refresh with the updated dataset
refresh_config = build_refresh_run_config(
    production_run.run_dir,
    output_dir=Path("runs/quarterly-refresh"),
)
refresh_result = run_pipeline(refresh_config)

# 3. Compare the results
comparison = compare_run_records(production_run.run_dir, refresh_result.run_dir)
print(comparison)

Manifest versioning

The lifecycle layer supports manifest versions 0, 1, 2, 3, and 4. Older manifests are handled gracefully with default values for fields that were added in later versions. The current version is 4.

This means you can load run directories created by earlier versions of meridian-tools without issues. The loaded RunRecord keeps the same shape, but input_data_provenance_path is None for pre-v3 runs because those manifests predate provenance capture.

Manifest version 4 adds stricter run-completion and artefact-path integrity requirements for newly written runs. It does not remove loader support for older manifests.

Meridian Tools workflow guide

This guide shows the supported end-to-end agency workflow for meridian-tools. It starts with one YAML config, moves through candidate validation, separates the final full-sample fit from the validation runs, and ends with the artefacts you should hand over or inspect later. The examples in this guide stay inside the implemented package surface. They do not assume notebooks, dashboards, or unpublished helper scripts.

Before you start

Install Meridian first, then install meridian-tools in the same environment:

python -m pip install -c constraints/dev.txt -e ".[dev]"

Use the CLI for ordinary run execution. Use the Python API when you need rolling-origin planning, an explicit final-fit run, or lifecycle compare and refresh operations. Phase 07 does not provide a lifecycle CLI.

If you want packaged reference examples before authoring your own YAML, use the bundled demo guide in demos.md. The packaged demo launcher is meridian-tools demo .... The repo-root python runme.py ... wrapper remains available when you are working from a source checkout.

Author one YAML config

Keep the authored project definition in YAML. Keep runtime-only choices out of the YAML file. In practice, that means your source file owns the project metadata, data path, model specification, fit settings, validation settings, and export switches. Runtime-only values such as output_dir, run_name, and one concrete validation_spec belong in PipelineRunConfig or the CLI call, not in config.resolved.yaml.

Here is one exact blocked-tail config:

project:
  name: client-mmm

data:
  path: ./client_dataset.csv
  kpi_type: revenue
  coord_to_columns:
    time: week
    geo: market
    kpi: revenue
    population: population
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]
    controls: [promo_flag, price_index]

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi

fit:
  n_chains: 4
  n_adapt: 500
  n_burnin: 500
  n_keep: 1000
  seed: 20260331

validation:
  strategy: blocked_tail
  holdout_size: 8

exports:
  export_predictive_accuracy: true
  export_review_summary: true
  export_model_selection: true

Choose the right validation path

Use blocked_tail when you want one contiguous future block for candidate evaluation. This is often the right default for short MMM time series. Use rolling_origin when you have enough history to evaluate more than one expanding-window split. Do not treat rolling_origin as ordinary k-fold cross-validation. The package does not implement naive IID folds or random shuffling because that is not the right statistical workflow for MMM time series.
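To make the "one contiguous future block" idea concrete, the sketch below builds a boolean holdout mask over an ordered time index. It is an illustration under assumptions: the helper name `blocked_tail_mask` is hypothetical, the week labels are made up, and the package's real mask geometry (for example, how geos are handled) is not shown in this guide.

```python
def blocked_tail_mask(time_index, holdout_size):
    """Return one boolean per time period; True marks the held-out tail."""
    if not 0 < holdout_size < len(time_index):
        raise ValueError("holdout_size must leave at least one training period")
    cutoff = len(time_index) - holdout_size
    return [position >= cutoff for position in range(len(time_index))]


weeks = [f"week-{n:02d}" for n in range(1, 13)]  # 12 hypothetical periods
mask = blocked_tail_mask(weeks, holdout_size=4)
print([week for week, held_out in zip(weeks, mask) if held_out])
```

Every held-out period comes after every training period, which is the property that random shuffling would destroy.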

Validation runs and the final production fit are different jobs. First, you evaluate candidate specifications on blocked time splits. Then, once you have chosen the specification, you run a separate full-sample fit with no holdout.

Run one blocked-tail candidate from the CLI

Once the YAML file is authored, you can execute a blocked-tail candidate run directly through the CLI:

meridian-tools run --config project.yml --output-dir runs

The same packaged runner surface is available through the thin repo-root wrapper:

python runme.py run --config project.yml --output-dir runs

This command creates a dated run directory under runs/. If you need to change the output location or the visible run name, pass --output-dir or --run-name at execution time. Those are runtime-only overrides. They affect the run directory and manifest, but they do not become part of the authored YAML contract.

Plan and run rolling-origin validation through the Python API

rolling_origin is a Python-first planning surface because you need one concrete split at a time. Start with an explicit YAML definition:

validation:
  strategy: rolling_origin
  initial_train_size: 52
  test_size: 4
  step_size: 4
  max_splits: 3

Then materialise and execute the validation runs:

from pathlib import Path

import pandas as pd

from meridian_tools.config import PipelineRunConfig, load_yaml_config
from meridian_tools.cv import build_validation_plan
from meridian_tools.runner import run_pipeline

config_path = Path("project.yml")
config = load_yaml_config(config_path)

data_path = config.data.path
if not data_path.is_absolute():
    data_path = (config_path.parent / data_path).resolve()

frame = pd.read_csv(data_path)
time_column = config.data.coord_to_columns["time"]
geo_column = config.data.coord_to_columns.get("geo")

time_index = frame[time_column].drop_duplicates().tolist()
geo_index = None
if geo_column is not None:
    geo_index = frame[geo_column].drop_duplicates().tolist()

validation_plan = build_validation_plan(
    config.validation,
    time_index=time_index,
    geo_index=geo_index,
)

for run_spec in validation_plan.validation_runs:
    run_pipeline(
        PipelineRunConfig(
            config_path=config_path,
            output_dir=Path("runs"),
            validation_spec=run_spec,
        )
    )

build_validation_plan(...) gives you one concrete ValidationRunSpec per split. run_pipeline(...) remains the primitive that executes one actual run.
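The sketch below shows one plausible reading of how the four rolling-origin parameters translate into expanding-window splits. It is not the package's implementation: the helper name `expanding_window_splits` is hypothetical, and the choice to keep the earliest splits when max_splits caps the count is an assumption. Check the ValidationRunSpec objects returned by build_validation_plan(...) for the real geometry.

```python
def expanding_window_splits(n_periods, initial_train_size, test_size, step_size, max_splits):
    """Return a list of (train_positions, test_positions) ranges."""
    splits = []
    train_end = initial_train_size
    # Each split trains on everything before train_end and tests on the next block.
    while train_end + test_size <= n_periods and len(splits) < max_splits:
        splits.append((range(0, train_end), range(train_end, train_end + test_size)))
        train_end += step_size  # The training window only ever grows.
    return splits


# 64 weekly periods with the YAML values above: 52 / 4 / 4 / 3.
for train, test in expanding_window_splits(64, 52, 4, 4, 3):
    print(f"train 0..{train.stop - 1}, test {test.start}..{test.stop - 1}")
```

Under these assumptions, a series shorter than initial_train_size plus test_size yields no splits at all, which is a quick way to see whether rolling_origin is viable for a given dataset.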

Run the final full-sample fit separately

After you have chosen the winning specification, run the final fit on the full sample. Do not reuse a validation fit as the production artefact.

from pathlib import Path

from meridian_tools.config import PipelineRunConfig
from meridian_tools.runner import run_pipeline

# validation_plan is the object returned by build_validation_plan(...) in the previous step.
final_result = run_pipeline(
    PipelineRunConfig(
        config_path=Path("project.yml"),
        output_dir=Path("runs"),
        validation_spec=validation_plan.final_fit_run,
    )
)

print(final_result.run_dir)
print(final_result.manifest_path)

For rolling_origin and blocked_tail workflows, validation_plan.final_fit_run is the explicit no-holdout runtime spec. It keeps the boundary clear. Candidate validation and final production fitting are separate steps.

Know which artefacts matter for handoff

Each successful run directory is the handoff unit. The important files are:

run_manifest.json for stage status, versions, timestamps, and top-level artefact links
00_run_metadata/config.source.yaml for the authored source config
00_run_metadata/config.resolved.yaml for the YAML-owned config after path resolution
00_run_metadata/input_data_provenance.json for the exact dataset identity used by the run
10_validation/validation_spec.json when the run is validation-aware
30_model_assessment/diagnostics_bundle.json for stable diagnostics metadata
30_model_assessment/model_results_summary.html for the wrapped Meridian assessment summary
30_model_assessment/plots/ for assessment PNG plots such as model fit and rhat review
40_decomposition/summary_metrics.csv and summary_metrics.nc for decomposition exports
40_decomposition/plots/ for decomposition PNG plots
60_response_curves/plots/response_curves_plot.png when response-curve export is enabled
70_optimisation/plots/ when optimisation export is enabled
30_model_assessment model-selection outputs when the run is compatible, or 30_model_assessment/model_selection_status.json when it is not

Read those artefacts together. 30_model_assessment/diagnostics_bundle.json tells you whether predictive accuracy and review summary were exported or disabled. The assessment stage either contains the real Bayesian model-selection outputs or one explicit compatibility status artefact.
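Before handing a run directory over, a quick presence check on the core files can catch an incomplete copy. The sketch below uses relative paths taken from the handoff list above. The helper name `missing_artefacts` and the choice of which files count as "core" are assumptions; load_run_record performs the package's own validation and remains the authoritative check.

```python
import tempfile
from pathlib import Path

# Relative paths taken from the handoff list above.
CORE_ARTEFACTS = [
    "run_manifest.json",
    "00_run_metadata/config.source.yaml",
    "00_run_metadata/config.resolved.yaml",
    "00_run_metadata/input_data_provenance.json",
    "30_model_assessment/diagnostics_bundle.json",
]


def missing_artefacts(run_dir, expected=CORE_ARTEFACTS):
    """Return the expected relative paths that are not files under run_dir."""
    root = Path(run_dir)
    return [relative for relative in expected if not (root / relative).is_file()]


# Demonstration on a throwaway directory that only contains the manifest.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "run_manifest.json").write_text("{}", encoding="utf-8")
    gaps = missing_artefacts(tmp)

print(gaps)
```

An empty list means every expected file is present; it says nothing about whether the contents are valid.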

The supported Bayesian model-selection boundary is narrow and deliberate. The package supports fitted Meridian models where holdout_id is None. That means full-sample fitted models and explicit final-fit runs are compatible. Validation fits and authored holdout fits are not.

Use lifecycle helpers after a run exists

Once you have stored run directories, the lifecycle API lets you reload, compare, and refresh them without going back to notebook state.

from pathlib import Path

from meridian_tools.lifecycle import compare_run_records, load_run_record, refresh_run

validation_run_dir = Path("runs/client-mmm_blocked_tail_20260401_101500")
final_fit_run_dir = Path("runs/client-mmm_final_fit_20260401_114200")

final_fit_record = load_run_record(final_fit_run_dir)
comparison = compare_run_records(validation_run_dir, final_fit_run_dir)
refreshed = refresh_run(final_fit_run_dir, run_name="client-mmm_final_fit_refresh")

print(final_fit_record.manifest.run_name)
print(comparison)
print(refreshed.run_dir)

compare_run_records(...) gives you a metadata-level comparison. It does not attempt a raw-file diff across every output. refresh_run(...) rebuilds a new sibling run from the stored run-local artefacts. It does not overwrite the source run. Phase 07 does not provide lifecycle CLI commands, so use the Python API for these operations.

Know the staged output schema

The current run layout is:

<run_dir>/
  run_manifest.json
  00_run_metadata/
  10_validation/
  20_model_fit/
  30_model_assessment/
    plots/
  40_decomposition/
    plots/
  60_response_curves/
    plots/
  70_optimisation/
    plots/

The runner always writes:

00_run_metadata
20_model_fit
30_model_assessment
40_decomposition

The runner writes these only when applicable:

10_validation
60_response_curves
70_optimisation

For the bundled reference examples and the exact stage-level file set, see demos.md.

A practical analyst sequence

If you want one concrete operating pattern, use this one. Author a YAML file. Run a blocked-tail candidate through the CLI when you need one held-out tail block. Use rolling_origin through build_validation_plan(...) when you need multiple expanding-window validation splits. Choose the modelling specification. Run the final full-sample fit as its own job. Review the run directory artefacts. Then use compare_run_records(...) and refresh_run(...) when you need to inspect or rerun stored work later.

Meridian Tools demo guide

This is the canonical guide to the bundled meridian-tools demos. Use it when you want one safe, reproducible, end-to-end example without client data.

The public story is simple:

Meridian is the modelling engine.
meridian-tools is the workflow wrapper.
The bundled demos are launched through meridian-tools surfaces, not by calling Meridian directly.

What the bundled demos are for

Phase 08 adds two bundled reference workflows:

timeseries
- a national timeseries demo shipped as packaged demo data
geo_panel
- a geo-panel demo shipped as packaged demo data

Both datasets are bundled non-client reference data. They exist so analysts and stakeholders can inspect the workflow, run structure, and review artefacts without using client material.

What the package adds on top of Meridian

Meridian remains responsible for the modelling and analysis primitives. meridian-tools adds the operational surface that agencies usually need around it:

typed YAML configuration
blocked-tail and rolling-origin validation workflow
manifest-backed run directories
diagnostics bundling
compatibility-aware Bayesian model-selection outputs
lifecycle compare and refresh helpers
a thin demo launcher for bundled reference workflows

This is why the demos are useful. They show the wrapper workflow directly, rather than asking users to reconstruct it from notebooks or internal scripts.

Demo entrypoints

List the supported demos:

meridian-tools demo --list

Run the bundled timeseries demo:

meridian-tools demo timeseries

Run the bundled geo-panel demo:

meridian-tools demo geo_panel

By default, demo runs are written under runs/demos/. If you want a different root, pass --output-dir. If you want a custom visible run name, pass --run-name.

Example:

meridian-tools demo timeseries --output-dir sandbox/demo-runs --run-name demo-timeseries-review

The repo-root checkout wrapper remains available when you are working from the source tree:

python runme.py demo --list
python runme.py demo timeseries

The same package can also run an explicit authored config:

meridian-tools run --config /path/to/project.yml --output-dir runs

The repo-root wrapper can run an explicit authored config too:

python runme.py run --config /path/to/project.yml --output-dir runs

Bundled YAML surface

The bundled demo YAML files are real meridian-tools configs. They are not legacy Abacus-style placeholders.

The authored sections used in Phase 08 are:

project
data
model_spec
fit
validation
exports
response_curves
optimisation

The Phase 08 additions are:

response_curves
- required if you want the response-curve export stage to run
optimisation
- required if you want the optimisation export stage to run

The bundled demos include both sections so that the full staged schema is exercised.

The default demo configs use validation.strategy: none. That keeps the reference runs model-selection compatible, so LOO and WAIC outputs are written by default.

Output schema

Each successful demo run writes one manifest-backed staged directory layout:

<run_dir>/
  run_manifest.json
  00_run_metadata/
    config.source.yaml
    config.resolved.yaml
  20_model_fit/
    meridian_model.binpb
    fit_metadata.json
  30_model_assessment/
    diagnostics_bundle.json
    predictive_accuracy.csv
    review_summary.json
    model_results_summary.html
    plots/
      model_fit.png
      rhat_boxplot.png
    loo_summary.json
    waic_summary.json
    loo_pointwise.csv
    waic_pointwise.csv
    model_comparison.csv
    # or model_selection_status.json when unavailable
  40_decomposition/
    summary_metrics.nc
    summary_metrics.csv
    plots/
      channel_contribution_area_chart.png
      contribution_waterfall_chart.png
      spend_vs_contribution_chart.png
      roi_bar_chart.png
  60_response_curves/
    response_curves.nc
    response_curves.csv
    plots/
      response_curves_plot.png
  70_optimisation/
    optimisation_summary.html
    optimised_data.nc
    optimised_data.csv
    nonoptimised_data.nc
    nonoptimised_data.csv
    optimisation_grid.csv
    plots/
      incremental_outcome_delta_plot.png
      budget_allocation_optimised_plot.png
      budget_allocation_nonoptimised_plot.png
      spend_delta_plot.png
      optimisation_response_curves_plot.png

run_manifest.json stays top-level and remains the source of truth for artefact discovery, stage status, version metadata, and relative file paths.

Always exported versus config-gated outputs

For the current Phase 08 contract:

always exported for successful runs:
- 00_run_metadata
- 20_model_fit
- 30_model_assessment
- 40_decomposition
exported only when applicable:
- 10_validation
  - written for validation-aware runs
  - skipped for runs with no validation metadata
- 60_response_curves
  - requires the authored response_curves section
- 70_optimisation
  - requires the authored optimisation section

Within 30_model_assessment, model selection remains compatibility-aware:

compatible runs write loo, waic, and comparison outputs
incompatible runs write model_selection_status.json
model-selection unavailability is non-fatal

How to read the important outputs

Start with these artefacts:

run_manifest.json
- run identity, versions, timestamps, stage status, and top-level artefact links
00_run_metadata/config.source.yaml
- the authored YAML
00_run_metadata/config.resolved.yaml
- the same YAML after runtime path resolution
10_validation/validation_spec.json
- validation provenance for validation-aware runs only
- not present in the default bundled demos because they run as full-sample fits
30_model_assessment/diagnostics_bundle.json
- the stable machine-readable record of diagnostics export state
30_model_assessment/model_results_summary.html
- the wrapped Meridian assessment summary
40_decomposition/summary_metrics.csv
- the easiest tabular decomposition output to inspect first

For model selection, keep the boundary honest:

LOO and WAIC are only available for compatible fitted Meridian models
validation fits and other incompatible cases will record model_selection_status.json instead
the package does not pretend unsupported runs have valid Bayesian comparison outputs
the bundled demos are configured as full-sample fits, so they should write loo_summary.json and waic_summary.json by default

For response curves and optimisation:

these outputs are useful for scenario and allocation review
they are not a substitute for checking diagnostics, validation provenance, or model-selection compatibility first

For visual review, each stage now keeps its PNG exports inside a local plots/ subdirectory rather than mixing image files into the stage root. That keeps the human-review plots separate from the machine-readable exports, with each in a predictable place.

Recommended demo-reading sequence

If you are new to the repository, use this order:

run meridian-tools demo --list
run one of the bundled demos
open run_manifest.json
inspect 00_run_metadata/config.source.yaml
inspect 30_model_assessment/diagnostics_bundle.json
inspect 40_decomposition/summary_metrics.csv
inspect 60_response_curves/ and 70_optimisation/ if those stages ran

If you are working from a source checkout, python runme.py demo --list and python runme.py demo ... remain equivalent convenience wrappers.

That sequence shows the wrapper value quickly: one YAML config in, one structured run directory out, with the Meridian and meridian-tools artefacts kept in one predictable place.

Troubleshooting

Common issues and solutions when working with meridian-tools.

Installation issues

`meridian-tools --help` fails with ImportError

Cause: The package is not installed in the active environment, or Meridian is missing.

Fix:

python -m pip install -c constraints/dev.txt -e ".[dev]"

If Meridian is not installed:

pip install "google-meridian[schema]==1.5.3" "protobuf>=5.28.0,<7"

`RuntimeError: Saving meridian_model.binpb requires Meridian schema support`

Cause: Meridian was installed without the [schema] extra.

Fix:

pip install "google-meridian[schema]==1.5.3" "protobuf>=5.28.0,<7"

`RuntimeError: Saving PNG plots requires vl-convert-python`

Cause: The vl-convert-python package is not installed or not importable.

Fix:

pip install "vl-convert-python>=1.7.0,<2"

Configuration errors

`pydantic.ValidationError: Extra inputs are not permitted`

Cause: The YAML file contains a key that is not part of the schema. This is often a typo.

Fix: Check the key name against the YAML schema reference. All config models use extra="forbid", so unexpected keys are always rejected.

`Legacy holdout_size shorthand is no longer supported`

Cause: The YAML has validation.holdout_size without an explicit validation.strategy.

Fix: Add strategy: blocked_tail:

validation:
  strategy: blocked_tail
  holdout_size: 8

`validation.strategy: blocked_tail does not accept rolling-origin parameters`

Cause: The YAML mixes blocked_tail strategy with initial_train_size, test_size, or other rolling-origin fields.

Fix: Choose one strategy. Use blocked_tail with holdout_size only, or rolling_origin with its own parameters.

`optimisation.end_date must be on or after optimisation.start_date`

Cause: The dates in the optimisation section are reversed.

Fix: Ensure start_date is on or before end_date:

optimisation:
  start_date: "2025-01-01"
  end_date: "2025-12-31"

`response_curves.spend_multipliers must not be empty`

Cause: The spend_multipliers list is empty or missing.

Fix: Provide at least one non-negative value:

response_curves:
  spend_multipliers: [0.0, 0.5, 1.0, 1.5, 2.0]

Pipeline execution errors

Dependency preflight failure

Cause: A required wrapper dependency check failed before config/data preflight or run-directory creation.

Common triggers:

google-meridian[schema] support is unavailable
exports.export_plots: true is set but vl-convert-python PNG support is unavailable

Fix: Install or repair the missing runtime dependency first, then rerun.

`ConfigPreflightError`

Cause: meridian-tools found a wrapper-owned config or input-data issue before run-directory creation.

Common triggers:

data.path resolves to a missing file or a directory
the CSV header row cannot be read
the header is empty or contains blank cells
an authored column name does not appear in the header exactly
a supported media/RF family is only half-authored

Fix: Correct the authored YAML or the input CSV first, then rerun. Header matching is exact and case-sensitive in Phase 10.

`ValidationExecutionContractError`

Cause: The requested single-run execution path is incompatible with the authored validation setup.

Common triggers:

you tried to run a rolling_origin config directly from the CLI or run_pipeline(...)
you passed PipelineRunConfig.validation_spec while the YAML already authors model_spec.kwargs.holdout_id

Fix: For rolling_origin, build a validation plan and execute one concrete split at a time through the Python API. For authored holdouts, either keep the YAML-authored holdout_id path or remove it before supplying a runtime validation_spec. See the validation guide for the full workflow.

`ModelSelectionError` with `reason_code: holdout_fit_unsupported`

Cause: LOO/WAIC was requested for a model fitted with a holdout mask.

Not a bug. Model selection is only available for full-sample fits. The pipeline records the incompatibility in model_selection_status.json and continues. See the model selection guide.

`ModelSelectionError` with `reason_code: meridian_internal_seam_incompatible`

Cause: The installed Meridian version does not expose the internal reconstruction methods needed for log-likelihood computation.

Fix: Check the Meridian and protobuf versions. This package requires google-meridian[schema]==1.5.3 and protobuf>=5.28.0,<7. If you recently upgraded Meridian, the private reconstruction seams may have changed. Check the Meridian integration notes.

Run fails mid-pipeline

If a run fails after the dated run directory already exists, meridian-tools raises PipelineRunFailure. The CLI and runme.py print the concrete failed run directory, manifest path, and stage name when available. The original exception is preserved as __cause__, so --traceback still shows the underlying failure.

The manifest is written to disk after each stage. If a run fails, the run_manifest.json is left on disk and marked failed. You can inspect it to determine which stage failed:

cat runs/my-project_*/run_manifest.json | python -m json.tool

Look at the stages array. A failed stage is recorded with status: "failed" and an error message.
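The same inspection can be scripted. The following is a minimal sketch that relies only on the manifest fields documented in the manifest schema reference (stages, name, status, message). The helper names failed_stages and load_manifest are illustrative and are not part of the meridian-tools API.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_manifest(run_dir: str | Path) -> dict:
    """Read run_manifest.json from the root of a run directory."""
    manifest_path = Path(run_dir) / "run_manifest.json"
    return json.loads(manifest_path.read_text(encoding="utf-8"))


def failed_stages(manifest: dict) -> list[tuple[str, str | None]]:
    """Return (stage name, message) for every stage recorded as failed."""
    return [
        (stage["name"], stage.get("message"))
        for stage in manifest.get("stages", [])
        if stage.get("status") == "failed"
    ]
```

Call failed_stages(load_manifest(run_dir)) with the failed run directory printed by the CLI. An empty list means no stage was recorded as failed.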

Validation errors

`time_index must be strictly increasing with no duplicate values`

Cause: The time column in your data contains duplicates or is not sorted.

Fix: Ensure your CSV data has unique, monotonically increasing time values. For geo-panel data, the check applies to the deduplicated time axis: each time period must appear once on that axis, even though the CSV repeats it once per geo.
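To see what the check means for geo-panel data, here is a minimal sketch that assumes the time column holds ISO YYYY-MM-DD strings. Both helper names are hypothetical and this is not the package's own implementation.

```python
from __future__ import annotations

from datetime import date


def deduplicated_time_axis(time_column: list[str]) -> list[date]:
    """Collapse a geo x time column into one sorted, unique time axis."""
    return sorted({date.fromisoformat(value) for value in time_column})


def is_strictly_increasing(time_index: list[date]) -> bool:
    """True when every value is later than the one before it."""
    return all(a < b for a, b in zip(time_index, time_index[1:]))
```

A raw geo-panel column repeats each period once per geo, so it fails the check as-is, while its deduplicated axis passes.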

`rolling_origin must yield at least two splits`

Cause: The combination of initial_train_size, test_size, and data length does not produce enough splits.

Fix: Reduce initial_train_size, reduce test_size, or use blocked_tail instead for shorter series.
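A rough way to estimate the split count before running is sketched below. It assumes an expanding-window geometry in which every split needs a full test window and the step equals test_size. That geometry is an assumption for illustration, and the package's actual split rule may differ. The function name count_rolling_splits is hypothetical.

```python
from __future__ import annotations


def count_rolling_splits(
    n_times: int,
    initial_train_size: int,
    test_size: int,
    max_splits: int | None = None,
) -> int:
    """Estimate how many full test windows fit after the initial training window."""
    available = n_times - initial_train_size
    if available < 0:
        return 0
    # Assumption: step_size == test_size, so windows do not overlap.
    n_splits = available // test_size
    return n_splits if max_splits is None else min(n_splits, max_splits)
```

Under these assumptions, 52 periods with initial_train_size 40 and test_size 8 leave room for only one split, while initial_train_size 36 leaves room for two.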

`holdout_size must be smaller than the time axis`

Cause: The holdout size is greater than or equal to the number of time periods.

Fix: Reduce holdout_size to leave at least one training period.

Lifecycle errors

`LifecycleError` when loading a run record

Cause: The run manifest is missing required entries, references a file that does not exist, or has a malformed JSON structure.

Fix: Check that the run directory was not manually modified. Required artefacts are config.source.yaml and config.resolved.yaml, plus input_data_provenance.json for manifest version 3 and 4 runs. diagnostics_bundle.json is optional when loading a run record but must be present for a new run to complete.

Path traversal rejection

Cause: An artefact path in the manifest resolves outside the run directory.

Not fixable by editing the manifest. This is a security check. The manifest was likely corrupted or manually edited with an invalid path.

Performance issues

Pipeline takes very long

MCMC sampling (the 20_model_fit stage) dominates wall-clock time. The meridian-tools orchestration layer adds negligible overhead.

To speed up exploratory runs:

fit:
  n_chains: 2       # Fewer chains (minimum 1)
  n_adapt: 200      # Fewer adaptation steps
  n_burnin: 200     # Fewer burn-in steps
  n_keep: 500       # Fewer kept samples

For production runs, use the defaults or increase these values for better posterior quality.

Out-of-memory during model selection

Log-likelihood reconstruction loads the full posterior into memory and creates a temporary copy of the InferenceData. For large models, peak memory use can therefore double while the copy exists.

Mitigation: Reduce n_keep or n_chains if memory is constrained.

Warnings

ArviZ Pareto k warnings

Estimated shape parameter of Pareto distribution is greater than 0.7 ...

This means the LOO approximation is unreliable for some observations. Check the pointwise pareto_k values in loo_pointwise.csv. Values above 0.7 indicate influential observations.
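The pointwise check can be scripted with the standard library. This sketch assumes loo_pointwise.csv has a header row with a column named pareto_k. The other columns are not assumed, and the helper name high_pareto_k_rows is hypothetical.

```python
from __future__ import annotations

import csv
from typing import TextIO


def high_pareto_k_rows(handle: TextIO, threshold: float = 0.7) -> list[dict[str, str]]:
    """Return the CSV rows whose pareto_k value is above the threshold."""
    reader = csv.DictReader(handle)
    # Assumption: the pointwise file exposes a `pareto_k` column.
    return [row for row in reader if float(row["pareto_k"]) > threshold]
```

Open the run's loo_pointwise.csv with open(path, newline="") and pass the handle in. Each returned row is a dict keyed by the file's own header names.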

Meridian national model auto-zeroing warnings

Hierarchical distribution parameters must be deterministically zero for
national models. eta_orf has been automatically set to Deterministic(0).

This is expected for national (non-geo) models. Meridian automatically zeros out geo-level hierarchical parameters. The warning is informational.

TensorFlow deprecation warnings

These come from TensorFlow and Meridian internals. meridian-tools groups and deduplicates them in the terminal output to reduce noise. They do not indicate a problem with your run.

Reference

Lookup documentation for the CLI, YAML schema, manifest schema, output layout, and related contracts.

Pages

CLI reference — meridian-tools provides a command-line interface with two subcommands: run and demo.
YAML configuration schema reference — This is the complete field-level reference for meridian-tools YAML configuration files. For usage guidance, see the configuration guide.
Manifest schema reference — The run_manifest.json file is the source of truth for every meridian-tools run. It lives at the root of the run directory and records identity, timing, versions, overall status, top-level artefact index, and per-stage records.
Output schema reference — This page documents the complete run directory layout produced by meridian-tools. Every successful pipeline run creates a timestamped directory containing the artefacts described below.
Validation spec schema reference — The validation_spec.json artefact is written to 10_validation/ for every validation-aware pipeline run. It records the concrete validation provenance for that specific run, including the holdout strategy, split geometry, and date windows. Current runs write validation-spec version 2, which also binds the source coordinate fingerprints and the execution prefix used by the split.

CLI reference

meridian-tools provides a command-line interface with two subcommands: run and demo.

Global usage

meridian-tools <subcommand> [options]

`meridian-tools run`

Execute a meridian-tools pipeline run from an authored YAML config.

meridian-tools run --config <path> [--output-dir <dir>] [--run-name <name>] [--traceback]

Arguments

Argument	Required	Default	Description
`--config`	Yes	—	Path to the meridian-tools YAML configuration file.
`--output-dir`	No	`runs`	Directory where dated run folders will be created.
`--run-name`	No	`project.name` from YAML	Optional run name override.
`--traceback`	No	`false`	Show the full Python traceback on failure.

Examples

# Basic run
meridian-tools run --config project.yml

# Custom output directory
meridian-tools run --config project.yml --output-dir output/model_runs

# Named run with traceback on failure
meridian-tools run --config project.yml --run-name client-q1-review --traceback

Exit codes

Code	Meaning
`0`	Pipeline completed successfully.
`1`	Pipeline failed. Error details are printed to stderr. Use `--traceback` for the full stack trace.
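For scripted use, the exit codes above can be checked from Python. This is a minimal sketch that uses only the flags documented for meridian-tools run. The helper names build_run_command and run_pipeline_cli are illustrative and are not part of the package.

```python
from __future__ import annotations

import subprocess
import sys


def build_run_command(
    config: str,
    output_dir: str | None = None,
    run_name: str | None = None,
    traceback: bool = False,
) -> list[str]:
    """Assemble the argument list for `meridian-tools run`."""
    command = ["meridian-tools", "run", "--config", str(config)]
    if output_dir is not None:
        command += ["--output-dir", str(output_dir)]
    if run_name is not None:
        command += ["--run-name", run_name]
    if traceback:
        command.append("--traceback")
    return command


def run_pipeline_cli(config: str, **options) -> bool:
    """Run the CLI and return True on exit code 0, echoing stderr otherwise."""
    completed = subprocess.run(
        build_run_command(config, **options), capture_output=True, text=True
    )
    if completed.returncode != 0:
        print(completed.stderr, file=sys.stderr)
    return completed.returncode == 0
```

Passing the arguments as a list avoids shell quoting problems when the config path or run name contains spaces.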

Failure reporting

The CLI distinguishes five broad failure classes:

config loading or Pydantic validation failures before wrapper preflight
dependency preflight failures before run-directory creation
validation-execution contract failures before run-directory creation
wrapper-owned ConfigPreflightError failures before run-directory creation
PipelineRunFailure after the dated run directory already exists

Dependency preflight covers google-meridian[schema] support and optional plot-export support. Validation-execution contract failures cover incompatible single-run validation requests such as direct rolling_origin execution. Wrapper preflight covers only the closed config/data matrix documented in the configuration guide.

For PipelineRunFailure, the CLI prints the concrete failed run directory, manifest path, and stage name when available so the partial run can be inspected immediately. --traceback still shows the original underlying exception because it is preserved through __cause__.

Validation strategy restrictions

The CLI executes a single pipeline run. Configs with validation.strategy: rolling_origin cannot be run directly from the CLI because they require multiple sequential runs. Use the Python API for rolling-origin workflows.

Configs with strategy: none or strategy: blocked_tail work directly from the CLI.

`meridian-tools demo`

Run one of the bundled reference demos or list available demos.

meridian-tools demo [<name>] [--list] [--output-dir <dir>] [--run-name <name>] [--traceback]

Arguments

Argument	Required	Default	Description
`<name>`	Yes (unless `--list`)	—	Bundled demo name to execute. One of: `timeseries`, `geo_panel`.
`--list`	No	`false`	List supported demos and exit. Cannot combine with a demo name.
`--output-dir`	No	`runs/demos/` (source checkout) or `./runs/demos/` (installed)	Override the output root directory.
`--run-name`	No	None (uses `project.name` from the demo config)	Optional run name override.
`--traceback`	No	`false`	Show the full Python traceback on failure.

Examples

# List available demos
meridian-tools demo --list

# Run the timeseries demo
meridian-tools demo timeseries

# Run with a custom output directory
meridian-tools demo geo_panel --output-dir sandbox/demo-output

# Run with a custom name
meridian-tools demo timeseries --run-name demo-review-q2

Available demos

Name	Description
`timeseries`	National timeseries demo using bundled reference data.
`geo_panel`	Geo-panel demo using bundled reference data.

Both demos exercise the full staged pipeline including response curves and optimisation.

Lightweight import

The CLI is designed for fast startup. Running meridian-tools --help or meridian-tools demo --list does not import TensorFlow, NumPy, Meridian, or ArviZ. Heavy imports are deferred until pipeline execution begins.

Entrypoints

The primary CLI entrypoint is the console script registered in pyproject.toml:

[project.scripts]
meridian-tools = "meridian_tools.cli:main"

The supported module-path equivalent is:

python -m meridian_tools.cli run --config project.yml

The package-level form below is not a supported entrypoint:

python -m meridian_tools run --config project.yml

Source-tree wrapper

When working from the source checkout, runme.py provides equivalent functionality:

python runme.py run --config project.yml --output-dir runs
python runme.py demo timeseries
python runme.py demo --list

See the demo guide for more details on the runme.py wrapper.

YAML configuration schema reference

This is the complete field-level reference for meridian-tools YAML configuration files. For usage guidance, see the configuration guide.

All configuration models use Pydantic extra="forbid" — any key not listed here will produce a validation error.

Top-level structure

project: ProjectConfig         # optional, has defaults
data: CsvDataConfig            # required
model_spec: ModelSpecConfig    # optional, has defaults
fit: FitConfig                 # optional, has defaults
validation: ValidationConfig   # optional, has defaults
exports: ExportsConfig         # optional, has defaults
response_curves: ResponseCurvesConfig | null   # optional
optimisation: OptimisationConfig | null         # optional

`project`

Field	Type	Default	Description
`name`	`str`	`"meridian-project"`	Human-readable project name. Used as the base for run directory names.

`data`

Field	Type	Default	Description
`path`	`Path`	required	Path to CSV data file. Relative paths resolve against the YAML file’s directory.
`kpi_type`	`"revenue"` \| `"non-revenue"`	`"revenue"`	KPI type for Meridian’s data loader.
`coord_to_columns`	`dict[str, Any]`	required	Maps Meridian coordinate names to CSV column names. Must include `time`.
`media_to_channel`	`dict[str, str]` \| `null`	`null`	Optional media-to-channel mapping override.
`media_spend_to_channel`	`dict[str, str]` \| `null`	`null`	Optional media-spend-to-channel mapping override.
`reach_to_channel`	`dict[str, str]` \| `null`	`null`	Optional reach-to-channel mapping override.
`frequency_to_channel`	`dict[str, str]` \| `null`	`null`	Optional frequency-to-channel mapping override.
`rf_spend_to_channel`	`dict[str, str]` \| `null`	`null`	Optional RF-spend-to-channel mapping override.
`organic_reach_to_channel`	`dict[str, str]` \| `null`	`null`	Optional organic-reach-to-channel mapping override.
`organic_frequency_to_channel`	`dict[str, str]` \| `null`	`null`	Optional organic-frequency-to-channel mapping override.

`model_spec`

Field	Type	Default	Description
`kwargs`	`dict[str, Any]`	`{}`	Keyword arguments forwarded directly to Meridian `ModelSpec(**kwargs)`.
`priors`	`PriorsConfig` \| `null`	`null`	YAML-driven media prior configuration.

Supported kwargs keys include any argument accepted by Meridian’s ModelSpec constructor: max_lag, media_prior_type, holdout_id, etc. If holdout_id is present, the run is treated as an authored-holdout validation run.

Array-valued keys (holdout_id, control_population_scaling_id, non_media_population_scaling_id, rf_roi_calibration_period, roi_calibration_period) must be rectangular YAML lists of booleans. Scalars, ragged lists, string booleans such as "true", and numeric stand-ins such as 1/0 are rejected by the wrapper before Meridian model construction. Valid values are converted to NumPy arrays at runtime.

Authored holdout_id arrays are shape-checked after input data loading: national data expects (n_times,), and geo data expects (n_geos, n_times). Authored holdout_id cannot be combined with a runtime PipelineRunConfig.validation_spec.
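The rectangular-boolean rule and the holdout_id shape rule can be illustrated with plain Python. This is a sketch of the checks described above, not the wrapper's own code. Both function names are hypothetical, and rejecting empty lists is an added assumption.

```python
from __future__ import annotations


def boolean_array_shape(value: object) -> tuple[int, ...]:
    """Return the shape of a rectangular list of real booleans, or raise ValueError."""
    if not isinstance(value, list) or not value:
        # Scalars are rejected; rejecting empty lists is an assumption of this sketch.
        raise ValueError("expected a non-empty list of booleans")
    # `type(item) is bool` rejects 1/0, because bool is a subclass of int.
    if all(type(item) is bool for item in value):
        return (len(value),)
    if all(isinstance(item, list) for item in value):
        shapes = {boolean_array_shape(item) for item in value}
        if len(shapes) != 1:
            raise ValueError("ragged lists are rejected")
        return (len(value), *shapes.pop())
    raise ValueError("entries must be booleans, not strings or numbers")


def expected_holdout_shape(n_times: int, n_geos: int | None = None) -> tuple[int, ...]:
    """National data expects (n_times,); geo data expects (n_geos, n_times)."""
    return (n_times,) if n_geos is None else (n_geos, n_times)
```

Comparing boolean_array_shape(holdout_id) with expected_holdout_shape(...) reproduces the shape check in spirit: a two-geo, four-period panel needs a list of two rows with four booleans each.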

Other wrapper-owned array kwargs are also rank- and shape-checked before model construction: roi_calibration_period expects (n_media_times, n_media_channels), rf_roi_calibration_period expects (n_media_times, n_rf_channels), control_population_scaling_id expects (n_controls,), and non_media_population_scaling_id expects (n_non_media_channels,). These kwargs require their corresponding data.coord_to_columns families to be authored.

`model_spec.priors`

Optional. If omitted, Meridian’s default PriorDistribution is used.

This focused YAML surface supports media prior parameters roi_m, mroi_m, and alpha_m.

Each prior parameter can be specified in scalar form:

model_spec:
  priors:
    roi_m:
      distribution: LogNormal
      loc: 0.2
      scale: 0.9

Or as a channel prior with a default distribution and per-channel overrides:

model_spec:
  priors:
    roi_m:
      default:
        distribution: LogNormal
        loc: 0.2
        scale: 0.9
      channels:
        paid_search:
          distribution: TruncatedNormal
          loc: 3.0
          scale: 1.5
          low: 0.0
          high: 6.0

Channel names in channels must match data.media_to_channel values. If no mapping is provided, they must match the raw media column names.

model_spec.priors and model_spec.kwargs.prior are mutually exclusive.

`DistributionSpec`

Field	Type	Description
`distribution`	`"Normal"` \| `"LogNormal"` \| `"TruncatedNormal"` \| `"Beta"`	Required distribution type.
`loc`	`float`	Required for `Normal`, `LogNormal`, and `TruncatedNormal`.
`scale`	`float`	Required and positive for `Normal`, `LogNormal`, and `TruncatedNormal`.
`low`	`float`	Required for `TruncatedNormal`.
`high`	`float`	Required for `TruncatedNormal`; must be greater than `low`.
`concentration0`	`float`	Required and positive for `Beta`.
`concentration1`	`float`	Required and positive for `Beta`.

`fit`

Field	Type	Default	Constraint	Description
`sample_prior_draws`	`PositiveInt` \| `null`	`null`	`>0` if set	Number of prior predictive draws. `null` skips prior sampling.
`n_chains`	`PositiveInt` \| `list[PositiveInt]`	`4`	`>0`	Number of MCMC chains.
`n_adapt`	`PositiveInt`	`500`	`>0`	Adaptation steps per chain.
`n_burnin`	`PositiveInt`	`500`	`>0`	Burn-in steps per chain.
`n_keep`	`PositiveInt`	`1000`	`>0`	Posterior samples to retain per chain.
`seed`	`int` \| `list[int]` \| `null`	`null`	—	RNG seed for reproducibility.
`max_tree_depth`	`PositiveInt`	`10`	`>0`	NUTS maximum tree depth.
`max_energy_diff`	`float`	`500.0`	—	NUTS maximum energy difference.
`unrolled_leapfrog_steps`	`PositiveInt`	`1`	`>0`	NUTS unrolled leapfrog steps.
`parallel_iterations`	`PositiveInt`	`10`	`>0`	TensorFlow parallel iterations.

`validation`

Field	Type	Default	Constraint	Description
`strategy`	`"none"` \| `"blocked_tail"` \| `"rolling_origin"`	`"none"`	—	Validation strategy.
`holdout_size`	`PositiveInt` \| `null`	`null`	Required for `blocked_tail`	Number of tail time periods to hold out.
`initial_train_size`	`PositiveInt` \| `null`	`null`	Required for `rolling_origin`	Initial training window size.
`test_size`	`PositiveInt` \| `null`	`null`	Required for `rolling_origin`	Test window size per split.
`step_size`	`PositiveInt` \| `null`	`null`	Must equal `test_size`	Step between rolling splits. Defaults to `test_size`.
`max_splits`	`PositiveInt` \| `null`	`null`	`>=2` if set	Maximum number of rolling splits.

Cross-field validation rules

strategy: none rejects all holdout and rolling-origin parameters.
strategy: blocked_tail requires holdout_size, rejects rolling-origin parameters.
strategy: rolling_origin requires initial_train_size and test_size, rejects holdout_size.
holdout_size without an explicit strategy is rejected (legacy shorthand removed).
Rolling-origin parameters without strategy: rolling_origin are rejected.
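The rules above can be restated as a small checker over the parsed validation block. This is an illustrative sketch in plain Python, not the Pydantic model the package uses. The function name is hypothetical, and treating step_size and max_splits as rolling-origin parameters is an assumption based on the field table.

```python
from __future__ import annotations

# Assumption: these four fields count as "rolling-origin parameters".
ROLLING_FIELDS = ("initial_train_size", "test_size", "step_size", "max_splits")


def validation_rule_violations(block: dict) -> list[str]:
    """List the cross-field rules that a parsed `validation` block breaks."""
    strategy = block.get("strategy")  # None means the key was not authored
    has_holdout = block.get("holdout_size") is not None
    rolling_set = [name for name in ROLLING_FIELDS if block.get(name) is not None]
    problems: list[str] = []
    if strategy is None and has_holdout:
        problems.append("holdout_size without an explicit strategy")
    if strategy != "rolling_origin" and rolling_set:
        problems.append(f"rolling-origin parameters without rolling_origin: {rolling_set}")
    if strategy == "none" and has_holdout:
        problems.append("strategy: none rejects holdout_size")
    if strategy == "blocked_tail" and not has_holdout:
        problems.append("strategy: blocked_tail requires holdout_size")
    if strategy == "rolling_origin":
        if has_holdout:
            problems.append("strategy: rolling_origin rejects holdout_size")
        for name in ("initial_train_size", "test_size"):
            if block.get(name) is None:
                problems.append(f"strategy: rolling_origin requires {name}")
    return problems
```

An empty result means the block satisfies every rule listed. Each string in a non-empty result names one broken rule.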

`exports`

Field	Type	Default	Description
`use_kpi`	`bool`	`false`	Use KPI-based metrics in Meridian analysis surfaces.
`batch_size`	`PositiveInt`	`1000`	Batch size for Meridian `Analyzer` computations.
`export_predictive_accuracy`	`bool`	`true`	Write `predictive_accuracy.csv`.
`export_review_summary`	`bool`	`true`	Write `review_summary.json`.
`export_model_selection`	`bool`	`true`	Write LOO/WAIC outputs (when compatible).
`export_plots`	`bool`	`true`	Write PNG plot artefacts in each stage.

`response_curves`

Optional section. If omitted or null, the response curves stage is skipped.

Field	Type	Default	Constraint	Description
`spend_multipliers`	`list[float]`	required	Non-empty, all `>=0`	Spend multiplier grid for response curve computation.
`use_posterior`	`bool`	`true`	—	Use posterior (vs prior) for response curves.
`by_reach`	`bool`	`true`	—	Compute reach-based response curves.
`use_optimal_frequency`	`bool`	`false`	—	Use optimal frequency in computation.
`confidence_level`	`float`	`0.9`	`0 < x < 1`	Confidence level for credible intervals.

`optimisation`

Optional section. If omitted or null, the optimisation stage is skipped.

Field	Type	Default	Constraint	Description
`start_date`	`str`	required	ISO `YYYY-MM-DD`	Start of the optimisation window.
`end_date`	`str`	required	ISO `YYYY-MM-DD`, `>= start_date`	End of the optimisation window.
`budget`	`OptimisationBudgetConfig`	required	—	Budget specification (see below).
`use_posterior`	`bool`	`true`	—	Use posterior (vs prior) for optimisation.
`use_optimal_frequency`	`bool`	`true`	—	Use optimal frequency in optimisation.
`confidence_level`	`float`	`0.9`	`0 < x < 1`	Confidence level for credible intervals.

`optimisation.budget`

Field	Type	Default	Constraint	Description
`mode`	`"fixed_total"` \| `"relative_reference_window_total"`	required	—	Budget mode.
`value`	`PositiveFloat`	required	`>0`	Budget value. Absolute for `fixed_total`, multiplier for `relative_reference_window_total`.

When mode: relative_reference_window_total, the effective budget is value × total_spend_in_reference_window. The reference window is defined by start_date and end_date.
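As a worked example of the two budget modes, the sketch below applies the formula stated above. The function name effective_budget and the reference_window_spend argument are illustrative; how the package computes the reference-window total is not shown here.

```python
from __future__ import annotations


def effective_budget(
    mode: str, value: float, reference_window_spend: float | None = None
) -> float:
    """Return the absolute budget implied by an optimisation.budget block."""
    if value <= 0:
        raise ValueError("value must be positive")
    if mode == "fixed_total":
        return value  # value is already an absolute budget
    if mode == "relative_reference_window_total":
        if reference_window_spend is None:
            raise ValueError("reference_window_spend is required for the relative mode")
        return value * reference_window_spend  # value acts as a multiplier
    raise ValueError(f"unknown budget mode: {mode}")
```

With value 1.25 and a reference-window spend of 200000, the relative mode yields an effective budget of 250000.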

Manifest schema reference

The run_manifest.json file is the source of truth for every meridian-tools run. It lives at the root of the run directory and records identity, timing, versions, overall status, top-level artefact index, and per-stage records.

Current version

The current manifest version is 4. Versions 0, 1, 2, and 3 are supported for backward compatibility when loading older run directories.

Top-level fields

Field	Type	Description
`manifest_version`	`int`	Schema version (0, 1, 2, 3, or 4).
`run_name`	`str`	Human-readable run name.
`config_path`	`str`	Path to the source YAML used for this run. For refresh runs this points to the source run’s archived `config.source.yaml`.
`output_dir`	`str`	Path to the run directory.
`started_at`	`str`	UTC ISO-8601 timestamp when the run began.
`status`	`str`	Overall run status: `"running"`, `"completed"`, or `"failed"`.
`finished_at`	`str` \| `null`	UTC ISO-8601 timestamp when the run finished. `null` while running.
`meridian_tools_version`	`str`	Version of `meridian-tools` that produced the run.
`meridian_version`	`str` \| `null`	Version of Google Meridian used. `null` if not yet recorded.
`artifacts`	`dict[str, str]`	Top-level artefact index. Key artefacts from stages are promoted here for quick lookup.
`stages`	`list[StageRecord]`	Ordered list of pipeline stage records (including skipped and failed stages).

Top-level `artifacts` index

The runner promotes key artefacts into the top-level artifacts dictionary after each stage completes. Promoted artefact names include:

config_source, config_resolved, input_data_provenance (from 00_run_metadata)
validation_spec (from 10_validation)
meridian_model (from 20_model_fit)
diagnostics_bundle, model_results_summary (from 30_model_assessment)
summary_metrics_csv, summary_metrics_nc (from 40_decomposition)

This index provides flat access to important artefacts without walking the stages array.

`StageRecord` fields

Each entry in the stages array represents one pipeline stage. Stages can have any of four statuses: "running", "completed", "skipped", or "failed".

Field	Type	Description
`name`	`str`	Stage identifier (for example, `"00_run_metadata"`, `"20_model_fit"`).
`status`	`str`	Stage status: `"running"`, `"completed"`, `"skipped"`, or `"failed"`.
`started_at`	`str` \| `null`	UTC ISO-8601 timestamp when the stage began.
`finished_at`	`str` \| `null`	UTC ISO-8601 timestamp when the stage finished.
`elapsed_seconds`	`float` \| `null`	Wall-clock seconds for stage execution.
`message`	`str` \| `null`	Human-readable message. Present for skipped stages (reason) and failed stages (error).
`artifacts`	`dict[str, str]`	Map of artefact names to relative file paths. Empty for skipped stages.

Artefact path convention

All artefact paths in manifests written by run_pipeline are relative to the run directory and must resolve to existing regular files inside that directory. Producers reject absolute paths, lexical .. components, paths that resolve outside the run directory, missing paths, directories, and special files. Internal symlinks are accepted only when they resolve to regular files inside the run directory.

This makes run directories portable across machines and file systems. When you load a run record through load_run_record, the lifecycle layer resolves relative paths to absolute paths against the run directory.
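The path rules above can be approximated with pathlib. This is a minimal sketch of such a check and not the package's implementation. The name resolve_artifact is hypothetical, and Path.is_relative_to requires Python 3.9 or later.

```python
from __future__ import annotations

from pathlib import Path, PurePosixPath


def resolve_artifact(run_dir: Path, relative: str) -> Path:
    """Resolve a manifest artefact path, rejecting anything outside the run directory."""
    candidate = PurePosixPath(relative)
    if candidate.is_absolute() or Path(relative).is_absolute():
        raise ValueError("absolute artefact paths are rejected")
    if ".." in candidate.parts:
        raise ValueError("'..' components are rejected")
    root = run_dir.resolve()
    # resolve() follows symlinks, so a link that escapes the run directory is caught below.
    resolved = (root / relative).resolve()
    if not resolved.is_relative_to(root):
        raise ValueError("artefact path resolves outside the run directory")
    # is_file() is False for missing paths, directories, and special files.
    if not resolved.is_file():
        raise ValueError("artefact path must be an existing regular file")
    return resolved
```

The returned value is an absolute path, which mirrors how a loaded run record exposes artefact locations.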

Example stage record:

{
  "name": "30_model_assessment",
  "status": "completed",
  "started_at": "2026-04-02T07:40:30+00:00",
  "finished_at": "2026-04-02T07:41:00+00:00",
  "elapsed_seconds": 30.1,
  "message": null,
  "artifacts": {
    "diagnostics_bundle": "30_model_assessment/diagnostics_bundle.json",
    "review_summary": "30_model_assessment/review_summary.json",
    "model_results_summary": "30_model_assessment/model_results_summary.html"
  }
}

Stage names and ordering

All seven stages are always recorded in execution order. Stages that do not apply to a given run are recorded with status: "skipped".

Stage name	Number	Skippable	Description
`00_run_metadata`	00	No	Config archival and input-data provenance capture.
`10_validation`	10	Yes	Validation spec (skipped when no validation applies).
`20_model_fit`	20	No	Meridian model fitting.
`30_model_assessment`	30	No	Diagnostics, model selection.
`40_decomposition`	40	No	Media decomposition metrics.
`60_response_curves`	60	Yes	Response curves (skipped when the config section is absent).
`70_optimisation`	70	Yes	Budget optimisation (skipped when the config section is absent).

The numbering gap at 50 is intentional, reserving space for future stages.

Required artefacts

The lifecycle loader requires the following top-level artefacts to be present in the manifest for a run to be loadable:

config_source (promoted from 00_run_metadata)
config_resolved (promoted from 00_run_metadata)
input_data_provenance (promoted from 00_run_metadata) for manifest version 3 and 4 runs

These are enforced by _require_manifest_artifact in load_run_record. If a required entry is missing, a LifecycleError is raised.

At run completion time, new manifest version 4 runs also validate the full REQUIRED_MANIFEST_ARTIFACTS set: config_source, config_resolved, input_data_provenance, and diagnostics_bundle. The lifecycle loader still treats diagnostics_bundle as optional so older or partial runs can be loaded without it; if it is absent, RunRecord.diagnostics_bundle_path is None.

Input-data provenance payload

Manifest version 3 introduced 00_run_metadata/input_data_provenance.json. This file records the pinned Phase 09 input-data contract:

provenance_version
authored_path
resolved_path
sha256
size_bytes
mtime_utc
row_count
column_count
ordered_columns

The lifecycle compare surface uses these fields to distinguish real dataset changes from older runs whose manifests predate provenance capture.
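To make the payload concrete, the sketch below derives most of the listed fields for a CSV using the standard library. It is an approximation under stated assumptions: row_count is taken to exclude the header row, mtime_utc is rendered as an ISO-8601 string, and provenance_version is omitted because its value is not documented here. The function name describe_input_csv is hypothetical.

```python
from __future__ import annotations

import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def describe_input_csv(authored_path: str, base_dir: Path) -> dict:
    """Build a provenance-style record for a CSV resolved against base_dir."""
    resolved = (base_dir / authored_path).resolve()
    payload = resolved.read_bytes()
    with resolved.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    if not rows:
        raise ValueError("the CSV has no header row")
    header, data_rows = rows[0], rows[1:]
    stat = resolved.stat()
    return {
        "authored_path": authored_path,
        "resolved_path": str(resolved),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "size_bytes": stat.st_size,
        "mtime_utc": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
        "row_count": len(data_rows),  # assumption: the header row is not counted
        "column_count": len(header),
        "ordered_columns": header,
    }
```

Two records with different sha256 values describe different file contents, which is the kind of real dataset change the compare surface is meant to detect.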

Example manifest

{
  "manifest_version": 4,
  "run_name": "my-project_blocked_tail",
  "config_path": "/workspace/configs/project.yml",
  "output_dir": "/workspace/runs/my-project_blocked_tail_20260402_073500",
  "started_at": "2026-04-02T07:35:00+00:00",
  "status": "completed",
  "finished_at": "2026-04-02T07:42:15+00:00",
  "meridian_tools_version": "0.4.0",
  "meridian_version": "1.5.3",
  "artifacts": {
    "config_source": "00_run_metadata/config.source.yaml",
    "config_resolved": "00_run_metadata/config.resolved.yaml",
    "input_data_provenance": "00_run_metadata/input_data_provenance.json",
    "validation_spec": "10_validation/validation_spec.json",
    "meridian_model": "20_model_fit/meridian_model.binpb",
    "diagnostics_bundle": "30_model_assessment/diagnostics_bundle.json",
    "model_results_summary": "30_model_assessment/model_results_summary.html",
    "summary_metrics_csv": "40_decomposition/summary_metrics.csv",
    "summary_metrics_nc": "40_decomposition/summary_metrics.nc"
  },
  "stages": [
    {
      "name": "00_run_metadata",
      "status": "completed",
      "started_at": "2026-04-02T07:35:00+00:00",
      "finished_at": "2026-04-02T07:35:01+00:00",
      "elapsed_seconds": 0.5,
      "message": null,
      "artifacts": {
        "config_source": "00_run_metadata/config.source.yaml",
        "config_resolved": "00_run_metadata/config.resolved.yaml",
        "input_data_provenance": "00_run_metadata/input_data_provenance.json"
      }
    },
    {
      "name": "10_validation",
      "status": "completed",
      "started_at": "2026-04-02T07:35:01+00:00",
      "finished_at": "2026-04-02T07:35:01+00:00",
      "elapsed_seconds": 0.1,
      "message": null,
      "artifacts": {
        "validation_spec": "10_validation/validation_spec.json"
      }
    },
    {
      "name": "20_model_fit",
      "status": "completed",
      "started_at": "2026-04-02T07:35:01+00:00",
      "finished_at": "2026-04-02T07:40:30+00:00",
      "elapsed_seconds": 329.0,
      "message": null,
      "artifacts": {
        "meridian_model": "20_model_fit/meridian_model.binpb",
        "fit_metadata": "20_model_fit/fit_metadata.json",
        "prior_distributions": "20_model_fit/prior_distributions.json"
      }
    },
    {
      "name": "30_model_assessment",
      "status": "completed",
      "started_at": "2026-04-02T07:40:30+00:00",
      "finished_at": "2026-04-02T07:41:00+00:00",
      "elapsed_seconds": 30.1,
      "message": null,
      "artifacts": {
        "diagnostics_bundle": "30_model_assessment/diagnostics_bundle.json",
        "review_summary": "30_model_assessment/review_summary.json",
        "model_results_summary": "30_model_assessment/model_results_summary.html",
        "model_selection_status": "30_model_assessment/model_selection_status.json"
      }
    },
    {
      "name": "40_decomposition",
      "status": "completed",
      "started_at": "2026-04-02T07:41:00+00:00",
      "finished_at": "2026-04-02T07:42:00+00:00",
      "elapsed_seconds": 60.0,
      "message": null,
      "artifacts": {
        "summary_metrics_nc": "40_decomposition/summary_metrics.nc",
        "summary_metrics_csv": "40_decomposition/summary_metrics.csv"
      }
    },
    {
      "name": "60_response_curves",
      "status": "skipped",
      "started_at": "2026-04-02T07:42:00+00:00",
      "finished_at": "2026-04-02T07:42:00+00:00",
      "elapsed_seconds": 0.0,
      "message": "No `response_curves` section was authored in the YAML config.",
      "artifacts": {}
    },
    {
      "name": "70_optimisation",
      "status": "skipped",
      "started_at": "2026-04-02T07:42:00+00:00",
      "finished_at": "2026-04-02T07:42:00+00:00",
      "elapsed_seconds": 0.0,
      "message": "No `optimisation` section was authored in the YAML config.",
      "artifacts": {}
    }
  ]
}
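As a rough illustration of how the per-stage fields can be consumed, the sketch below summarises a manifest that has already been parsed with `json`. It relies only on the `stages`, `name`, `status`, `elapsed_seconds`, and `message` keys shown in the example above. The helper name `summarise_stages` is hypothetical and is not part of meridian-tools.

```python
import json


def summarise_stages(manifest: dict) -> list:
    """Return one human-readable line per stage in a parsed run manifest."""
    lines = []
    for stage in manifest.get("stages", []):
        line = f"{stage['name']}: {stage['status']}"
        elapsed = stage.get("elapsed_seconds")  # may be absent in older manifests
        if elapsed is not None:
            line += f" ({elapsed:.1f}s)"
        if stage.get("message"):
            line += f" - {stage['message']}"
        lines.append(line)
    return lines


# A trimmed-down manifest with the same stage shape as the example above.
sample = json.loads("""
{
  "stages": [
    {"name": "20_model_fit", "status": "completed",
     "elapsed_seconds": 329.0, "message": null},
    {"name": "60_response_curves", "status": "skipped",
     "elapsed_seconds": 0.0,
     "message": "No `response_curves` section was authored in the YAML config."}
  ]
}
""")

for line in summarise_stages(sample):
    print(line)
```

Reading `elapsed_seconds` with `.get` keeps the helper usable on manifests written before the per-stage timing fields existed.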

Version history

Version 4 (current)

Added fail-closed artefact integrity checks for completed runs. New runs must promote and validate config_source, config_resolved, input_data_provenance, and diagnostics_bundle before completion. Manifest artefact paths must resolve to regular files inside the run directory.

Version 3

Added input_data_provenance.json and made provenance available to lifecycle loading and compare surfaces.

Version 2

Added export_plots support, top-level artifacts index, status field, config_path, output_dir, and per-stage status, elapsed_seconds, and message fields.

Version 1

Added meridian_version field and response_curves / optimisation stages.

Version 0

Initial manifest schema with core stages and artefact tracking.

All five versions (0 through 4) are supported by RunManifest.from_dict. Fields missing from older versions are filled with defaults.
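The version 4 rule that manifest artefact paths must resolve to regular files inside the run directory can be pictured with a small standard-library check. This is only a sketch of the idea, not the check meridian-tools performs; the function name `artefact_path_is_valid` and the demo file names are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def artefact_path_is_valid(run_dir: Path, relative_path: str) -> bool:
    """Return True when the path stays inside run_dir and is a regular file."""
    run_root = run_dir.resolve()
    candidate = (run_root / relative_path).resolve()
    # resolve() collapses ".." and follows symlinks, so an escaping path no
    # longer sits under run_root once it has been resolved.
    return candidate.is_relative_to(run_root) and candidate.is_file()


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "demo_20260402_073500"  # made-up run directory name
    (run_dir / "00_run_metadata").mkdir(parents=True)
    (run_dir / "00_run_metadata" / "config.source.yaml").write_text("project: {}\n")
    (Path(tmp) / "outside.txt").write_text("not part of the run\n")

    inside_ok = artefact_path_is_valid(run_dir, "00_run_metadata/config.source.yaml")
    escape_ok = artefact_path_is_valid(run_dir, "../outside.txt")
    missing_ok = artefact_path_is_valid(run_dir, "20_model_fit/fit_metadata.json")

print(inside_ok, escape_ok, missing_ok)
```

A fail-closed check of this kind treats a missing file and an escaping path the same way: both make the artefact entry invalid.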

Output schema reference

This page documents the complete run directory layout produced by meridian-tools. Every successful pipeline run creates a timestamped directory containing the artefacts described below.

Run directory structure

<run_name>_<YYYYMMDD_HHMMSS>/
│
├── run_manifest.json                        # Source of truth for the run
│
├── 00_run_metadata/
│   ├── config.source.yaml                   # Verbatim copy of the authored YAML
│   ├── config.resolved.yaml                 # YAML after path resolution
│   └── input_data_provenance.json           # Pinned source/resolution/hash metadata
│
├── 10_validation/                           # Only for validation-aware runs
│   └── validation_spec.json                 # Validation provenance record
│
├── 20_model_fit/
│   ├── meridian_model.binpb                 # Serialised Meridian model
│   ├── fit_metadata.json                    # Fit settings and Meridian version
│   └── prior_distributions.json             # Applied Meridian prior distributions
│
├── 30_model_assessment/
│   ├── diagnostics_bundle.json              # Diagnostics export manifest
│   ├── predictive_accuracy.csv              # Per-observation accuracy metrics
│   ├── review_summary.json                  # Meridian review battery results
│   ├── model_results_summary.html           # Meridian HTML summary report
│   ├── plots/                               # When export_plots: true
│   │   ├── model_fit.png
│   │   └── rhat_boxplot.png
│   │
│   │  # Model selection outputs (compatible runs):
│   ├── loo_summary.json                     # LOO summary statistics
│   ├── waic_summary.json                    # WAIC summary statistics
│   ├── loo_pointwise.csv                    # Per-observation LOO + Pareto k
│   ├── waic_pointwise.csv                   # Per-observation WAIC
│   ├── model_comparison.csv                 # Ranked comparison table
│   ├── model_selection_warnings.json        # Warning details, when emitted
│   │
│   │  # Model selection status (unavailable/degraded runs):
│   └── model_selection_status.json          # Reason code and exception details
│
├── 40_decomposition/
│   ├── summary_metrics.nc                   # NetCDF decomposition dataset
│   ├── summary_metrics.csv                  # Tabular decomposition
│   └── plots/                               # When export_plots: true
│       ├── channel_contribution_area_chart.png
│       ├── contribution_waterfall_chart.png
│       ├── spend_vs_contribution_chart.png
│       └── roi_bar_chart.png
│
├── 60_response_curves/                      # Only when response_curves configured
│   ├── response_curves.nc                   # NetCDF response curve dataset
│   ├── response_curves.csv                  # Tabular response curves
│   └── plots/                               # When export_plots: true
│       └── response_curves_plot.png
│
└── 70_optimisation/                         # Only when optimisation configured
    ├── optimisation_summary.html            # Meridian optimisation HTML report
    ├── optimised_data.nc                    # Optimised allocation (NetCDF)
    ├── optimised_data.csv                   # Optimised allocation (CSV)
    ├── nonoptimised_data.nc                 # Baseline allocation (NetCDF)
    ├── nonoptimised_data.csv                # Baseline allocation (CSV)
    ├── optimisation_grid.csv                # Full optimisation grid
    └── plots/                               # When export_plots: true
        ├── incremental_outcome_delta_plot.png
        ├── budget_allocation_optimised_plot.png
        ├── budget_allocation_nonoptimised_plot.png
        ├── spend_delta_plot.png
        └── optimisation_response_curves_plot.png
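Because run directories follow the `<run_name>_<YYYYMMDD_HHMMSS>` pattern shown at the top of the tree, the timestamp can be recovered from the directory name alone. The sketch below is one way to do that with the standard library. `parse_run_dir_name` is a hypothetical helper, the directory names are made up, and the timestamp is treated as naive because the layout above does not state a timezone.

```python
import re
from datetime import datetime

# Greedy run-name match, then the fixed-width timestamp suffix.
RUN_DIR_PATTERN = re.compile(r"^(?P<run_name>.+)_(?P<stamp>\d{8}_\d{6})$")


def parse_run_dir_name(name: str) -> tuple:
    """Split '<run_name>_<YYYYMMDD_HHMMSS>' into (run_name, datetime)."""
    match = RUN_DIR_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a timestamped run directory name: {name!r}")
    stamp = datetime.strptime(match["stamp"], "%Y%m%d_%H%M%S")
    return match["run_name"], stamp


names = ["demo_20260402_073500", "demo_blocked_tail_20260402_081500"]
latest = max(names, key=lambda n: parse_run_dir_name(n)[1])
print(latest)
```

Anchoring the timestamp at the end of the name means run names that themselves contain underscores still parse correctly.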

Stage details

`00_run_metadata`

Always present. Created first.

Artefact	Format	Description
`config.source.yaml`	YAML	Verbatim copy of the source config for this run. On refresh, this is copied from the source run’s archived `config.source.yaml`.
`config.resolved.yaml`	YAML	Config after relative path resolution. Does not include runtime-only fields (`output_dir`, `run_name`).
`input_data_provenance.json`	JSON	Pinned input-data provenance: authored path, resolved path, SHA-256, file size, mtime, row count, column count, and ordered columns.
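To make the provenance fields concrete, the sketch below derives a comparable record from a CSV file using only the standard library. The key names in the returned dictionary are illustrative assumptions and may not match the keys meridian-tools writes to `input_data_provenance.json`. Counting data rows without the header is also an assumption.

```python
import csv
import hashlib
import tempfile
from pathlib import Path


def describe_csv(path: Path) -> dict:
    """Collect hash, size, and shape metadata for one CSV file."""
    raw = path.read_bytes()
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        row_count = sum(1 for _ in reader)  # data rows, header excluded
    return {
        "resolved_path": str(path.resolve()),
        "sha256": hashlib.sha256(raw).hexdigest(),
        "file_size_bytes": len(raw),
        "mtime": path.stat().st_mtime,
        "row_count": row_count,
        "column_count": len(header),
        "columns": header,  # kept in file order
    }


content = b"time,kpi,tv_spend\n2024-01-01,10,5\n2024-01-08,12,6\n"
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "data.csv"
    csv_path.write_bytes(content)
    record = describe_csv(csv_path)

print(record["row_count"], record["column_count"], record["columns"])
```

Hashing the raw bytes rather than the parsed rows means any change to the file, including whitespace or line endings, produces a different digest.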

`10_validation`

Present only for validation-aware runs (blocked tail, rolling origin, authored holdout, or final fit with validation provenance).

Artefact	Format	Description
`validation_spec.json`	JSON	Full validation provenance. See validation-spec-schema.md.

`20_model_fit`

Always present.

Artefact	Format	Description
`meridian_model.binpb`	Protocol Buffers	Serialised Meridian model (requires `google-meridian[schema]`).
`fit_metadata.json`	JSON	Records `FitConfig` values and Meridian version.
`prior_distributions.json`	JSON	Applied Meridian prior distributions after construction and broadcasting.

`30_model_assessment`

Always present. Contents vary with the export flags and with whether the run is compatible with model selection.

Artefact	Format	Condition	Description
`diagnostics_bundle.json`	JSON	Always	Diagnostics export manifest with status of each sub-export.
`predictive_accuracy.csv`	CSV	`export_predictive_accuracy: true`	Predictive accuracy per observation.
`review_summary.json`	JSON	`export_review_summary: true`	Meridian review battery results.
`model_results_summary.html`	HTML	Always	Meridian HTML model summary.
`plots/model_fit.png`	PNG	`export_plots: true`	Model fit visualisation.
`plots/rhat_boxplot.png`	PNG	`export_plots: true`	R-hat convergence diagnostic boxplot.
`loo_summary.json`	JSON	Compatible + `export_model_selection: true`	LOO summary.
`waic_summary.json`	JSON	Compatible + `export_model_selection: true`	WAIC summary.
`loo_pointwise.csv`	CSV	Compatible + `export_model_selection: true`	Per-observation LOO values.
`waic_pointwise.csv`	CSV	Compatible + `export_model_selection: true`	Per-observation WAIC values.
`model_comparison.csv`	CSV	Compatible + `export_model_selection: true`	Ranked model comparison.
`model_selection_warnings.json`	JSON	Compatible with warnings + `export_model_selection: true`	Captured warning category/message/step and result flags.
`model_selection_status.json`	JSON	Unavailable/degraded + `export_model_selection: true`	Reason code, warning details, and optional exception type.

`40_decomposition`

Always present. The plots/ files are written only when export_plots is true.

Artefact	Format	Description
`summary_metrics.nc`	NetCDF	Full decomposition dataset with coordinates.
`summary_metrics.csv`	CSV	Flattened tabular decomposition.
`plots/channel_contribution_area_chart.png`	PNG	Channel contribution over time.
`plots/contribution_waterfall_chart.png`	PNG	Contribution waterfall breakdown.
`plots/spend_vs_contribution_chart.png`	PNG	Spend vs. contribution scatter.
`plots/roi_bar_chart.png`	PNG	ROI by channel bar chart.

`60_response_curves`

Present only when the response_curves YAML section is configured. The plots/ file is written only when export_plots is true.

Artefact	Format	Description
`response_curves.nc`	NetCDF	Response curve dataset across spend multipliers.
`response_curves.csv`	CSV	Flattened tabular response curves.
`plots/response_curves_plot.png`	PNG	Response curve visualisation.

`70_optimisation`

Present only when the optimisation YAML section is configured. The plots/ files are written only when export_plots is true.

Artefact	Format	Description
`optimisation_summary.html`	HTML	Meridian optimisation summary report.
`optimised_data.nc`	NetCDF	Optimised budget allocation.
`optimised_data.csv`	CSV	Tabular optimised allocation.
`nonoptimised_data.nc`	NetCDF	Baseline (non-optimised) allocation.
`nonoptimised_data.csv`	CSV	Tabular baseline allocation.
`optimisation_grid.csv`	CSV	Full optimisation grid dataset.
`plots/incremental_outcome_delta_plot.png`	PNG	Incremental outcome delta.
`plots/budget_allocation_optimised_plot.png`	PNG	Optimised allocation chart.
`plots/budget_allocation_nonoptimised_plot.png`	PNG	Baseline allocation chart.
`plots/spend_delta_plot.png`	PNG	Spend delta between optimised and baseline.
`plots/optimisation_response_curves_plot.png`	PNG	Optimisation response curves.

Reading order for analysts

For a quick assessment of a completed run:

run_manifest.json — run identity, timing, stage completion
00_run_metadata/config.source.yaml — what was authored
00_run_metadata/input_data_provenance.json — dataset identity and shape
30_model_assessment/diagnostics_bundle.json — diagnostics export state
30_model_assessment/model_results_summary.html — visual model summary
40_decomposition/summary_metrics.csv — easiest tabular output to inspect

For model selection:

30_model_assessment/loo_summary.json or model_selection_status.json

For scenario analysis:

60_response_curves/response_curves.csv
70_optimisation/optimisation_summary.html
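The reading order above can be turned into a small presence check for a run directory. The sketch below is an assumption-laden convenience, not a meridian-tools API: `missing_quick_look_files` and `model_selection_entry_point` are hypothetical names, and the relative paths are copied from the lists above.

```python
import tempfile
from pathlib import Path
from typing import Optional

READING_ORDER = (
    "run_manifest.json",
    "00_run_metadata/config.source.yaml",
    "00_run_metadata/input_data_provenance.json",
    "30_model_assessment/diagnostics_bundle.json",
    "30_model_assessment/model_results_summary.html",
    "40_decomposition/summary_metrics.csv",
)


def missing_quick_look_files(run_dir: Path) -> list:
    """List the quick-assessment files that are absent from run_dir."""
    return [rel for rel in READING_ORDER if not (run_dir / rel).is_file()]


def model_selection_entry_point(run_dir: Path) -> Optional[Path]:
    """Prefer loo_summary.json, fall back to model_selection_status.json."""
    for name in ("loo_summary.json", "model_selection_status.json"):
        candidate = run_dir / "30_model_assessment" / name
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "30_model_assessment").mkdir()
    (run_dir / "run_manifest.json").write_text("{}")
    (run_dir / "30_model_assessment" / "model_selection_status.json").write_text("{}")
    missing = missing_quick_look_files(run_dir)
    entry = model_selection_entry_point(run_dir)
    entry_name = entry.name if entry else None

print(missing)
print(entry_name)
```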

Validation spec schema reference

The validation_spec.json artefact is written to 10_validation/ for every validation-aware pipeline run. It records the concrete validation provenance for that specific run, including the holdout strategy, split geometry, and date windows. Current runs write validation-spec version 2, which also binds the source coordinate fingerprints and the execution prefix used by the split.

Fields

Field	Type	Description
`mode`	`"validation"` \| `"final_fit"`	Whether this is a validation split or the final production fit.
`strategy`	`"none"` \| `"blocked_tail"` \| `"rolling_origin"` \| `"authored_holdout"`	Validation strategy that produced this run.
`split_label`	`str`	Human-readable identifier for the split (e.g. `"blocked_tail"`, `"split_01"`, `"final_fit"`).
`holdout_source`	`"generated_validation"` \| `"authored_model_spec"` \| `"none"`	How the holdout mask was produced.
`generated_holdout`	`bool`	Whether the holdout mask was auto-generated by `meridian-tools`.
`run_name_suffix`	`str`	Suffix appended to the run name for this split.
`holdout_shape`	`list[int]` \| `null`	Shape of the holdout mask array. `null` for final-fit runs.
`train_indices`	`list[int]`	Integer indices into the time axis used for training.
`test_indices`	`list[int]`	Integer indices into the time axis used for testing. Empty for final-fit runs.
`train_dates`	`list[str]`	Date values corresponding to `train_indices`.
`test_dates`	`list[str]`	Date values corresponding to `test_indices`. Empty for final-fit runs.
`validation_spec_version`	`int`	Version of the validation-spec schema. Current value is `2`.
`data_binding`	`object` \| `null`	Version-2 coordinate binding. Present for generated validation and final-fit runs, `null` for authored-holdout runs.

Version-2 `data_binding`

data_binding records the source coordinate fingerprints and the execution prefix used by a generated split. Rolling-origin validation fits only through the end of each split’s test window, not through future observations. The binding lets refresh execution reject source data whose coordinates no longer match the stored split.

Field	Type	Description
`coordinate_canonicalization_version`	`int`	Coordinate string canonicalisation version. Current value is `1`.
`source_time_count`	`int`	Number of time coordinates in the source dataset.
`source_time_sha256`	`str`	SHA-256 digest of all source time coordinates.
`execution_time_count`	`int`	Number of time coordinates visible to this fit.
`execution_time_sha256`	`str`	SHA-256 digest of the execution prefix.
`execution_end_date`	`str`	Last date in the execution prefix.
`geo_mode`	`"national"` \| `"geo"`	Whether the model is national or geo-indexed.
`geo_count`	`int`	Number of geos, or `0` for national data.
`geo_sha256`	`str` \| `null`	SHA-256 digest of geo coordinates, or `null` for national data.

Mode and strategy combinations

Mode	Strategy	Holdout source	Description
`validation`	`blocked_tail`	`generated_validation`	Auto-generated contiguous tail holdout.
`validation`	`rolling_origin`	`generated_validation`	One split from an expanding-window plan.
`validation`	`authored_holdout`	`authored_model_spec`	User-provided holdout mask from YAML.
`final_fit`	`none`	`none`	Full-sample production fit after validation.

Invariants

Validation-mode specs always have a non-null holdout_shape.
Final-fit specs always have holdout_shape: null, empty test_indices, and empty test_dates.
train_indices and train_dates always have matching lengths.
test_indices and test_dates always have matching lengths.
Authored-holdout specs have empty train_indices, test_indices, train_dates, and test_dates.
Version-2 generated validation and final-fit specs have non-null data_binding.
Rolling-origin split specs set execution_time_count to the end of that split’s test window. Later source observations are not visible to that fit.
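The invariants above translate fairly directly into a checker over a parsed `validation_spec.json` dictionary. The sketch below is one possible reading of them, using only field names from the tables on this page. `check_validation_spec` is a hypothetical helper and the sample spec is invented for the demonstration.

```python
def check_validation_spec(spec: dict) -> list:
    """Return the violated invariants (empty when the spec is consistent)."""
    problems = []
    if len(spec["train_indices"]) != len(spec["train_dates"]):
        problems.append("train_indices and train_dates differ in length")
    if len(spec["test_indices"]) != len(spec["test_dates"]):
        problems.append("test_indices and test_dates differ in length")
    if spec["mode"] == "validation" and spec["holdout_shape"] is None:
        problems.append("validation spec has null holdout_shape")
    if spec["mode"] == "final_fit":
        if spec["holdout_shape"] is not None:
            problems.append("final-fit spec has non-null holdout_shape")
        if spec["test_indices"] or spec["test_dates"]:
            problems.append("final-fit spec has test indices or dates")
    keys = ("train_indices", "test_indices", "train_dates", "test_dates")
    binding = spec.get("data_binding")
    if spec["strategy"] == "authored_holdout":
        if any(spec[key] for key in keys):
            problems.append("authored-holdout spec has non-empty index lists")
    elif spec.get("validation_spec_version") == 2 and binding is None:
        problems.append("version-2 generated or final-fit spec lacks data_binding")
    if spec["strategy"] == "rolling_origin" and binding and spec["test_indices"]:
        # The execution prefix should end with the split's test window.
        if binding["execution_time_count"] != spec["test_indices"][-1] + 1:
            problems.append("execution_time_count does not end at the test window")
    return problems


good_spec = {
    "mode": "validation",
    "strategy": "rolling_origin",
    "holdout_shape": [6],
    "train_indices": [0, 1, 2, 3],
    "test_indices": [4, 5],
    "train_dates": ["2024-01-01", "2024-01-08", "2024-01-15", "2024-01-22"],
    "test_dates": ["2024-01-29", "2024-02-05"],
    "validation_spec_version": 2,
    "data_binding": {"execution_time_count": 6},
}
bad_spec = dict(good_spec, data_binding={"execution_time_count": 8})

print(check_validation_spec(good_spec))
print(check_validation_spec(bad_spec))
```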

Example: blocked tail validation

{
  "mode": "validation",
  "strategy": "blocked_tail",
  "split_label": "blocked_tail",
  "holdout_source": "generated_validation",
  "generated_holdout": true,
  "run_name_suffix": "blocked_tail",
  "holdout_shape": [20],
  "train_indices": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
  "test_indices": [12, 13, 14, 15, 16, 17, 18, 19],
  "train_dates": ["2024-01-01", "2024-01-08", "..."],
  "test_dates": ["2024-03-25", "2024-04-01", "..."],
  "validation_spec_version": 2,
  "data_binding": {
    "coordinate_canonicalization_version": 1,
    "source_time_count": 20,
    "source_time_sha256": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "execution_time_count": 20,
    "execution_time_sha256": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "execution_end_date": "2024-05-13",
    "geo_mode": "national",
    "geo_count": 0,
    "geo_sha256": null
  }
}

Example: rolling origin split

{
  "mode": "validation",
  "strategy": "rolling_origin",
  "split_label": "split_01",
  "holdout_source": "generated_validation",
  "generated_holdout": true,
  "run_name_suffix": "split_01",
  "holdout_shape": [56],
  "train_indices": [0, 1, 2, "...", 51],
  "test_indices": [52, 53, 54, 55],
  "train_dates": ["2024-01-01", "..."],
  "test_dates": ["2024-12-30", "2025-01-06", "2025-01-13", "2025-01-20"],
  "validation_spec_version": 2,
  "data_binding": {
    "coordinate_canonicalization_version": 1,
    "source_time_count": 60,
    "source_time_sha256": "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
    "execution_time_count": 56,
    "execution_time_sha256": "cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc",
    "execution_end_date": "2025-01-20",
    "geo_mode": "national",
    "geo_count": 0,
    "geo_sha256": null
  }
}

Example: final fit

{
  "mode": "final_fit",
  "strategy": "none",
  "split_label": "final_fit",
  "holdout_source": "none",
  "generated_holdout": false,
  "run_name_suffix": "final_fit",
  "holdout_shape": null,
  "train_indices": [0, 1, 2, "...", 59],
  "test_indices": [],
  "train_dates": ["2024-01-01", "...", "2025-02-17"],
  "test_dates": [],
  "validation_spec_version": 2,
  "data_binding": {
    "coordinate_canonicalization_version": 1,
    "source_time_count": 60,
    "source_time_sha256": "dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd",
    "execution_time_count": 60,
    "execution_time_sha256": "dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd",
    "execution_end_date": "2025-02-17",
    "geo_mode": "national",
    "geo_count": 0,
    "geo_sha256": null
  }
}

Note on holdout mask storage

The holdout mask itself (a boolean NumPy array) is not stored in validation_spec.json because it can be large for geo-panel models (n_geos × n_times). Only its shape is recorded, as holdout_shape. The mask is injected into the Meridian model at runtime. For generated splits it can be reconstructed from train_indices, test_indices, and the data geometry. Authored-holdout specs store empty index lists, so their mask comes from the authored YAML instead.
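For a generated split, rebuilding the mask from the stored fields might look like the sketch below. It uses nested Python lists instead of a NumPy array to stay dependency-free, and it assumes that every geo shares the same held-out periods and that a 2-D shape is ordered (n_geos, n_times). `rebuild_holdout_mask` is a hypothetical helper, not a meridian-tools function.

```python
def rebuild_holdout_mask(holdout_shape: list, test_indices: list):
    """Rebuild a boolean holdout mask from holdout_shape and test_indices."""
    n_times = holdout_shape[-1]  # the time axis is assumed to be last
    held_out = set(test_indices)
    row = [index in held_out for index in range(n_times)]
    if len(holdout_shape) == 1:
        return row  # national data: 1-D mask
    n_geos = holdout_shape[0]
    return [list(row) for _ in range(n_geos)]  # geo data: one row per geo


national_mask = rebuild_holdout_mask([6], [4, 5])
geo_mask = rebuild_holdout_mask([2, 6], [4, 5])
print(national_mask)
print(len(geo_mask), len(geo_mask[0]))
```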

Python API

Public Python APIs exposed by meridian-tools.

meridian_tools.config

Configuration models and YAML loading for meridian-tools.

Module: meridian_tools.config

Functions

`load_yaml_config`

def load_yaml_config(path: str | Path) -> MeridianToolsConfig

Load and validate a meridian-tools YAML file.

Parameters:

path — Path to the YAML configuration file.

Returns: A validated MeridianToolsConfig instance.

Raises: pydantic.ValidationError if the YAML content does not match the schema.

Example:

from meridian_tools.config import load_yaml_config

config = load_yaml_config("project.yml")
print(config.project.name)
print(config.data.path)
print(config.validation.strategy)

Classes

`MeridianToolsConfig`

class MeridianToolsConfig(BaseModel)

Full YAML configuration for one meridian-tools run. This is the top-level model returned by load_yaml_config.

Attribute	Type	Default
`project`	`ProjectConfig`	`ProjectConfig()`
`data`	`CsvDataConfig`	required
`model_spec`	`ModelSpecConfig`	`ModelSpecConfig()`
`fit`	`FitConfig`	`FitConfig()`
`validation`	`ValidationConfig`	`ValidationConfig()`
`exports`	`ExportsConfig`	`ExportsConfig()`
`response_curves`	`ResponseCurvesConfig \| None`
`optimisation`	`OptimisationConfig \| None`

`PipelineRunConfig`

@dataclass(frozen=True)
class PipelineRunConfig

Runtime options that sit outside the YAML file. Passed to run_pipeline.

Attribute	Type	Default	Description
`config_path`	`Path`	required	Path to the YAML config file.
`output_dir`	`Path`	`Path("runs")`	Directory for run output.
`run_name`	`str \| None`	`None`
`validation_spec`	`ValidationRunSpec \| None`	`None`
`apply_run_name_suffix`	`bool`	`True`	Whether to append validation-aware suffixes to the run name.
`source_config_path`	`Path \| None`	`None`

`ProjectConfig`

class ProjectConfig(BaseModel)

Attribute	Type	Default
`name`	`str`	`"meridian-project"`

`CsvDataConfig`

class CsvDataConfig(BaseModel)

CSV loader configuration compatible with Meridian’s CsvDataLoader.

Attribute	Type	Default
`path`	`Path`	required
`kpi_type`	`Literal["revenue", "non-revenue"]`	`"revenue"`
`coord_to_columns`	`dict[str, Any]`	required
`media_to_channel`	`dict[str, str] \| None`
`media_spend_to_channel`	`dict[str, str] \| None`
`reach_to_channel`	`dict[str, str] \| None`
`frequency_to_channel`	`dict[str, str] \| None`
`rf_spend_to_channel`	`dict[str, str] \| None`
`organic_reach_to_channel`	`dict[str, str] \| None`
`organic_frequency_to_channel`	`dict[str, str] \| None`

`ModelSpecConfig`

class ModelSpecConfig(BaseModel)

Attribute	Type	Default
`kwargs`	`dict[str, Any]`	`{}`
`priors`	`PriorsConfig \| None`

`DistributionSpec`

class DistributionSpec(BaseModel)

One YAML-authored TensorFlow Probability distribution.

Supported distribution names are Normal, LogNormal, TruncatedNormal, and Beta. The required parameters depend on the distribution, and any extra distribution parameters are rejected.

`ChannelPriorSpec`

class ChannelPriorSpec(BaseModel)

A media prior with a default distribution and optional per-channel overrides.

Attribute	Type	Default
`default`	`DistributionSpec`	required
`channels`	`dict[str, DistributionSpec]`	`{}`

`PriorsConfig`

class PriorsConfig(BaseModel)

YAML-friendly subset of Meridian media prior distributions.

Attribute	Type	Default
`roi_m`	`DistributionSpec \| ChannelPriorSpec \| None`
`mroi_m`	`DistributionSpec \| ChannelPriorSpec \| None`
`alpha_m`	`DistributionSpec \| ChannelPriorSpec \| None`

`FitConfig`

class FitConfig(BaseModel)

Sampling configuration for Meridian posterior fitting.

Attribute	Type	Default
`sample_prior_draws`	`PositiveInt \| None`
`n_chains`	`PositiveInt \| list[PositiveInt]`
`n_adapt`	`PositiveInt`	`500`
`n_burnin`	`PositiveInt`	`500`
`n_keep`	`PositiveInt`	`1000`
`seed`	`int \| list[int] \| None`
`max_tree_depth`	`PositiveInt`	`10`
`max_energy_diff`	`float`	`500.0`
`unrolled_leapfrog_steps`	`PositiveInt`	`1`
`parallel_iterations`	`PositiveInt`	`10`

`ValidationConfig`

class ValidationConfig(BaseModel)

Validation and holdout orchestration settings.

Attribute	Type	Default
`strategy`	`Literal["none", "blocked_tail", "rolling_origin"]`	`"none"`
`holdout_size`	`PositiveInt \| None`
`initial_train_size`	`PositiveInt \| None`
`test_size`	`PositiveInt \| None`
`step_size`	`PositiveInt \| None`
`max_splits`	`PositiveInt \| None`

See the validation guide for cross-field validation rules.

`ExportsConfig`

class ExportsConfig(BaseModel)

Attribute	Type	Default
`use_kpi`	`bool`	`False`
`batch_size`	`PositiveInt`	`1000`
`export_predictive_accuracy`	`bool`	`True`
`export_review_summary`	`bool`	`True`
`export_model_selection`	`bool`	`True`
`export_plots`	`bool`	`True`

`ResponseCurvesConfig`

class ResponseCurvesConfig(BaseModel)

Attribute	Type	Default	Constraint
`spend_multipliers`	`list[float]`	required	Non-empty, all `>= 0`
`use_posterior`	`bool`	`True`
`by_reach`	`bool`	`True`
`use_optimal_frequency`	`bool`	`False`
`confidence_level`	`float`	`0.9`	`0 < x < 1`

`OptimisationConfig`

class OptimisationConfig(BaseModel)

Attribute	Type	Default	Constraint
`start_date`	`str`	required	ISO `YYYY-MM-DD`
`end_date`	`str`	required	ISO `YYYY-MM-DD`, `>= start_date`
`budget`	`OptimisationBudgetConfig`	required
`use_posterior`	`bool`	`True`
`use_optimal_frequency`	`bool`	`True`
`confidence_level`	`float`	`0.9`	`0 < x < 1`

`OptimisationBudgetConfig`

class OptimisationBudgetConfig(BaseModel)

Attribute	Type	Default
`mode`	`Literal["fixed_total", "relative_reference_window_total"]`	required
`value`	`PositiveFloat`	required

meridian_tools.runner

Pipeline orchestration for meridian-tools.

Module: meridian_tools.runner

Functions

`run_pipeline`

def run_pipeline(
    run_config: PipelineRunConfig,
    *,
    progress_callback: Callable | None = None,
) -> PipelineRunResult

Execute the full meridian-tools staged pipeline.

The pipeline proceeds through the following stages in order:

00_run_metadata — Archive source and resolved configs and write input_data_provenance.json.
10_validation — Write validation spec (if validation-aware).
20_model_fit — Build input data, construct the Meridian model, sample prior and posterior.
30_model_assessment — Export diagnostics, model summary, and model selection outputs.
40_decomposition — Export summary metrics.
60_response_curves — Export response curves (if configured).
70_optimisation — Export optimisation results (if configured).

The manifest is written to disk after each stage, so a failure mid-pipeline leaves a readable partial manifest.

Before creating the dated run directory, the runner enforces three separate pre-run checks:

dependency preflight (google-meridian[schema], optional plot support)
validation-execution contract checks for incompatible single-run validation combinations
a narrow wrapper-owned config/data preflight over the resolved input file and authored column mapping

The wrapper-owned preflight checks exactly:

resolved data.path exists and is a regular file
the CSV header row can be read
the parsed header is non-empty
no parsed header cell is blank after trimming whitespace
every authored scalar entry in data.coord_to_columns exists in the header
every authored list member in data.coord_to_columns exists in the header
every authored key in media_to_channel, media_spend_to_channel, reach_to_channel, frequency_to_channel, rf_spend_to_channel, organic_reach_to_channel, and organic_frequency_to_channel exists in the header
authored list-valued coord families are non-empty
authored mapping fields above are non-empty
supported media/RF family groups are complete when authored

Header matching is exact and case-sensitive. Anything outside this closed matrix remains Meridian-owned validation.
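A few of the header checks listed above can be approximated with the standard `csv` module. The sketch below covers only the empty-header, blank-cell, empty-list, and `coord_to_columns` membership checks. It is not the wrapper's implementation: `preflight_header`, the error strings, and the sample column names are all illustrative assumptions.

```python
import csv
import io


def preflight_header(csv_text: str, coord_to_columns: dict) -> list:
    """Return header problems for the given CSV text and column mapping."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    if not header:
        return ["header row is empty"]
    errors = []
    if any(not cell.strip() for cell in header):
        errors.append("header contains a blank cell")
    for coord, value in coord_to_columns.items():
        if isinstance(value, list) and not value:
            errors.append(f"{coord}: list is empty")
        columns = value if isinstance(value, list) else [value]
        for column in columns:
            if column not in header:  # exact, case-sensitive comparison
                errors.append(f"{coord}: column {column!r} not in header")
    return errors


csv_text = "time,revenue,tv_spend,search_spend\n2024-01-01,100,10,5\n"
mapping = {
    "time": "time",
    "kpi": "revenue",
    "media_spend": ["tv_spend", "Search_Spend"],  # wrong case on purpose
}
print(preflight_header(csv_text, mapping))
```

Because matching is case-sensitive, `Search_Spend` is reported as missing even though `search_spend` is present in the header.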

Parameters:

run_config — A PipelineRunConfig specifying the execution config path, output directory, run name, optional validation spec, and optional source_config_path for metadata archival.
progress_callback — Optional callable invoked on stage lifecycle events. The callback receives keyword arguments:
- stage_name (str) — stage identifier.
- event (str) — one of "started", "completed", "skipped", or "failed".
- stage_index (int) — 1-based position in the pipeline.
- stage_count (int) — total number of stages.
- elapsed_seconds (float) — wall-clock time (present for "completed" and "failed" events).
- message (str) — human-readable detail (present for "skipped" and "failed" events).
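A callback compatible with the keyword arguments listed above could be written as in the sketch below. The function names are hypothetical. The optional fields default to `None` because, per the list above, `elapsed_seconds` and `message` are only present for some events, and `**_extra` is an assumption that guards against additional keywords.

```python
def format_progress(*, stage_name, event, stage_index, stage_count,
                    elapsed_seconds=None, message=None, **_extra):
    """Build one log line from the stage lifecycle keyword arguments."""
    line = f"[{stage_index}/{stage_count}] {stage_name} {event}"
    if elapsed_seconds is not None:
        line += f" after {elapsed_seconds:.1f}s"
    if message:
        line += f": {message}"
    return line


def print_progress(**event_fields):
    """Callback that could be passed as progress_callback."""
    print(format_progress(**event_fields))


# Simulated events; in a real run the pipeline would supply these values.
print_progress(stage_name="20_model_fit", event="started",
               stage_index=3, stage_count=7)
print_progress(stage_name="20_model_fit", event="completed",
               stage_index=3, stage_count=7, elapsed_seconds=329.0)
```

Keeping the formatting in a pure function makes the callback easy to unit test without running a pipeline.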

Returns: A PipelineRunResult with the run directory and manifest path.

Raises:

RuntimeError if Meridian schema support is unavailable (checked at preflight before the run directory is created).
RuntimeError if exports.export_plots is true but vl-convert-python is not installed (also checked at preflight).
ValidationExecutionContractError if the requested single-run validation execution path is incompatible with the authored config.
ConfigPreflightError if wrapper-owned config/data preflight fails before run-directory creation.
PipelineRunFailure if any exception occurs after the dated run directory already exists.

Example:

from pathlib import Path
from meridian_tools.config import PipelineRunConfig
from meridian_tools.runner import run_pipeline

result = run_pipeline(
    PipelineRunConfig(
        config_path=Path("project.yml"),
        output_dir=Path("runs"),
    )
)

print(result.run_dir)
print(result.manifest_path)

Classes

`PipelineRunResult`

@dataclass(frozen=True)
class PipelineRunResult

Disk locations for one completed meridian-tools run.

Attribute	Type	Description
`run_dir`	`Path`	Absolute path to the run directory.
`manifest_path`	`Path`	Absolute path to `run_manifest.json`.

`ValidationExecutionContractError`

class ValidationExecutionContractError(ValueError)

Raised when the requested single-run validation execution path is incompatible with the authored config. Current examples include direct rolling_origin execution through run_pipeline(...) and combining PipelineRunConfig.validation_spec with authored model_spec.kwargs.holdout_id.

`ConfigPreflightError`

class ConfigPreflightError(ValueError)

Raised when the wrapper-owned config/data preflight fails before run-directory creation. This covers only the closed wrapper preflight boundary, not full Meridian model validation.

`PipelineRunFailure`

class PipelineRunFailure(RuntimeError)

Raised when a run fails after the dated run directory already exists. The original underlying exception is preserved via __cause__.

Attribute	Type	Description
`run_dir`	`Path`	Absolute failed run directory.
`manifest_path`	`Path`	Absolute path to the failed run manifest.
`stage_name`	`str \| None`	Failing stage name when one is available.

Constants

Stage names

Constant	Value
`STAGE_RUN_METADATA`	`"00_run_metadata"`
`STAGE_VALIDATION`	`"10_validation"`
`STAGE_MODEL_FIT`	`"20_model_fit"`
`STAGE_MODEL_ASSESSMENT`	`"30_model_assessment"`
`STAGE_DECOMPOSITION`	`"40_decomposition"`
`STAGE_RESPONSE_CURVES`	`"60_response_curves"`
`STAGE_OPTIMISATION`	`"70_optimisation"`

`PIPELINE_STAGE_ORDER`

PIPELINE_STAGE_ORDER: tuple[str, ...] = (
    "00_run_metadata",
    "10_validation",
    "20_model_fit",
    "30_model_assessment",
    "40_decomposition",
    "60_response_curves",
    "70_optimisation",
)

The numbering gap at 50 is intentional, reserving space for future stages.

meridian_tools.cv

Cross-validation and holdout orchestration utilities.

Module: meridian_tools.cv

Functions

`build_last_window_holdout_mask`

def build_last_window_holdout_mask(
    time_index: Sequence[Any],
    holdout_size: int,
    geo_index: Sequence[Any] | None = None,
) -> np.ndarray

Build a blocked-tail holdout mask for Meridian’s holdout_id.

Returns a 1-D boolean mask for national data and a 2-D (n_geos, n_times) mask when geo_index is provided. The last holdout_size time periods are marked as True (held out).

Parameters:

time_index — Strictly increasing sequence of time period identifiers.
holdout_size — Number of tail periods to hold out. Must be positive and less than the length of time_index.
geo_index — Optional sequence of geo identifiers. If provided, the mask is broadcast across geos.

Returns: Boolean NumPy array.

Raises: ValueError for non-monotonic indices, undersized indices, or impossible holdout sizes.
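The blocked-tail behaviour described above can be sketched in plain Python. This is not the library's implementation: it returns nested lists where the real function returns a NumPy array, it only covers the monotonicity and holdout-size checks, and `last_window_mask` is a hypothetical name.

```python
def last_window_mask(time_index, holdout_size, geo_index=None):
    """Mark the last holdout_size periods as held out (True)."""
    times = list(time_index)
    if any(a >= b for a, b in zip(times, times[1:])):
        raise ValueError("time_index must be strictly increasing")
    if not 0 < holdout_size < len(times):
        raise ValueError("holdout_size must be positive and below len(time_index)")
    row = [False] * (len(times) - holdout_size) + [True] * holdout_size
    if geo_index is None:
        return row  # national data: 1-D mask
    return [list(row) for _ in geo_index]  # one identical row per geo


weeks = ["2024-01-01", "2024-01-08", "2024-01-15", "2024-01-22"]
national = last_window_mask(weeks, holdout_size=1)
by_geo = last_window_mask(weeks, holdout_size=2, geo_index=["US-CA", "US-NY"])
print(national)
print(by_geo)
```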

`build_rolling_origin_splits`

def build_rolling_origin_splits(
    time_index: Sequence[Any],
    *,
    initial_train_size: int,
    test_size: int,
    step_size: int | None = None,
    max_splits: int | None = None,
) -> list[BlockedTimeSplit]

Create expanding-window blocked time splits for rolling-origin validation.

Parameters:

time_index — Strictly increasing sequence of time period identifiers.
initial_train_size — Size of the first training window.
test_size — Size of each test window.
step_size — Step between splits. Must equal test_size. Defaults to test_size.
max_splits — Maximum number of splits to generate. Must be >= 2 if set.

Returns: List of BlockedTimeSplit instances (at least 2).

Raises: ValueError for invalid parameters or if fewer than 2 splits can be generated.
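The expanding-window geometry can be illustrated with a short standalone sketch. It follows the constraints stated above (step equal to the test size, at least two splits) but is not the library's code: it returns plain tuples instead of BlockedTimeSplit instances, `rolling_origin_windows` is a hypothetical name, and keeping the earliest splits when `max_splits` applies is an assumption.

```python
def rolling_origin_windows(n_times, *, initial_train_size, test_size,
                           step_size=None, max_splits=None):
    """Return (label, train_indices, test_indices) for expanding windows."""
    step = test_size if step_size is None else step_size
    if step != test_size:
        raise ValueError("step_size must equal test_size")
    if max_splits is not None and max_splits < 2:
        raise ValueError("max_splits must be >= 2 when set")
    splits = []
    train_end = initial_train_size
    while train_end + test_size <= n_times:
        if max_splits is not None and len(splits) >= max_splits:
            break
        label = f"split_{len(splits) + 1:02d}"
        train = list(range(train_end))  # window grows with each split
        test = list(range(train_end, train_end + test_size))
        splits.append((label, train, test))
        train_end += step
    if len(splits) < 2:
        raise ValueError("fewer than 2 splits can be generated")
    return splits


splits = rolling_origin_windows(60, initial_train_size=52, test_size=4)
for label, train, test in splits:
    print(label, len(train), test)
```

With 60 periods, an initial training window of 52, and a test size of 4, the first split tests on indices 52 to 55, which lines up with the rolling-origin `split_01` example in the validation spec schema reference.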

`build_validation_splits`

def build_validation_splits(
    validation_config: ValidationConfig,
    time_index: Sequence[Any],
) -> list[BlockedTimeSplit]

Build deterministic split definitions from the typed validation config.

Dispatches to the appropriate split builder based on validation_config.strategy. Returns an empty list for strategy: none.

Parameters:

validation_config — A validated ValidationConfig instance.
time_index — Strictly increasing sequence of time period identifiers.

Returns: List of BlockedTimeSplit instances (empty for none).

`build_validation_plan`

def build_validation_plan(
    validation_config: ValidationConfig,
    time_index: Sequence[Any],
    geo_index: Sequence[Any] | None = None,
) -> ValidationPlan

Materialise concrete validation and final-fit run specs from one config.

For strategy: none, returns a plan with no validation runs and no final-fit run. For blocked_tail or rolling_origin, returns one ValidationRunSpec per split plus a final_fit_run spec that trains on the full time axis with no holdout.

Parameters:

validation_config — A validated ValidationConfig instance.
time_index — Strictly increasing sequence of time period identifiers.
geo_index — Optional sequence of geo identifiers for geo-panel models.

Returns: A ValidationPlan instance.

Example:

from meridian_tools.config import load_yaml_config
from meridian_tools.cv import build_validation_plan

config = load_yaml_config("project.yml")
plan = build_validation_plan(
    config.validation,
    time_index=["2024-01-01", "2024-01-08", "..."],
    geo_index=["US-CA", "US-NY"],
)

for run_spec in plan.validation_runs:
    print(run_spec.split_label, len(run_spec.train_indices), len(run_spec.test_indices))

if plan.final_fit_run:
    print("Final fit:", plan.final_fit_run.split_label)

Classes

`BlockedTimeSplit`

@dataclass(frozen=True)
class BlockedTimeSplit

One blocked time split for validation.

Attribute	Type	Description
`label`	`str`	Human-readable split label (e.g. `"blocked_tail"`, `"split_01"`).
`train_indices`	`tuple[int, ...]`	Integer indices into the time axis for training.
`test_indices`	`tuple[int, ...]`	Integer indices into the time axis for testing.
`train_dates`	`tuple[str, ...]`	Date values for training periods.
`test_dates`	`tuple[str, ...]`	Date values for test periods.
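
The attributes above can be illustrated with a small stand-alone sketch. It does not call meridian_tools: SplitSketch is a stand-in that mirrors the documented BlockedTimeSplit fields, and expanding_window_splits, test_size, and n_splits are hypothetical names chosen for illustration. The library's own split builders may place windows differently.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SplitSketch:
    """Stand-in mirroring the documented BlockedTimeSplit attributes."""

    label: str
    train_indices: tuple[int, ...]
    test_indices: tuple[int, ...]
    train_dates: tuple[str, ...]
    test_dates: tuple[str, ...]


def expanding_window_splits(
    time_index: Sequence[str],
    *,
    test_size: int,
    n_splits: int,
) -> list[SplitSketch]:
    """Hypothetical helper: contiguous test blocks that end at the tail."""
    n = len(time_index)
    if test_size < 1 or n_splits < 1 or n - n_splits * test_size < 1:
        raise ValueError("not enough periods for the requested splits")
    splits = []
    for k in range(n_splits):
        # The last test block ends at the final period; earlier blocks precede it.
        test_start = n - (n_splits - k) * test_size
        train = tuple(range(test_start))
        test = tuple(range(test_start, test_start + test_size))
        splits.append(
            SplitSketch(
                label=f"split_{k + 1:02d}",
                train_indices=train,
                test_indices=test,
                train_dates=tuple(time_index[i] for i in train),
                test_dates=tuple(time_index[i] for i in test),
            )
        )
    return splits


weeks = [f"2024-W{w:02d}" for w in range(1, 11)]
splits = expanding_window_splits(weeks, test_size=2, n_splits=3)
for split in splits:
    print(split.label, len(split.train_indices), split.test_indices)
```

Each training window grows while the test block stays the same size, so no test period is ever used for training within the same split.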

`ValidationRunSpec`

@dataclass(frozen=True)
class ValidationRunSpec

One concrete validation or final-fit run derived from a split plan. Passed to PipelineRunConfig.validation_spec to control a single pipeline execution.

Attribute	Type	Description
`mode`	`"validation"` \| `"final_fit"`	Run mode.
`strategy`	`str`	Validation strategy.
`split_label`	`str`	Human-readable split identifier.
`holdout_source`	`str`	How the holdout mask was produced.
`generated_holdout`	`bool`	Whether the holdout was auto-generated.
`holdout_id`	`np.ndarray \| None`	Concrete holdout mask (immutable).
`train_indices`	`tuple[int, ...]`	Training time indices.
`test_indices`	`tuple[int, ...]`	Test time indices.
`train_dates`	`tuple[str, ...]`	Training date values.
`test_dates`	`tuple[str, ...]`	Test date values.
`run_name_suffix`	`str`	Suffix for the run directory name.

Methods:

to_artifact_payload() — Returns the JSON-serialisable dictionary written to validation_spec.json.

`ValidationPlan`

@dataclass(frozen=True)
class ValidationPlan

Concrete validation runs and the separate final-fit run for one config.

Attribute	Type	Description
`validation_runs`	`tuple[ValidationRunSpec, ...]`	One spec per validation split.
`final_fit_run`	`ValidationRunSpec \| None`	Full-sample final-fit spec. `None` for `strategy: none`.

meridian_tools.exports

Helpers for manifest-backed Meridian export families.

Module: meridian_tools.exports

Functions

`export_model_fit_artifacts`

def export_model_fit_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    fit_config: FitConfig,
    meridian_version: str | None,
) -> dict[str, Path]

Write the stable model-fit artefact set.

Produces:

meridian_model.binpb — Serialised Meridian model (Protocol Buffers).
fit_metadata.json — Records FitConfig values and Meridian version.
prior_distributions.json — Records applied Meridian prior distributions.

Parameters:

model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
fit_config — The FitConfig used for this run.
meridian_version — Meridian version string (or None).

Returns: Dictionary mapping artefact names to file paths.

`extract_prior_summary`

def extract_prior_summary(model: Any) -> dict[str, Any]

Return JSON-serialisable applied prior distributions from a Meridian model.

The summary is based on the constructed model’s broadcast prior distribution, so it records what Meridian will use rather than only echoing YAML input.

Parameters:

model — Constructed Meridian model instance.

Returns: Dictionary keyed by Meridian prior parameter name.

`export_model_assessment_artifacts`

def export_model_assessment_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    exports_config: ExportsConfig,
    diagnostics_exporter: Callable,
    model_selection_exporter: Callable,
) -> dict[str, Path]

Write the stable assessment artefact set.

Produces the diagnostics bundle, the model results summary HTML and, optionally, model selection outputs (LOO/WAIC) and diagnostic plots.

Parameters:

model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
exports_config — Export switches.
diagnostics_exporter — Callable for diagnostics bundle export (typically export_diagnostics_bundle).
model_selection_exporter — Callable for model selection export.

Returns: Dictionary mapping artefact names to file paths.

`export_decomposition_artifacts`

def export_decomposition_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable decomposition artefact set.

Produces:

summary_metrics.nc — NetCDF decomposition dataset.
summary_metrics.csv — Flattened tabular decomposition.
plots/ — Channel contribution, waterfall, spend vs. contribution, and ROI charts (when export_plots: true).

Parameters:

model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.

`export_response_curve_artifacts`

def export_response_curve_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    response_curves_config: ResponseCurvesConfig,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable response-curve artefact set.

Produces:

response_curves.nc — NetCDF response curve dataset.
response_curves.csv — Flattened tabular response curves.
plots/response_curves_plot.png — Response curve visualisation (when export_plots: true).

Parameters:

model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
response_curves_config — Response curves settings from YAML.
exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.

`export_optimisation_artifacts`

def export_optimisation_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    optimisation_config: OptimisationConfig,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable optimisation artefact set.

Produces:

optimisation_summary.html — Meridian optimisation summary report.
optimised_data.nc / .csv — Optimised budget allocation.
nonoptimised_data.nc / .csv — Baseline allocation.
optimisation_grid.csv — Full optimisation grid.
plots/ — Delta, allocation, spend, and response curve charts (when export_plots: true).

For budget.mode: relative_reference_window_total, the effective budget is computed as value × total_spend_in_reference_window using the model’s media and RF spend data within the start_date–end_date window.
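
The arithmetic can be sketched as follows. The spend rows, the effective_budget helper, and the inclusive treatment of both window ends are assumptions made for illustration; the real function reads spend from the fitted model rather than from a list.

```python
from datetime import date

# Hypothetical weekly spend rows: (week start, channel, spend).
spend_rows = [
    (date(2024, 1, 1), "tv", 1000.0),
    (date(2024, 1, 1), "search", 500.0),
    (date(2024, 1, 8), "tv", 1200.0),
    (date(2024, 1, 8), "search", 300.0),
    (date(2024, 1, 15), "tv", 900.0),
]


def effective_budget(rows, *, value, start_date, end_date):
    """Scale total spend inside the reference window by `value`.

    Assumption: both window ends are inclusive.
    """
    total = sum(spend for day, _, spend in rows if start_date <= day <= end_date)
    return value * total


# Two weeks fall inside the window: 1000 + 500 + 1200 + 300 = 3000.
budget = effective_budget(
    spend_rows,
    value=1.5,
    start_date=date(2024, 1, 1),
    end_date=date(2024, 1, 8),
)
print(budget)
```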

Parameters:

model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
optimisation_config — Optimisation settings from YAML.
exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.

`ensure_meridian_schema_support`

def ensure_meridian_schema_support() -> Callable

Return Meridian’s schema serialiser or raise a stable runtime error.

Checks for meridian.schema.serde.meridian_serde.save_meridian. If the import fails, raises RuntimeError with guidance to install google-meridian[schema].

Returns: The save_meridian callable.

`ensure_altair_png_support`

def ensure_altair_png_support() -> Any

Return the Altair PNG backend or raise a stable runtime error.

Checks for vl_convert. If the import fails, raises RuntimeError with guidance to install vl-convert-python.

Returns: The vl_convert module.
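
Both ensure_* helpers follow the same optional-dependency guard pattern. A minimal sketch of that pattern, using only the standard library, might look like this; require_optional_module and the message wording are hypothetical and not part of meridian_tools.

```python
import importlib
from types import ModuleType


def require_optional_module(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency or raise RuntimeError with install guidance."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Convert the import failure into one stable, actionable error type.
        raise RuntimeError(
            f"Optional dependency '{module_name}' is not available. "
            f"Install it with: pip install {install_hint}"
        ) from exc


# A module that exists is returned unchanged.
json_module = require_optional_module("json", "(standard library)")

# A module that is missing produces the guidance message.
try:
    require_optional_module("module_that_does_not_exist_xyz", "some-package")
except RuntimeError as err:
    guidance = str(err)
    print(guidance)
```

Raising one stable error type lets callers report a missing extra without catching ImportError at every call site.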

meridian_tools.diagnostics

Diagnostics extraction and export helpers for Meridian runs.

Module: meridian_tools.diagnostics

Functions

`predictive_accuracy_frame`

def predictive_accuracy_frame(
    meridian_model: Any,
    *,
    use_kpi: bool = False,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
    batch_size: int = 1000,
) -> pd.DataFrame

Return Meridian predictive accuracy as a flat DataFrame.

Uses Meridian’s Analyzer.predictive_accuracy internally and flattens the resulting xarray dataset into a pandas DataFrame.

Parameters:

meridian_model — Fitted Meridian model instance.
use_kpi — Use KPI-based metrics.
selected_geos — Optional subset of geos to evaluate.
selected_times — Optional subset of time periods to evaluate.
batch_size — Batch size for Meridian analysis.

Returns: A pandas DataFrame with one row per observation.

`review_summary_dict`

def review_summary_dict(
    meridian_model: Any,
    *,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
) -> dict[str, Any]

Run Meridian’s review battery and return a JSON-ready dictionary.

Uses Meridian’s ModelReviewer internally. All non-primitive values (dataclasses, enums, NumPy arrays) are recursively converted to JSON-serialisable types.

Parameters:

meridian_model — Fitted Meridian model instance.
selected_geos — Optional subset of geos.
selected_times — Optional subset of time periods.

Returns: A JSON-serialisable dictionary.

`export_diagnostics_bundle`

def export_diagnostics_bundle(
    meridian_model: Any,
    output_dir: str | Path,
    *,
    use_kpi: bool = False,
    export_predictive_accuracy: bool = True,
    export_review_summary: bool = True,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
    batch_size: int = 1000,
) -> dict[str, Path]

Write predictive accuracy, review summary, and bundle manifest to disk.

The bundle manifest (diagnostics_bundle.json) records the status of each sub-export ("exported" or "disabled") along with the file name and format. This provides a stable machine-readable contract for downstream consumers.

When an export is disabled, any pre-existing file from a previous run at the same path is removed to prevent stale data.

Parameters:

meridian_model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
use_kpi — Use KPI-based metrics.
export_predictive_accuracy — Write predictive_accuracy.csv.
export_review_summary — Write review_summary.json.
selected_geos — Not supported in the current scope; passing a value other than None raises ValueError.
selected_times — Not supported in the current scope; passing a value other than None raises ValueError.
batch_size — Batch size for Meridian analysis.

Returns: Dictionary mapping artefact names to file paths. Always includes "diagnostics_bundle". Conditionally includes "predictive_accuracy" and "review_summary".

Example:

from meridian_tools.diagnostics import export_diagnostics_bundle

artifacts = export_diagnostics_bundle(
    fitted_model,
    "output/30_model_assessment",
    export_predictive_accuracy=True,
    export_review_summary=True,
)

print(artifacts["diagnostics_bundle"])
# Path("output/30_model_assessment/diagnostics_bundle.json")
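
The status contract and the stale-file rule can be sketched without a fitted model. In the sketch below, write_bundle_sketch and the manifest keys ("status", "file", "format") are assumptions for illustration; the actual diagnostics_bundle.json schema may differ.

```python
import json
import tempfile
from pathlib import Path


def write_bundle_sketch(output_dir: Path, *, exports: dict) -> dict:
    """Record each sub-export as "exported" or "disabled" and drop stale files."""
    # File names follow the documented outputs; the manifest keys are hypothetical.
    files = {
        "predictive_accuracy": ("predictive_accuracy.csv", "csv"),
        "review_summary": ("review_summary.json", "json"),
    }
    output_dir.mkdir(parents=True, exist_ok=True)
    bundle = {}
    for name, (file_name, fmt) in files.items():
        target = output_dir / file_name
        if exports.get(name, False):
            # Stand-in for the real export step.
            target.write_text("placeholder\n", encoding="utf-8")
            status = "exported"
        else:
            # Remove output left behind by an earlier run at the same path.
            target.unlink(missing_ok=True)
            status = "disabled"
        bundle[name] = {"status": status, "file": file_name, "format": fmt}
    manifest_path = output_dir / "diagnostics_bundle.json"
    manifest_path.write_text(json.dumps(bundle, indent=2), encoding="utf-8")
    return bundle


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "30_model_assessment"
    # First run exports everything; second run disables the review summary.
    write_bundle_sketch(out, exports={"predictive_accuracy": True, "review_summary": True})
    bundle = write_bundle_sketch(out, exports={"predictive_accuracy": True, "review_summary": False})
    stale_removed = not (out / "review_summary.json").exists()
    print(bundle["review_summary"]["status"], stale_removed)
```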

meridian_tools.model_selection

Model-selection helpers layered on top of ArviZ and Meridian.

Module: meridian_tools.model_selection

Functions

`has_log_likelihood`

def has_log_likelihood(candidate: Any) -> bool

Return whether the candidate exposes a non-empty log_likelihood group.

Accepts either an ArviZ InferenceData or any object with an .inference_data attribute (e.g. a fitted Meridian model).

Parameters:

candidate — ArviZ InferenceData or fitted Meridian model.

Returns: True if a non-empty log_likelihood group exists.

`compute_loo`

def compute_loo(
    candidate: Any,
    *,
    pointwise: bool = False,
    scale: str = "log",
) -> InformationCriterionResult

Compute PSIS-LOO for a Meridian model or InferenceData.

If the candidate is a fitted Meridian model without a log_likelihood group, the function automatically reconstructs it through attach_log_likelihood.

Parameters:

candidate — Fitted Meridian model or ArviZ InferenceData with log_likelihood.
pointwise — Include per-observation LOO values and Pareto k diagnostics.
scale — Scale for ELPD computation ("log", "negative_log", or "deviance").

Returns: An InformationCriterionResult with kind="loo".

Raises: ModelSelectionError if log-likelihood cannot be obtained.

`compute_waic`

def compute_waic(
    candidate: Any,
    *,
    pointwise: bool = False,
    scale: str = "log",
) -> InformationCriterionResult

Compute WAIC for a Meridian model or InferenceData.

Same automatic log-likelihood reconstruction as compute_loo.

Parameters:

candidate — Fitted Meridian model or ArviZ InferenceData with log_likelihood.
pointwise — Include per-observation WAIC values.
scale — Scale for ELPD computation.

Returns: An InformationCriterionResult with kind="waic".

Raises: ModelSelectionError if log-likelihood cannot be obtained.

`compare_models`

def compare_models(
    candidates: Mapping[str, Any],
    *,
    ic: str = "loo",
    scale: str = "log",
) -> pd.DataFrame

Compare multiple models with ArviZ compare.

Parameters:

candidates — Dictionary mapping model names to fitted Meridian models or InferenceData objects.
ic — Information criterion to use: "loo" or "waic".
scale — Scale for ELPD computation.

Returns: A pandas DataFrame with columns: model, rank, elpd_{ic}, p_{ic}, elpd_diff, weight, se, dse, warning, scale. Ranked by ELPD (rank 0 is best).

For a single candidate, returns a one-row DataFrame with rank=0, elpd_diff=0.0, and weight=1.0.

Raises:

ValueError if ic is not "loo" or "waic", or if candidates is empty.
ModelSelectionError if any candidate lacks log-likelihood data.

Classes

`ModelSelectionError`

class ModelSelectionError(RuntimeError)

Raised when information criteria cannot be computed.

Property	Type	Description
`reason_code`	`str \| None`	Structured code identifying the failure reason.

Known reason codes:

Code	Meaning
`missing_log_likelihood_group`	`InferenceData` has no `log_likelihood` group and cannot be reconstructed.
`holdout_fit_unsupported`	Model was fitted with a holdout mask.
`requires_fitted_meridian_model`	Missing posterior samples or ArviZ `InferenceData`.
`meridian_internal_seam_incompatible`	Meridian version lacks required reconstruction methods.

`InformationCriterionResult`

@dataclass(frozen=True)
class InformationCriterionResult

Summary of one information-criterion computation.

Attribute	Type	Description
`kind`	`str`	`"loo"` or `"waic"`.
`summary`	`dict[str, Any]`	Summary statistics (ELPD, p, SE, etc.).
`pointwise`	`pd.DataFrame \| None`	Per-observation values (if `pointwise=True`).

meridian_tools.log_likelihood

Log-likelihood computation and attachment for Meridian models.

Module: meridian_tools.log_likelihood

Functions

`compute_log_likelihood_dataset`

def compute_log_likelihood_dataset(
    meridian_model: Any,
) -> xr.Dataset

Compute the pointwise log-likelihood dataset for a fitted Meridian model.

This function reconstructs the joint distribution from the posterior samples and computes observation-level log-likelihood values. It handles both geo-panel and national models.

The reconstruction recovers unsaved posterior parameters (e.g. geo deviations, tau_g_excl_baseline) that Meridian does not persist to InferenceData by default.

Parameters:

meridian_model — A fitted Meridian model with posterior samples and a compatible posterior_sampler_callable.

Returns: An xarray Dataset with a log_likelihood variable.

Raises: ModelSelectionError if the model does not expose the required internal reconstruction seams or lacks posterior samples.

`attach_log_likelihood`

def attach_log_likelihood(
    meridian_model: Any,
    *,
    in_place: bool = False,
) -> az.InferenceData

Attach a log_likelihood group to a Meridian model’s InferenceData.

If the model’s InferenceData already has a non-empty log_likelihood group, no reconstruction is performed and the existing InferenceData is returned as-is, regardless of in_place.

Parameters:

meridian_model — A fitted Meridian model.
in_place — If True, mutates meridian_model.inference_data directly. If False (default), returns a deep copy with the log_likelihood group attached and leaves the original model unmodified.

Returns: An ArviZ InferenceData with a log_likelihood group.

Raises:

ModelSelectionError with reason_code="meridian_internal_seam_incompatible" if the Meridian version lacks the required private reconstruction methods.
ModelSelectionError with reason_code="requires_fitted_meridian_model" if the model has no posterior samples.
ModelSelectionError with reason_code="holdout_fit_unsupported" if the model was fitted with a holdout mask.

Example:

from meridian_tools.log_likelihood import attach_log_likelihood

# Non-mutating (default)
idata = attach_log_likelihood(fitted_model, in_place=False)
assert hasattr(idata, "log_likelihood")

# Mutating
attach_log_likelihood(fitted_model, in_place=True)
assert hasattr(fitted_model.inference_data, "log_likelihood")

Implementation notes

The reconstruction accesses three private methods on Meridian’s posterior_sampler_callable:

_get_joint_dist_unpinned
_prepare_latents_for_reconstruction
_reconstruct_posteriors

These are Meridian-internal and may change without notice. If any method is missing, a ModelSelectionError with reason_code="meridian_internal_seam_incompatible" is raised instead of crashing. See the Meridian integration notes for details on this coupling boundary.
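
A guard of this kind can be sketched with plain attribute checks. SeamError is a stand-in for ModelSelectionError, and check_reconstruction_seams and FakeSampler are hypothetical names; only the three method names and the reason code come from the notes above.

```python
from __future__ import annotations

REQUIRED_SEAMS = (
    "_get_joint_dist_unpinned",
    "_prepare_latents_for_reconstruction",
    "_reconstruct_posteriors",
)


class SeamError(RuntimeError):
    """Stand-in for ModelSelectionError carrying a structured reason code."""

    def __init__(self, message: str, *, reason_code: str | None = None) -> None:
        super().__init__(message)
        self.reason_code = reason_code


def check_reconstruction_seams(sampler: object) -> None:
    """Raise a structured error when any required private method is absent."""
    missing = [
        name for name in REQUIRED_SEAMS
        if not callable(getattr(sampler, name, None))
    ]
    if missing:
        raise SeamError(
            f"Sampler lacks required methods: {', '.join(missing)}",
            reason_code="meridian_internal_seam_incompatible",
        )


class FakeSampler:
    """Hypothetical sampler exposing only two of the three seams."""

    def _get_joint_dist_unpinned(self):
        return None

    def _reconstruct_posteriors(self):
        return None


try:
    check_reconstruction_seams(FakeSampler())
except SeamError as err:
    reason = err.reason_code
    detail = str(err)
    print(reason, "|", detail)
```

Checking up front turns an AttributeError deep inside reconstruction into a failure that downstream tooling can match on by reason code.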

meridian_tools.lifecycle

Post-run record management: loading, listing, comparing, and refreshing runs.

Module: meridian_tools.lifecycle

Functions

`resolve_run_directory`

def resolve_run_directory(path: str | Path) -> Path

Return the absolute resolved run directory for a run path or manifest path.

If path points to a file, it must be named run_manifest.json; the function returns its parent directory. If path is a directory, it must contain run_manifest.json.

Parameters:

path — Path to a run directory or to run_manifest.json directly.

Returns: Absolute Path to the run directory.

Raises: LifecycleError if the path does not exist, is an unexpected file, or the directory does not contain run_manifest.json.
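
The resolution rules can be sketched with pathlib alone. LifecycleErrorSketch and resolve_run_directory_sketch are stand-ins written for illustration; the error messages and the order of checks in the real function are not documented here and are assumed.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

MANIFEST_NAME = "run_manifest.json"


class LifecycleErrorSketch(RuntimeError):
    """Stand-in for LifecycleError."""


def resolve_run_directory_sketch(path: str | Path) -> Path:
    candidate = Path(path)
    if not candidate.exists():
        raise LifecycleErrorSketch(f"Path does not exist: {candidate}")
    if candidate.is_file():
        # A file is only accepted when it is the manifest itself.
        if candidate.name != MANIFEST_NAME:
            raise LifecycleErrorSketch(f"Unexpected file: {candidate}")
        return candidate.resolve().parent
    # A directory is only accepted when it contains the manifest.
    if not (candidate / MANIFEST_NAME).is_file():
        raise LifecycleErrorSketch(f"No {MANIFEST_NAME} in {candidate}")
    return candidate.resolve()


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "my-run"
    run_dir.mkdir()
    (run_dir / MANIFEST_NAME).write_text("{}", encoding="utf-8")
    (Path(tmp) / "empty").mkdir()

    from_dir = resolve_run_directory_sketch(run_dir)
    from_manifest = resolve_run_directory_sketch(run_dir / MANIFEST_NAME)
    try:
        resolve_run_directory_sketch(Path(tmp) / "empty")
        rejected = False
    except LifecycleErrorSketch:
        rejected = True
    print(from_dir == from_manifest, rejected)
```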

`load_run_record`

def load_run_record(path: str | Path) -> RunRecord

Load one run directory through the versioned lifecycle contract.

Resolves the run directory, parses the manifest, and resolves artefact paths. Required artefacts (config_source, config_resolved) must be present in the manifest and exist on disk. Manifest version 3 and 4 runs must also include input_data_provenance. Optional artefacts (validation_spec, diagnostics_bundle, model_selection_status) are resolved when present and set to None when absent.

Parameters:

path — Path to a run directory or to run_manifest.json directly.

Returns: A validated RunRecord instance.

Raises: LifecycleError for missing required artefacts, malformed manifests, artefact path traversal, or claimed-but-missing artefacts.

`list_run_records`

def list_run_records(root: str | Path) -> list[RunRecord]

Discover direct child run directories under one output root.

Scans direct child directories of root for run_manifest.json files. Returns records sorted by started_at (most recent first), with directory name as a secondary sort key.

Parameters:

root — Directory to scan. Must be a directory, not a file.

Returns: List of RunRecord instances.

Raises: LifecycleError if root is not a directory or if any discovered run has an invalid manifest.
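
The ordering can be sketched with two stable sorts. The records below are hypothetical dictionaries rather than RunRecord instances, and the ascending direction of the directory-name tie-break is an assumption; the text above only states that the name is a secondary key.

```python
# Hypothetical run summaries; real code would read started_at from each manifest.
records = [
    {"run_dir": "run_b", "started_at": "2026-04-02T07:35:00+00:00"},
    {"run_dir": "run_a", "started_at": "2026-04-02T07:35:00+00:00"},
    {"run_dir": "run_c", "started_at": "2026-04-03T09:00:00+00:00"},
    {"run_dir": "run_d", "started_at": "2026-03-30T12:00:00+00:00"},
]

# Python's sort is stable, also with reverse=True, so sorting by the secondary
# key first and the primary key second yields the combined ordering.
by_name = sorted(records, key=lambda record: record["run_dir"])
ordered = sorted(by_name, key=lambda record: record["started_at"], reverse=True)

names = [record["run_dir"] for record in ordered]
print(names)
```

Comparing the timestamps as strings is only safe here because every value uses the same ISO-8601 layout and UTC offset.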

`build_refresh_run_config`

def build_refresh_run_config(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunConfig

Build a runtime refresh config from one stored run directory.

The execution config path points to the source run’s config.resolved.yaml. The returned PipelineRunConfig.source_config_path preserves the source run’s archived config.source.yaml so the refresh can re-copy the original YAML into the new run metadata. The output directory defaults to the source run’s parent directory (creating a sibling run). For validation runs, the validation spec is reconstructed from the stored validation_spec.json.

Parameters:

path — Path to the run directory or manifest to refresh.
output_dir — Override the output directory (default: source parent).
run_name — Override the run name.

Returns: A PipelineRunConfig ready for run_pipeline.

Raises: LifecycleError if the source run cannot be loaded or if authored-holdout refresh requirements are not met.

`refresh_run`

def refresh_run(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunResult

Execute a non-destructive refresh run from one stored lifecycle record.

This is a convenience function that calls build_refresh_run_config followed by run_pipeline. The original run directory is never modified.

Parameters:

path — Path to the run directory or manifest to refresh.
output_dir — Override the output directory (default: source parent).
run_name — Override the run name.

Returns: A PipelineRunResult for the new run.

`compare_run_records`

def compare_run_records(
    left: str | Path,
    right: str | Path,
) -> pd.DataFrame

Compare two run records at the pinned metadata layer.

Loads both run records and compares run name, status, versions, validation spec presence, diagnostics statuses, model selection availability, and input-data provenance.

Parameters:

left — Path to the first run directory or manifest.
right — Path to the second run directory or manifest.

Returns: A pandas DataFrame with columns field, left, right, status, and changed. Rows follow a fixed order:

Row (`field`)	Description
`run_name`	Human-readable run name.
`status`	Overall run status.
`meridian_tools_version`	`meridian-tools` version.
`meridian_version`	Google Meridian version.
`has_validation_spec`	Whether a validation spec is present.
`has_diagnostics_bundle`	Whether a diagnostics bundle is present.
`predictive_accuracy_status`	Status from the diagnostics bundle.
`review_summary_status`	Status from the diagnostics bundle.
`has_model_selection_outputs`	Whether LOO/WAIC outputs are present.
`model_selection_reason_code`	Reason code if model selection is unavailable.
`input_authored_path`	YAML-owned `data.path` string.
`input_resolved_path`	Absolute runtime input path.
`input_mtime_utc`	Input file mtime.
`input_sha256`	Input file SHA-256 digest.
`input_size_bytes`	Input file size in bytes.
`input_row_count`	Input row count.
`input_column_count`	Input column count.
`input_ordered_columns`	Input CSV column order.

For provenance rows, status is "legacy_unknown" and changed is None when either run predates manifest version 3 and therefore has no stored provenance payload.

Raises: LifecycleError if either run cannot be loaded or if diagnostics or model selection artefacts are malformed.

Classes

`RunRecord`

@dataclass(frozen=True)
class RunRecord

Resolved lifecycle view over one on-disk run directory.

Attribute	Type	Description
`run_dir`	`Path`	Absolute path to the run directory.
`manifest_path`	`Path`	Absolute path to `run_manifest.json`.
`manifest`	`RunManifest`	Parsed manifest with stages, timestamps, and versions.
`config_source_path`	`Path`	Absolute path to `config.source.yaml`. Always present.
`config_resolved_path`	`Path`	Absolute path to `config.resolved.yaml`. Always present.
`input_data_provenance_path`	`Path \| None`	Path to `input_data_provenance.json`. Required for manifest version 3 and 4 runs, otherwise `None`.
`validation_spec_path`	`Path \| None`	Path to `validation_spec.json`, or `None` if absent.
`diagnostics_bundle_path`	`Path \| None`	Path to `diagnostics_bundle.json`, or `None` if absent.
`model_selection_status_path`	`Path \| None`	Path to `model_selection_status.json`, or `None` if absent.
`model_selection_warnings_path`	`Path \| None`	Path to `model_selection_warnings.json`, or `None` if absent.

Required attributes (config_source_path, config_resolved_path) are always present. input_data_provenance_path is present for manifest version 3 and 4 runs. Other optional attributes are None when the corresponding artefact was not produced by the run or is absent from the manifest.

Example:

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

# Required — always available
print(record.config_source_path)
print(record.config_resolved_path)

# Optional — may be None
if record.diagnostics_bundle_path:
    print(f"Diagnostics: {record.diagnostics_bundle_path}")
if record.validation_spec_path:
    print(f"Validation spec: {record.validation_spec_path}")

`LifecycleError`

class LifecycleError(RuntimeError)

Raised when a run directory cannot be loaded through the lifecycle contract. All lifecycle functions raise this exception type instead of generic ValueError or RuntimeError.

meridian_tools.artifacts

Manifest and JSON helpers for run artefact management.

Module: meridian_tools.artifacts

Functions

`write_json`

def write_json(path: str | Path, payload: Any) -> None

Write a JSON-serialisable payload to disk with UTF-8 encoding and 2-space indentation. Creates parent directories if they do not exist. Writes through a private same-directory temporary file and atomically replaces the destination. Existing destination permission bits are preserved on overwrite; new files use the normal mode implied by the process umask.
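
The write strategy described above can be sketched with the standard library. write_json_sketch is a hypothetical stand-in, not the library's implementation; the temporary-file naming and the fsync call are assumptions.

```python
from __future__ import annotations

import contextlib
import json
import os
import stat
import tempfile
from pathlib import Path
from typing import Any


def write_json_sketch(path: str | Path, payload: Any) -> None:
    destination = Path(path)
    destination.parent.mkdir(parents=True, exist_ok=True)
    if destination.exists():
        # Keep the permission bits of the file being overwritten.
        mode = stat.S_IMODE(destination.stat().st_mode)
    else:
        # Read the umask (this briefly sets it to 0) to derive the default mode.
        current_umask = os.umask(0)
        os.umask(current_umask)
        mode = 0o666 & ~current_umask
    # mkstemp creates a private file in the same directory, so the final
    # os.replace stays on one filesystem and is atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=destination.parent, prefix=f".{destination.name}.", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        os.chmod(tmp_name, mode)
        os.replace(tmp_name, destination)
    except BaseException:
        # Never leave a partial temporary file behind.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "nested" / "out.json"
    write_json_sketch(target, {"status": "running"})
    write_json_sketch(target, {"status": "completed"})
    round_trip = json.loads(target.read_text(encoding="utf-8"))
    leftovers = sorted(item.name for item in target.parent.iterdir())
    print(round_trip, leftovers)
```

A reader of the destination path sees either the old file or the complete new one, never a half-written payload.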

`write_manifest`

def write_manifest(path: str | Path, manifest: RunManifest) -> None

Serialise and write a RunManifest to disk as JSON using write_json.

`normalize_artifact_paths`

def normalize_artifact_paths(
    run_dir: str | Path,
    artifacts: Mapping[str, str | Path],
) -> dict[str, str]

Convert artefact paths to relative paths against run_dir so the manifest stores portable references.

Parameters:

run_dir — The run directory root.
artifacts — Mapping of artefact names to file paths.

Returns: Dictionary mapping artefact names to relative path strings.

`validate_artifact_paths`

def validate_artifact_paths(
    run_dir: str | Path,
    artifacts: Mapping[str, str | Path],
) -> dict[str, str]

Validate that manifest artefact paths are relative regular files beneath run_dir, then return normalised POSIX relative paths. The validator rejects absolute paths, lexical .. components, paths that resolve outside the run directory, missing paths, directories, and special files. Internal symlinks are accepted only when they resolve to regular files inside the run directory.

Parameters:

run_dir - The run directory root.
artifacts - Mapping of artefact names to file paths.

Returns: Dictionary mapping artefact names to validated relative path strings.
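
The rejection rules can be sketched as a sequence of pathlib checks. validate_artifact_paths_sketch is a stand-in for illustration, and raising ValueError is an assumption because the exception type is not stated above.

```python
import tempfile
from pathlib import Path


def validate_artifact_paths_sketch(run_dir, artifacts):
    root = Path(run_dir).resolve()
    validated = {}
    for name, raw in artifacts.items():
        relative = Path(raw)
        if relative.is_absolute():
            raise ValueError(f"{name}: absolute paths are not allowed")
        if ".." in relative.parts:
            raise ValueError(f"{name}: '..' components are not allowed")
        # Resolving follows symlinks, so a link that escapes the run is caught.
        resolved = (root / relative).resolve()
        if not resolved.is_relative_to(root):
            raise ValueError(f"{name}: path resolves outside the run directory")
        # is_file() is False for missing paths, directories, and special files.
        if not resolved.is_file():
            raise ValueError(f"{name}: not an existing regular file")
        validated[name] = relative.as_posix()
    return validated


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "00_run_metadata").mkdir()
    good = run_dir / "00_run_metadata" / "config.resolved.yaml"
    good.write_text("x: 1\n", encoding="utf-8")

    validated = validate_artifact_paths_sketch(
        run_dir, {"config_resolved": "00_run_metadata/config.resolved.yaml"}
    )
    rejected = []
    for bad in ("../outside.json", str(good), "00_run_metadata", "missing.json"):
        try:
            validate_artifact_paths_sketch(run_dir, {"bad": bad})
        except ValueError:
            rejected.append(bad)
    print(validated, len(rejected))
```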

`timestamp_utc`

def timestamp_utc() -> str

Return the current time as a UTC ISO-8601 string with second precision.
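
One way to produce such a string with the standard library is sketched below. timestamp_utc_sketch is a stand-in, and the "+00:00" offset form is an assumption; the real helper may use a different UTC designator such as "Z".

```python
from datetime import datetime, timezone


def timestamp_utc_sketch() -> str:
    """Current UTC time as ISO-8601, truncated to whole seconds."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


stamp = timestamp_utc_sketch()
parsed = datetime.fromisoformat(stamp)
print(stamp)
```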

Classes

`RunManifest`

@dataclass
class RunManifest

Machine-readable summary of one meridian-tools run.

Attribute	Type	Default	Description
`run_name`	`str`	required	Human-readable run name.
`config_path`	`Path`	required	Path to the authored YAML config file.
`output_dir`	`Path`	required	Path to the run directory.
`started_at`	`str`	required	UTC ISO-8601 start timestamp.
`manifest_version`	`int`	`CURRENT_MANIFEST_VERSION`	Schema version (0, 1, 2, 3, or 4).
`status`	`str`	`"running"`	Overall run status: `"running"`, `"completed"`, or `"failed"`.
`finished_at`	`str \| None`	`None`	UTC ISO-8601 finish timestamp. `None` while the run is in progress.
`meridian_tools_version`	`str`	`__version__`	Version of `meridian-tools`.
`meridian_version`	`str \| None`	`None`	Version of Google Meridian.
`artifacts`	`dict[str, str]`	`{}`	Top-level artefact index. Key artefacts from stages are promoted here.
`stages`	`list[StageRecord]`	`[]`	Ordered list of stage records (completed, skipped, and failed).

Class methods:

from_dict(payload: Mapping[str, Any]) -> RunManifest — Deserialise from a JSON-parsed dictionary. Supports manifest versions 0, 1, 2, 3, and 4 with default values for missing fields in older versions. Raises ValueError for unsupported versions or missing required fields.

Instance methods:

to_dict() -> dict[str, Any] — Serialise to a JSON-compatible dictionary.

`StageRecord`

@dataclass
class StageRecord

One pipeline stage entry in the run manifest.

Attribute	Type	Default	Description
`name`	`str`	required	Stage identifier (for example, `"00_run_metadata"`).
`status`	`str`	`"pending"`	Stage status: `"pending"`, `"running"`, `"completed"`, `"skipped"`, or `"failed"`.
`started_at`	`str \| None`	`None`	UTC ISO-8601 start timestamp.
`finished_at`	`str \| None`	`None`	UTC ISO-8601 finish timestamp.
`elapsed_seconds`	`float \| None`	`None`	Wall-clock seconds for stage execution.
`message`	`str \| None`	`None`	Human-readable message (skip reason or error detail).
`artifacts`	`dict[str, str]`	`{}`	Map of artefact names to relative paths. Empty for skipped stages.

Class methods:

from_dict(payload: Mapping[str, Any]) -> StageRecord — Deserialise from a JSON-parsed dictionary. Raises ValueError if name is missing.

`InputDataProvenance`

@dataclass(frozen=True)
class InputDataProvenance

Pinned input-data provenance payload used by manifest version 3 and 4 runs.

Attribute	Type	Default	Description
`authored_path`	`str`	required	Exact `data.path` string from the source YAML.
`resolved_path`	`str`	required	Absolute runtime path used for input loading.
`sha256`	`str`	required	SHA-256 digest of the resolved input file.
`size_bytes`	`int`	required	Input file size in bytes.
`mtime_utc`	`str`	required	Input file modification time in UTC ISO-8601 format.
`row_count`	`int`	required	Number of CSV data rows.
`column_count`	`int`	required	Number of CSV columns.
`ordered_columns`	`tuple[str, ...]`	required	CSV header order.
`provenance_version`	`int`	`INPUT_DATA_PROVENANCE_VERSION`	Payload schema version.

Class methods:

from_dict(payload: Mapping[str, Any]) -> InputDataProvenance — Deserialise from a JSON-parsed dictionary. Validates that the payload has exactly the pinned Phase 09 key set and value types.

Instance methods:

to_dict() -> dict[str, Any] — Serialise to the exact JSON payload written into input_data_provenance.json.
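
The fields in the table can be gathered from a CSV file with the standard library, as sketched below. collect_provenance_sketch is a hypothetical helper; how the library resolves data.path, reads the CSV, and formats mtime_utc is assumed rather than documented here.

```python
import csv
import hashlib
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def collect_provenance_sketch(authored_path: str, base_dir: Path) -> dict:
    """Build a provenance-style payload for one CSV input file."""
    resolved = (base_dir / authored_path).resolve()
    raw = resolved.read_bytes()
    with resolved.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        row_count = sum(1 for _ in reader)  # data rows only, header excluded
    info = resolved.stat()
    mtime = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "authored_path": authored_path,
        "resolved_path": str(resolved),
        "sha256": hashlib.sha256(raw).hexdigest(),
        "size_bytes": info.st_size,
        "mtime_utc": mtime.isoformat(timespec="seconds"),
        "row_count": row_count,
        "column_count": len(header),
        "ordered_columns": tuple(header),
    }


with tempfile.TemporaryDirectory() as tmp:
    content = b"time,geo,kpi\n2024-01-01,US-CA,10\n2024-01-08,US-CA,12\n"
    (Path(tmp) / "input.csv").write_bytes(content)
    provenance = collect_provenance_sketch("input.csv", Path(tmp))
    expected_digest = hashlib.sha256(content).hexdigest()
    print(provenance["row_count"], provenance["ordered_columns"])
```

Hashing the raw bytes rather than the parsed rows means any change to the file, including whitespace or line endings, changes the digest.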

Constants

`CURRENT_MANIFEST_VERSION`

CURRENT_MANIFEST_VERSION: int = 4

`SUPPORTED_MANIFEST_VERSIONS`

SUPPORTED_MANIFEST_VERSIONS: tuple[int, ...] = (0, 1, 2, 3, 4)

`INPUT_DATA_PROVENANCE_VERSION`

INPUT_DATA_PROVENANCE_VERSION: int = 1

`REQUIRED_MANIFEST_ARTIFACTS`

REQUIRED_MANIFEST_ARTIFACTS: tuple[str, ...] = (
    "config_resolved",
    "config_source",
    "input_data_provenance",
    "diagnostics_bundle",
)

These artefact entries are validated at run completion time by the runner. New runs must produce all four to complete successfully.

The lifecycle loader enforces config_source and config_resolved as required for all supported manifests. It also enforces input_data_provenance for manifest version 3 and 4 runs. diagnostics_bundle remains optional, so older or partial runs can still be loaded without it.
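
The difference between the completion-time check and the loader check can be sketched as two small functions. loader_required_artifacts and missing_for_loader are hypothetical names used for illustration only.

```python
from __future__ import annotations

# Mirrors the documented constant.
REQUIRED_MANIFEST_ARTIFACTS = (
    "config_resolved",
    "config_source",
    "input_data_provenance",
    "diagnostics_bundle",
)


def loader_required_artifacts(manifest_version: int) -> tuple[str, ...]:
    """Artefacts the lifecycle loader insists on for a given manifest version."""
    required = ["config_source", "config_resolved"]
    if manifest_version in (3, 4):
        required.append("input_data_provenance")
    return tuple(required)


def missing_for_loader(manifest_version: int, artifacts: dict) -> list[str]:
    """Names the loader would report as missing from the manifest."""
    return [
        name for name in loader_required_artifacts(manifest_version)
        if name not in artifacts
    ]


# A partial version 4 run without a diagnostics bundle still loads.
partial_run = {
    "config_source": "00_run_metadata/config.source.yaml",
    "config_resolved": "00_run_metadata/config.resolved.yaml",
    "input_data_provenance": "00_run_metadata/input_data_provenance.json",
}
missing_v4 = missing_for_loader(4, partial_run)

# The same run would not pass the stricter completion-time check.
missing_at_completion = [
    name for name in REQUIRED_MANIFEST_ARTIFACTS if name not in partial_run
]
print(missing_v4, missing_at_completion)
```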

Concepts

Background material on architecture, design decisions, and Meridian integration boundaries.

Why meridian-tools exists

Google Meridian is the modelling engine. meridian-tools is the workflow layer around that engine. It makes a Meridian modelling project easier to review, rerun, compare, hand over, and refresh without forking or modifying Meridian.

The package exists for agency MMM work where a fitted model is not the only deliverable. A team also needs to defend why one specification was selected, show exactly what was run, preserve the artefacts needed for review, and rerun or compare the work after the original notebook state has disappeared.

meridian-tools is therefore not a replacement for Meridian. It is the operating standard around Meridian.

The problem it solves

A bare modelling workflow can fit a Meridian model, but an agency workflow has more obligations:

defend a model choice on out-of-sample grounds, not only visual fit;
declare validation before fitting, rather than after seeing results;
keep the authored configuration and the resolved execution configuration;
record which input data, library versions, and artefacts produced the result;
hand a stable run directory to another analyst, reviewer, or client team;
refresh or compare stored runs without relying on notebook memory.

Those obligations are not theoretical. They are the difference between “this model ran on my machine” and “this model can be reviewed six months later”.

The evaluation gap

In-sample fit metrics such as R² or MAPE are useful diagnostic summaries, but they are not sufficient model-selection criteria. They are computed on data the model has already seen. A more flexible specification can improve in-sample fit while degrading on future weeks.

Expected log predictive density (ELPD) measures expected predictive performance on data the model has not seen; PSIS-LOO and WAIC estimate it from the fitted posterior. That is the relevant question when choosing between candidate MMM specifications: which specification is most likely to generalise?

meridian-tools adds a compatibility-aware model-selection layer on top of Meridian and ArviZ:

Need	`meridian-tools` surface
Compute PSIS-LOO	`compute_loo(...)`
Compute WAIC	`compute_waic(...)`
Compare candidate models	`compare_models(...)`
Inspect pointwise reliability	`loo_pointwise.csv` with `pareto_k`
Record incompatibility	`model_selection_status.json` with a reason code
Record ArviZ warnings	`model_selection_warnings.json`

The important integration point is log-likelihood reconstruction. ArviZ needs a pointwise log_likelihood group before it can compute LOO or WAIC. A fitted Meridian model does not expose that group as a ready-to-use workflow artefact. meridian-tools reconstructs it for supported Meridian versions and passes a temporary InferenceData copy to ArviZ. The original Meridian model is not mutated.

Model comparison is still a statistical judgement. In the comparison table, elpd_diff must be read against dse, the standard error of the difference. If the ELPD difference is small relative to its uncertainty, prefer the simpler or more interpretable specification rather than treating rank as a mechanical decision rule.
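As a rough illustration of reading elpd_diff against dse, the sketch below flags a difference as decisive only when it exceeds a multiple of its standard error. The helper name is_decisive, the example rows, and the multiplier of 2 are assumptions made for illustration; they are not meridian-tools defaults, and the rule is a reading aid rather than a decision procedure.

```python
def is_decisive(elpd_diff: float, dse: float, multiplier: float = 2.0) -> bool:
    """Return True when |elpd_diff| clearly exceeds its standard error.

    The multiplier is an illustrative assumption, not a package default.
    """
    if dse < 0:
        raise ValueError("dse must be non-negative")
    if dse == 0:
        # The top-ranked row is compared with itself, so both values are zero.
        return abs(elpd_diff) > 0
    return abs(elpd_diff) > multiplier * dse


# Hypothetical comparison rows; real values come from the comparison table.
rows = [
    {"model": "spec_a", "elpd_diff": 0.0, "dse": 0.0},
    {"model": "spec_b", "elpd_diff": 1.8, "dse": 2.4},
    {"model": "spec_c", "elpd_diff": 15.0, "dse": 3.1},
]

verdicts = {row["model"]: is_decisive(row["elpd_diff"], row["dse"]) for row in rows}
```

Rows that are not decisive are the cases where the simpler or more interpretable specification is the safer choice.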

A principled Bayesian workflow

meridian-tools keeps ownership explicit. Meridian remains responsible for the model. The wrapper owns the workflow controls around the model.

Workflow stage	Where it happens	Owner
Prior specification and contract validation	YAML config and `priors.py`	`meridian-tools`
Holdout planning before fitting	`validation` config and `cv.py`	`meridian-tools`
Posterior sampling	Meridian	Meridian
Meridian diagnostics	Meridian outputs staged by the runner	Meridian, packaged by `meridian-tools`
Out-of-sample information criteria	`compute_loo(...)`, `compute_waic(...)`	`meridian-tools`
Candidate model comparison	`compare_models(...)`	`meridian-tools`
Stored run refresh and comparison	`lifecycle.py`	`meridian-tools`

Declaring validation in configuration is itself a control. It prevents the holdout window from being chosen after results are known. blocked_tail executes one contiguous tail holdout. rolling_origin materialises multiple expanding-window splits through the Python API. In both cases, validation fits are kept separate from the final full-sample production fit.
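A blocked-tail holdout can be pictured as a boolean mask over the time axis. The sketch below uses a simplified, hypothetical helper (blocked_tail_mask is not the package API and ignores the geo dimension); it only shows that the held-out block is contiguous and sits at the end of the series.

```python
def blocked_tail_mask(n_times: int, test_size: int) -> list[bool]:
    """Mark the last `test_size` time periods as held out (True)."""
    if not 0 < test_size < n_times:
        raise ValueError("test_size must leave at least one training period")
    first_test_index = n_times - test_size
    return [t >= first_test_index for t in range(n_times)]


# Six weekly observations with the final two reserved for validation.
mask = blocked_tail_mask(n_times=6, test_size=2)
```

Because the mask depends only on the series length and the declared test size, it can be fixed in configuration before any model is fitted.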

Reproducibility and governance

Every completed run is a directory that can be inspected without the original notebook. The key files are:

Artefact	What it answers
`00_run_metadata/config.source.yaml`	What did the analyst author?
`00_run_metadata/config.resolved.yaml`	What did the system actually run?
`00_run_metadata/input_data_provenance.json`	Which data snapshot was loaded?
`run_manifest.json`	Which stages ran, with which versions and artefacts?
`30_model_assessment/*`	What diagnostics and model-selection evidence were produced?

The provenance record includes the authored path, resolved path, SHA-256 hash, row count, column count, ordered columns, file size, modification time, and provenance schema version. The manifest records meridian-tools and Meridian versions, timestamps, run status, stage status, and a validated inventory of top-level artefacts.
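To make the provenance fields concrete, the following sketch builds a comparable record for a CSV file using only the standard library. The function name describe_input, the dictionary keys, and the schema version value are assumptions for illustration; the actual layout of input_data_provenance.json is defined by the package, not by this example.

```python
import csv
import hashlib
import tempfile
from pathlib import Path


def describe_input(authored_path: str, base_dir: Path) -> dict:
    """Collect provenance-style facts about a CSV input (hypothetical keys)."""
    resolved = (base_dir / authored_path).resolve()
    digest = hashlib.sha256(resolved.read_bytes()).hexdigest()
    with resolved.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        columns = next(reader)              # header row, order preserved
        row_count = sum(1 for _ in reader)  # data rows only
    stat = resolved.stat()
    return {
        "schema_version": 1,  # illustrative value
        "authored_path": authored_path,
        "resolved_path": str(resolved),
        "sha256": digest,
        "row_count": row_count,
        "column_count": len(columns),
        "columns": columns,
        "size_bytes": stat.st_size,
        "modified_time": stat.st_mtime,
    }


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "data.csv").write_bytes(
        b"time,geo,kpi\n2024-01-01,north,10\n2024-01-08,north,12\n"
    )
    record = describe_input("data.csv", base)
```

The hash is the field a reviewer would compare first: two runs with the same SHA-256 loaded byte-identical input files.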

That gives a reviewer three concrete answers:

What was modelled? The archived source config and data provenance.
What did the system execute? The resolved config and manifest.
Can it be rerun or compared later? The run record, lifecycle helpers, seeds, pinned dependency boundary, and stored artefact paths.

Runs written with manifest version 4 also validate completed artefact paths before the final manifest is written. Each recorded artefact path must be relative, must stay under the run directory, and must resolve to an existing regular file. A completed manifest should not point to missing files or paths outside the run directory.

Why not modify Meridian directly?

Meridian remains an unmodified upstream dependency. The current supported boundary pins google-meridian[schema]==1.5.3 and constrains related runtime dependencies where needed for that version.

A private fork or local patch set would increase upgrade risk, complicate support, and blur the boundary between Google’s modelling library and agency-specific workflow requirements. Keeping the split explicit means a Meridian upgrade is a version change, a compatibility review, and a release gate, not a merge from a private modelling fork.

This is also the right separation of concerns. Meridian should focus on the MMM model. meridian-tools should focus on the repeatable agency workflow around that model.

Capability map

This table describes the repository’s supported Meridian 1.5.3 workflow boundary. It is not a claim about every possible Meridian usage pattern or future Meridian release.

Capability	Meridian role	`meridian-tools` role
Core MMM fitting	Owns the model and posterior sampling	Delegates to Meridian
Model serialisation	Provides schema serialisation	Stages `meridian_model.binpb` in the run directory
Project configuration	Accepts model inputs through Meridian APIs	Provides a typed YAML project surface
Validation planning	Accepts holdout masks	Builds blocked-tail, rolling-origin, and authored-holdout run specs
Diagnostics	Provides diagnostic outputs and plots	Packages diagnostics into stable artefacts
LOO and WAIC	Supplies fitted model state	Reconstructs log likelihood and calls ArviZ
Model comparison	Not the workflow owner	Provides `compare_models(...)` and run artefacts
Run provenance	Not the workflow owner	Writes config archives, data provenance, and manifests
Stored run lifecycle	Not the workflow owner	Loads, compares, and refreshes stored run records
Client handoff	Analyst-owned without a wrapper	Provides a predictable staged output directory

When should a team use this?

Use meridian-tools when at least one of these is true:

you need to compare candidate Meridian specifications;
the model will be refreshed later;
another analyst must review or rerun the work;
the result will be handed to a client or internal governance process;
you need a repeatable CLI or YAML workflow rather than notebook-only state.

Do not add the wrapper for its own sake. A single exploratory Meridian fit with no comparison, no refresh cadence, and no handoff requirement may not need this layer. The value appears when the model becomes part of an operating process. For a shorter decision note, see the adoption brief.

What this does not do

The boundary is deliberately narrow.

It does not replace Meridian’s model or sampler.
It does not compute convergence diagnostics such as R-hat, ESS, or divergences itself. It stages Meridian diagnostic outputs and adds workflow artefacts around them.
It does not provide a full prior-predictive checking policy. It can request Meridian prior sampling through fit.sample_prior_draws, validates configured prior contracts, and records resolved prior distributions, but it does not add a wrapper-owned pass/fail rule for prior predictive checks.
It does not make LOO or WAIC available for holdout-fitted models. Those runs record model_selection_status.json because comparing holdout-fit ELPD to full-fit ELPD would be statistically ambiguous.
It does not make a run perfectly reproducible across all hardware, dependency versions, random seeds, or future Meridian releases. It records the conditions needed for a bounded, reviewable rerun.

These limits are intentional. A wrapper that hides Meridian’s responsibilities would be harder to trust. meridian-tools is valuable because it makes the boundary visible.

The short version

Meridian gives the agency a modelling engine. meridian-tools gives the agency a repeatable scientific workflow around that engine: validation plans, model-selection evidence, provenance, manifests, stable artefacts, and lifecycle operations.

Adoption brief for agency teams

This page is the short version for teams deciding whether to use meridian-tools around Google Meridian.

Recommendation

Use meridian-tools when a Meridian model must be reviewed, compared, refreshed, or handed to another team. Do not use it merely because it exists. A one-off exploratory Meridian fit with no model comparison, no refresh cadence, and no handoff requirement may not need this layer.

What breaks without the wrapper?

Agency requirement	Risk without `meridian-tools`
Defensible model choice	Candidate specifications can be selected on in-sample fit or visual judgement alone.
Reviewable execution	The authored config, resolved config, data provenance, and tool versions can be scattered across notebooks and local state.
Client or internal handoff	The output set can depend on analyst habits rather than a stable directory contract.
Refresh cadence	A later run may not know exactly which config, data snapshot, validation window, or package versions created the earlier result.
Compatibility management	Meridian upgrades can silently affect wrapper-owned seams such as schema serialisation or log-likelihood reconstruction unless they are gated.

What does adoption cost?

Cost	Practical meaning
Dependency boundary	Use the supported Meridian-compatible environment rather than arbitrary package versions.
YAML config	Each project needs one reviewed config that defines data mapping, model settings, fit settings, validation, and exports.
Run directory convention	Analysts inspect staged artefacts under a predictable output directory rather than ad hoc notebook outputs.
Validation discipline	Holdout choices are declared before fitting and validation fits are not reused as final production fits.

What does the team get?

Need	Evidence produced
What was modelled?	`config.source.yaml` and input-data provenance.
What actually ran?	`config.resolved.yaml` and `run_manifest.json`.
Which model should we prefer?	LOO/WAIC summaries, pointwise diagnostics, and `model_comparison.csv` for compatible final fits.
Can another analyst inspect it?	Staged assessment, decomposition, response-curve, optimisation, and manifest artefacts.
Can it be refreshed later?	Lifecycle helpers that load, compare, and refresh stored run records.

What does it not do?

meridian-tools does not replace Meridian’s model, sampler, or diagnostics. It does not make every run perfectly reproducible across all hardware and future dependency versions. It records the conditions needed for a bounded, reviewable rerun.

It also does not make LOO or WAIC meaningful for holdout-fitted models. Those runs deliberately record model_selection_status.json instead.

Decision rule

Adopt the wrapper when the model is part of an operating process. Skip it when the work is a disposable exploration.

If the model is going to a client, an internal reviewer, or a future refresh, the wrapper is not overhead. It is the audit trail.

Architecture

meridian-tools is a companion package designed for agency teams that use Google Meridian as their client-facing MMM (Marketing Mix Modelling) engine. It provides a stricter, more reproducible workflow around Meridian without forking the upstream library.

Core philosophy

No forking — meridian-tools strictly wraps Meridian. It does not modify Meridian’s internal code or model implementations.
Bounded reproducibility — Runs are driven by typed YAML configurations, archived source/resolved configs, manifest metadata, and input-data provenance. These records support repeatable execution in the documented dependency environment, but they do not guarantee identical posterior draws across all hardware, dependency versions, random seeds, or Meridian changes.
Structured workflow — The package enforces a staged execution pipeline (validation, model fit, assessment, decomposition, response curves, optimisation).
Lifecycle management — Runs are treated as immutable artefacts with rich metadata, allowing for easy comparison, refreshing, and storage.

Module map

meridian_tools/
├── __init__.py          Lazy-loading package exports
├── artifacts.py         Manifest and JSON helpers
├── cli.py               CLI entry point (argparse)
├── config.py            Pydantic YAML models
├── cv.py                Validation split logic
├── demo.py              Bundled demo discovery
├── diagnostics.py       Diagnostics export
├── exports.py           Meridian analysis surface wrappers
├── launcher.py          Run execution wrapper
├── lifecycle.py         Post-run record management
├── log_likelihood.py    Log-likelihood reconstruction adapter
├── model_selection.py   ArviZ LOO/WAIC wrappers
├── priors.py            Prior specification and contract validation
├── runner.py            Pipeline orchestration
├── terminal.py          CLI presentation and warning grouping
└── version.py           Static version

Layered import design

Meridian and TensorFlow are never imported at module level in the configuration, validation, or CLI layers. This means lightweight operations avoid the multi-second TensorFlow import and stay responsive:

Operation	Imports loaded
`meridian-tools --help`	`pydantic`, `yaml`
`load_yaml_config(path)`	`pydantic`, `yaml`
`build_validation_plan(...)`	`numpy`
`run_pipeline(...)`	Everything (Meridian, TF, ArviZ, etc.)

The __init__.py uses __getattr__-based lazy loading so that import meridian_tools does not trigger heavy dependency imports.
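The mechanism behind this is the module-level __getattr__ hook (PEP 562). The sketch below reproduces the pattern on a throwaway module object so it can run on its own; the helper make_lazy_namespace and the export table are hypothetical, and standard-library modules stand in for the heavy dependencies.

```python
import importlib
import types


def make_lazy_namespace(name: str, exports: dict[str, str]) -> types.ModuleType:
    """Build a module whose attributes are imported on first access.

    `exports` maps an attribute name to the module that provides it.
    """
    namespace = types.ModuleType(name)

    def __getattr__(attr: str):
        try:
            module_name = exports[attr]
        except KeyError:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}") from None
        value = getattr(importlib.import_module(module_name), attr)
        setattr(namespace, attr, value)  # cache so the hook runs once per name
        return value

    # Python consults a module's own __getattr__ when normal lookup fails.
    namespace.__getattr__ = __getattr__
    return namespace


demo = make_lazy_namespace("demo_pkg", {"dumps": "json", "sqrt": "math"})

before = "dumps" in vars(demo)   # nothing resolved yet
encoded = demo.dumps([1, 2])     # first access triggers the import
after = "dumps" in vars(demo)    # the resolved attribute is now cached
```

In a real package the same function body would sit directly in __init__.py, with the export table pointing at submodules rather than standard-library modules.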

Pipeline execution model

The runner executes stages sequentially. Each stage:

Creates a StageRecord and appends it to the in-memory manifest.
Calls the stage function, which returns a dict[str, Path] of artefacts.
Normalises artefact paths to be relative to the run directory.
Writes the updated manifest to disk.

This design means a crash mid-pipeline leaves a readable partial manifest on disk. Because the manifest is written only after a stage finishes, the last entry in the on-disk stages array is the last successfully completed stage.

┌─────────────────────┐
│  00_run_metadata    │  Archive source + resolved configs
├─────────────────────┤
│  10_validation      │  Write validation spec (if applicable)
├─────────────────────┤
│  20_model_fit       │  Build data → build model → sample posterior
├─────────────────────┤
│  30_model_assessment│  Diagnostics + model selection + summary
├─────────────────────┤
│  40_decomposition   │  Summary metrics (NetCDF + CSV)
├─────────────────────┤
│  60_response_curves │  Response curves (if configured)
├─────────────────────┤
│  70_optimisation    │  Budget optimisation (if configured)
└─────────────────────┘

The numbering gap at 50 reserves space for future stages without renumbering.

Configuration architecture

The separation between authored YAML and runtime-only config is strict:

MeridianToolsConfig — Pydantic model for the YAML file. Owns project metadata, data paths, model spec, fit settings, validation strategy, and export switches.
PipelineRunConfig — Frozen dataclass for runtime options. Owns output directory, run name, and concrete validation spec.
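The value of a frozen dataclass for runtime options is that they cannot drift after the run starts. The sketch below uses a stand-in class, RuntimeOptions, with the three runtime-only field names mentioned in this documentation; the field types and defaults are assumptions, not the real PipelineRunConfig definition.

```python
import dataclasses
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Optional


@dataclass(frozen=True)
class RuntimeOptions:
    """Stand-in for a frozen runtime-options object (types are assumed)."""

    output_dir: Path
    run_name: Optional[str] = None
    validation_spec: Optional[dict[str, Any]] = None


options = RuntimeOptions(output_dir=Path("runs"), run_name="baseline")

try:
    options.run_name = "changed"  # frozen dataclasses reject assignment
    mutated = True
except dataclasses.FrozenInstanceError:
    mutated = False

# A changed option means a new object, leaving the original untouched.
renamed = dataclasses.replace(options, run_name="refresh")
```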

The runner writes two config copies to each run directory:

config.source.yaml — Verbatim copy of the input YAML.
config.resolved.yaml — After relative path resolution. Never includes runtime-only fields.

Artefact path normalisation

All artefact paths in manifests are stored relative to the run directory and validated as regular files beneath that directory. Runs written with manifest version 4 reject absolute paths, lexical .. components, paths that resolve outside the run directory, directories, missing paths, and special files. This makes run directories portable while keeping manifest consumers fail-closed.

The lifecycle layer resolves accepted paths back to absolute paths at load time.
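The rejection rules above can be approximated with pathlib. This is a minimal sketch under stated assumptions: validate_artefact_path and is_accepted are hypothetical helpers, recorded paths are assumed to use forward slashes, and the real validator may apply its checks differently or in a different order.

```python
import tempfile
from pathlib import Path, PurePosixPath


def validate_artefact_path(run_dir: Path, recorded: str) -> Path:
    """Return the absolute path for a recorded artefact, or raise ValueError."""
    candidate = PurePosixPath(recorded)
    if candidate.is_absolute():
        raise ValueError(f"absolute path not allowed: {recorded}")
    if ".." in candidate.parts:
        raise ValueError(f"parent traversal not allowed: {recorded}")
    root = run_dir.resolve()
    resolved = root.joinpath(*candidate.parts).resolve()
    if not resolved.is_relative_to(root):  # e.g. a symlink pointing elsewhere
        raise ValueError(f"path escapes the run directory: {recorded}")
    if not resolved.is_file():  # missing paths, directories, special files
        raise ValueError(f"not an existing regular file: {recorded}")
    return resolved


def is_accepted(run_dir: Path, recorded: str) -> bool:
    try:
        validate_artefact_path(run_dir, recorded)
    except ValueError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "run"
    (run_dir / "30_model_assessment").mkdir(parents=True)
    (run_dir / "30_model_assessment" / "artefact.csv").write_text("x\n", encoding="utf-8")
    (Path(tmp) / "outside.txt").write_text("x", encoding="utf-8")
    checks = {
        recorded: is_accepted(run_dir, recorded)
        for recorded in (
            "30_model_assessment/artefact.csv",
            "30_model_assessment",
            "missing.csv",
            "../outside.txt",
            "/etc/hosts",
        )
    }
```

Only the first entry is accepted; the others fail as a directory, a missing file, a parent traversal, and an absolute path respectively.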

Meridian coupling boundaries

Coupling level	Modules	Surface used
Public API	`runner.py`, `exports.py`	`Meridian`, `ModelSpec`, `CsvDataLoader`, `Analyzer`, `Summarizer`, `BudgetOptimizer`
Semi-public	`log_likelihood.py`, `model_selection.py`, `exports.py`	`model_context`, `inference_data`, `input_data`, `posterior_sampler_callable`
Private	`log_likelihood.py`	`_get_joint_dist_unpinned`, `_prepare_latents_for_reconstruction`, `_reconstruct_posteriors`

The private-API coupling is confined to log_likelihood.py and guarded by a compatibility check that raises a structured ModelSelectionError instead of failing the run. See Meridian integration for details.

Data flow

Input — A typed YAML file defines the entire run scope.
Initialisation — The runner resolves the config and creates a timestamped run directory.
Execution — The pipeline steps through stages, maintaining a central state dictionary with the fitted model and intermediate results.
Export — Each stage writes specific artefacts to disk within the run directory.
Finalisation — The manifest is completed with status: "completed" and finished_at, locking the run state.
Lifecycle — Downstream processes or analysts consume artefacts or use lifecycle tools to compare, refresh, or audit runs.

Design decisions

This document records the key design decisions in meridian-tools and the reasoning behind them. It is intended for maintainers and contributors who need to understand why things are built the way they are.

No IID cross-validation

Decision: meridian-tools does not implement random-shuffle or naive k-fold cross-validation.

Reasoning: MMM data is time series. Random IID splits break temporal structure, leading to data leakage where future observations inform training and past observations appear in the test set. This produces optimistic accuracy estimates that do not reflect real-world forecasting performance.

The package provides two time-respecting alternatives:

Blocked tail — reserves the most recent observations as a single test block.
Rolling origin — expanding-window forward-chaining that respects temporal ordering at every split.

Non-overlapping rolling-origin test windows

Decision: step_size must equal test_size for rolling-origin splits.

Reasoning: Overlapping test windows would mean the same observation appears in multiple test sets. This violates the independence assumption needed for comparing validation scores across splits and complicates the interpretation of aggregate metrics. Non-overlapping windows ensure each observation is evaluated at most once across the split plan; observations in the initial training window are never part of a test set.

Minimum two splits for rolling origin

Decision: build_rolling_origin_splits requires at least two splits.

Reasoning: A single rolling-origin split is functionally identical to a blocked-tail holdout and provides no comparative signal. If your data only supports one split, use blocked_tail instead — it communicates the intent more clearly.
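The two rolling-origin decisions above (step_size equal to test_size, and at least two splits) can be seen together in a small sketch. The function rolling_origin_splits and its parameter initial_train_size are hypothetical; this is not the signature of build_rolling_origin_splits, only an illustration of expanding-window forward chaining.

```python
def rolling_origin_splits(
    n_times: int, initial_train_size: int, test_size: int
) -> list[tuple[range, range]]:
    """Expanding-window splits with non-overlapping test windows."""
    if initial_train_size < 1 or test_size < 1:
        raise ValueError("initial_train_size and test_size must be positive")
    step_size = test_size  # equal step and test size prevents overlap
    splits = []
    train_end = initial_train_size
    while train_end + test_size <= n_times:
        train = range(train_end)                        # always starts at 0
        test = range(train_end, train_end + test_size)  # the block that follows
        splits.append((train, test))
        train_end += step_size
    if len(splits) < 2:
        raise ValueError("fewer than two splits fit; use a blocked-tail holdout")
    return splits


splits = rolling_origin_splits(n_times=10, initial_train_size=4, test_size=2)
test_windows = [list(test) for _, test in splits]
train_sizes = [len(train) for train, _ in splits]
```

With ten periods, an initial window of four, and a test size of two, the training window grows from four to six to eight periods while the test windows tile the remaining periods without overlap.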

Holdout restriction for model selection

Decision: LOO and WAIC are only available for models where holdout_id is None.

Reasoning: LOO and WAIC estimate expected log predictive density (ELPD) using the full observed likelihood surface. A model fitted with a holdout mask has a modified likelihood that excludes held-out observations. Computing LOO on this truncated likelihood would produce ELPD estimates that are not comparable to those from full-sample fits.

The correct workflow is:

Use validation splits for candidate evaluation.
Select the best specification based on holdout performance.
Refit the chosen specification on the full dataset.
Compute LOO/WAIC on the full-sample fit for model quality reporting.

Separation of validation fits and final fits

Decision: Validation runs and final production fits are separate pipeline executions that produce separate run directories.

Reasoning: A validation fit is trained on a subset of the data. Its posterior reflects that subset and should not be used as the production artefact. Keeping them as separate runs prevents accidental use of a validation fit for downstream analysis or reporting.

Lazy imports for CLI responsiveness

Decision: Heavy dependencies (TensorFlow, NumPy, Meridian, ArviZ) are not imported at module level in the config, CLI, or validation layers.

Reasoning: TensorFlow alone takes several seconds to import. The CLI must respond instantly for --help and --list operations. The __init__.py uses __getattr__-based lazy loading, and the test suite verifies that build_parser() only loads pydantic and yaml.

Pydantic `extra="forbid"` everywhere

Decision: All configuration models reject unexpected keys.

Reasoning: Silent acceptance of unknown keys is a common source of misconfiguration in YAML-driven tools. A typo like export_pridictive_accuracy would be silently ignored without extra="forbid", leading to unexpected default behaviour. Strict rejection catches these errors at config load time with clear error messages.

Relative artefact paths in manifests

Decision: All artefact paths in run_manifest.json are stored relative to the run directory.

Reasoning: Absolute paths would tie run directories to a specific machine or filesystem layout. Relative paths make run directories portable — they can be copied, archived, or moved between machines without breaking the manifest contract.

Non-destructive lifecycle operations

Decision: refresh_run creates a new sibling directory rather than overwriting the source.

Reasoning: Overwriting a validated production run would destroy the audit trail. Creating a sibling preserves the original for comparison and rollback. The lifecycle layer explicitly validates that source directories are not mutated by refresh operations.

Manifest-per-stage persistence

Decision: The manifest is written to disk after each stage completes, not only at the end of the pipeline.

Reasoning: MCMC sampling can run for minutes to hours. If the process crashes or is killed during a later stage, the partial manifest on disk reflects what completed successfully. This aids debugging and allows partial runs to be inspected without special tooling.
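A minimal sketch of per-stage persistence follows. The names StageRecord and run_manifest.json come from this documentation, but the record fields, the run_stages helper, and the stage functions are assumptions written for illustration, not the runner's implementation.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class StageRecord:
    name: str
    status: str = "running"
    artefacts: dict[str, str] = field(default_factory=dict)


def _write_manifest(run_dir: Path, manifest: dict) -> None:
    payload = {**manifest, "stages": [asdict(s) for s in manifest["stages"]]}
    (run_dir / "run_manifest.json").write_text(json.dumps(payload, indent=2), encoding="utf-8")


def run_stages(run_dir: Path, stages: list[tuple[str, Callable[[Path], dict[str, Path]]]]) -> dict:
    manifest = {"status": "running", "stages": []}
    for name, stage_fn in stages:
        record = StageRecord(name=name)
        manifest["stages"].append(record)   # in memory only at this point
        artefacts = stage_fn(run_dir)       # may raise; nothing new is written
        record.artefacts = {
            key: path.relative_to(run_dir).as_posix() for key, path in artefacts.items()
        }
        record.status = "completed"
        _write_manifest(run_dir, manifest)  # persist after each completed stage
    manifest["status"] = "completed"
    _write_manifest(run_dir, manifest)
    return manifest


def write_metadata(run_dir: Path) -> dict[str, Path]:
    target = run_dir / "00_run_metadata" / "note.txt"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("ok", encoding="utf-8")
    return {"note": target}


def failing_fit(run_dir: Path) -> dict[str, Path]:
    raise RuntimeError("simulated crash during sampling")


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    try:
        run_stages(run_dir, [("00_run_metadata", write_metadata), ("20_model_fit", failing_fit)])
    except RuntimeError:
        pass
    on_disk = json.loads((run_dir / "run_manifest.json").read_text(encoding="utf-8"))
```

After the simulated crash, the manifest on disk still lists only the metadata stage, which is the behaviour the decision above is meant to guarantee.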

Stage numbering with gaps

Decision: Pipeline stages use numbers 00, 10, 20, 30, 40, 60, 70 with a gap at 50.

Reasoning: The gaps allow future stages to be inserted at natural positions (e.g. a stage 50 for custom analysis) without renumbering existing stages. Renumbering would break backward compatibility with stored manifests and any downstream tooling that references stage names.

Config source vs. resolved archival

Decision: Both the verbatim source YAML and the resolved YAML are archived in every run directory.

Reasoning: The source YAML shows what the analyst authored (including relative paths). The resolved YAML shows the runtime interpretation (absolute paths, defaults applied). Both are needed for reproducibility:

The source is needed to understand intent.
The resolved config is needed to reproduce the exact execution.

Runtime-only fields (output_dir, run_name, validation_spec) are deliberately excluded from the resolved config because they are not part of the reproducible model specification.

Structured model selection errors

Decision: Model selection failures produce ModelSelectionError with a machine-readable reason_code rather than generic exceptions.

Reasoning: The pipeline needs to distinguish between “model selection is not possible for this run type” (expected) and “something is broken” (unexpected). Structured reason codes allow:

The runner to write model_selection_status.json without failing the run.
The lifecycle layer to compare model selection availability across runs.
Downstream consumers to programmatically handle different failure modes.
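The pattern can be sketched as follows. The exception class here is a stand-in with the same name and the reason_code keyword shown elsewhere in this documentation; the status payload keys and the run_model_selection helper are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


class ModelSelectionError(Exception):
    """Stand-in for the package exception; carries a machine-readable code."""

    def __init__(self, message: str, *, reason_code: str) -> None:
        super().__init__(message)
        self.reason_code = reason_code


def run_model_selection(assessment_dir: Path, compute: Callable[[], dict]) -> dict:
    """Record expected unavailability as a status file; let other errors propagate."""
    try:
        summary = compute()
    except ModelSelectionError as exc:
        status = {"available": False, "reason_code": exc.reason_code, "message": str(exc)}
        (assessment_dir / "model_selection_status.json").write_text(
            json.dumps(status, indent=2), encoding="utf-8"
        )
        return status
    return {"available": True, "summary": summary}


def incompatible_seam() -> dict:
    raise ModelSelectionError(
        "required sampler methods are missing",
        reason_code="meridian_internal_seam_incompatible",
    )


with tempfile.TemporaryDirectory() as tmp:
    assessment_dir = Path(tmp)
    outcome = run_model_selection(assessment_dir, incompatible_seam)
    status_on_disk = json.loads(
        (assessment_dir / "model_selection_status.json").read_text(encoding="utf-8")
    )
    ok = run_model_selection(assessment_dir, lambda: {"elpd_loo": -123.4})
```

Only ModelSelectionError is caught, so an unexpected failure such as a RuntimeError still surfaces as a real error rather than being recorded as an expected status.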

Meridian integration

This document describes how meridian-tools integrates with Google Meridian, the boundaries of that integration, and the risks associated with different coupling levels.

Integration philosophy

meridian-tools wraps Meridian without forking it. Meridian remains the modelling engine; meridian-tools adds workflow orchestration, validation, diagnostics bundling, model selection, and lifecycle management on top.

This approach means:

Meridian upgrades can be adopted without merging fork changes.
The upstream project’s API stability directly affects meridian-tools.
Any use of Meridian-internal APIs must be explicitly managed.

Coupling levels

Public API (low risk)

These are documented, versioned Meridian surfaces:

Surface	Used by
`Meridian` (model class)	`runner.py`
`ModelSpec`	`runner.py`
`CsvDataLoader`, `CoordToColumns`	`runner.py`
`Analyzer`	`exports.py`, `diagnostics.py`
`Summarizer`	`exports.py`
`BudgetOptimizer`	`exports.py`
`ModelReviewer`	`diagnostics.py`
`MediaEffects`, `MediaSummary`, `ModelDiagnostics`, `ModelFit`	`exports.py`
`save_meridian` (schema serde)	`exports.py`

These are unlikely to break without a Meridian major version bump. The exact google-meridian[schema]==1.5.3 pin keeps these assumptions aligned with the validated release baseline.

Semi-public API (medium risk)

These are accessible attributes on Meridian model objects that are used but not formally documented as stable:

Surface	Used by	Purpose
`model.inference_data`	`log_likelihood.py`, `model_selection.py`	Access ArviZ InferenceData
`model.model_context`	`log_likelihood.py`, `exports.py`	Access model structure
`model.input_data`	`exports.py`	Access input data for spend computation
`model.posterior_sampler_callable`	`log_likelihood.py`	Access posterior sampler

These are stable in practice (they are used by Meridian’s own analysis surfaces) but are not guaranteed to be stable across versions.

Private API (high risk)

These are _-prefixed methods on Meridian’s posterior_sampler_callable, used exclusively in log_likelihood.py for log-likelihood reconstruction:

_get_joint_dist_unpinned
_prepare_latents_for_reconstruction
_reconstruct_posteriors

These methods are Meridian-internal and may change or be removed in any Meridian release, including patch versions. They are necessary because Meridian does not provide a public API for pointwise log-likelihood computation.

Risk mitigation

Compatibility guard

log_likelihood.py checks for the presence of all three private methods before attempting reconstruction:

required_sampler_methods = (
    "_get_joint_dist_unpinned",
    "_prepare_latents_for_reconstruction",
    "_reconstruct_posteriors",
)
if any(not hasattr(posterior_sampler, method) for method in required_sampler_methods):
    raise ModelSelectionError(
        "...",
        reason_code="meridian_internal_seam_incompatible",
    )

If any method is missing, the error is caught and recorded as a model_selection_status.json artefact with reason_code: meridian_internal_seam_incompatible. The rest of the pipeline continues normally.

Graceful degradation

Model selection incompatibility is non-fatal at every level:

log_likelihood.py raises ModelSelectionError with a structured code.
model_selection.py propagates the error.
runner.py catches it, writes model_selection_status.json, and continues.
The manifest records the assessment stage as completed.
The lifecycle layer can inspect model_selection_status to understand why model selection was unavailable.

Version pinning

The pyproject.toml pins Meridian to google-meridian[schema]==1.5.3 and constrains protobuf to >=5.28.0,<7 for Meridian schema serialisation. Any Meridian or protobuf-bound upgrade must refresh the private log-likelihood reconstruction and schema-save baselines before the guards are relaxed.

Integration testing

The test suite includes a gated live Meridian verification command:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v

This command proves two different real seams:

one reduced real pipeline run over bundled demo data, including stored-run refresh after the original YAML is removed
the lower-level live log-likelihood reconstruction path

It is excluded from the default test suite because it requires real MCMC sampling, but it should be run after every Meridian version upgrade.

Constants dependency

log_likelihood.py uses Meridian constants for posterior parameter names:

from meridian import constants
# constants.BETA_GM, constants.TAU_G, constants.ETA_M, etc.

These are stable string constants but are not versioned. A Meridian release that renames these constants would cause import-time failures.

Unsaved posterior parameter recovery

Meridian does not persist all posterior parameters to InferenceData. The _recover_unsaved_state function in log_likelihood.py reconstructs:

tau_g_excl_baseline — Recovered from the posterior’s tau_g variable by slicing out the baseline geo index (concatenating the elements before and after baseline_geo_idx).
Geo deviations — Recovered from the posterior by solving deviation = (target - base) / scale for normal effects, or deviation = (log(target) - base) / scale for log-normal effects, with a scale == 0 guard that maps to zero.

This recovery is mathematically correct for the supported model families (log-normal and normal media effects). It is tested against both geo-panel and national models in test_log_likelihood.py.
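The two recovery formulas can be written out for scalar values as a sanity check. This is a simplified sketch with hypothetical helper names; the real _recover_unsaved_state operates on posterior tensors, and the argument names here are assumptions.

```python
import math


def drop_baseline(tau_g: list[float], baseline_geo_idx: int) -> list[float]:
    """Remove the baseline geo by joining the elements before and after it."""
    if not 0 <= baseline_geo_idx < len(tau_g):
        raise IndexError("baseline_geo_idx out of range")
    return tau_g[:baseline_geo_idx] + tau_g[baseline_geo_idx + 1:]


def recover_deviation(target: float, base: float, scale: float, *, log_normal: bool) -> float:
    """Invert target = base + scale * deviation (on the log scale if log-normal)."""
    if scale == 0:
        return 0.0  # guard: a zero scale maps to a zero deviation
    value = math.log(target) if log_normal else target
    return (value - base) / scale


tau_excl = drop_baseline([0.1, 0.2, 0.3], baseline_geo_idx=1)
normal_dev = recover_deviation(1.5, base=0.5, scale=2.0, log_normal=False)

# Round trip for the log-normal case: build a target from a known deviation.
known_dev, base, scale = 0.25, 0.1, 0.4
target = math.exp(base + scale * known_dev)
log_normal_dev = recover_deviation(target, base=base, scale=scale, log_normal=True)
```

The round trip is the useful property: a deviation pushed through the forward transform and then recovered should match the original up to floating-point error.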

What breaks on a Meridian upgrade

Change type	Impact	Detection
Public API signature change	`runner.py`, `exports.py` break	Default test suite
Semi-public attribute rename	`log_likelihood.py`, `exports.py` break	Default test suite
Private method removal/rename	Model selection disabled	Live smoke test or `model_selection_status.json`
Constant rename	Import-time failure	Default test suite
New posterior parameter	Log-likelihood may be incorrect	Manual review + live smoke test
Changed likelihood formula	Log-likelihood may be incorrect	Live smoke test

Recommended upgrade procedure

Pin the new Meridian version in a branch.
Run the canonical verification gate: python scripts/verify_release.py.
Run the live Meridian verification command: MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v.
If model selection breaks, check model_selection_status.json for the reason code.
If private methods changed, update log_likelihood.py to match the new Meridian internals or accept graceful degradation.
Update docs/project/release-baseline.md with the new verified state.

Project

Contributor-facing project documentation, release baselines, and changelog material.

Pages

Contributing — This guide covers the development setup, conventions, and workflow for contributing to meridian-tools.
Acceptance checklist — Use this page as the canonical local acceptance checklist for the current repository state. The acceptance gate is local and command-driven, and it uses the same script as CI.
Release baseline — This page records the 0.4.0 release-candidate baseline for the repository. Treat it as a validated project state, not as an automated release system. The baseline uses the canonical constrained verification script and records the observed warning profile, the direct runtime dependency bounds, and the accepted trade-offs that still shape the package.
Changelog — All notable changes to meridian-tools are documented in this file.
Meridian Compatibility Inventory — This project currently targets google-meridian[schema]==1.5.3. The local reference checkout used for compatibility review is:

Contributing

This guide covers the development setup, conventions, and workflow for contributing to meridian-tools.

Development setup

Clone and install

git clone <repo-url> meridian-tools
cd meridian-tools
python -m pip install -U pip
python -m pip install -c constraints/dev.txt -e ".[dev]"

The constraints file pins the supported development environment, including the Meridian-compatible ArviZ line.

Verify the install

python scripts/verify_release.py

Acceptance gate

Before submitting any change, run the full acceptance sequence from the repository root:

python -m pip install -c constraints/dev.txt -e ".[dev]"
python scripts/verify_release.py

See acceptance.md for the expected results and how to interpret failures.

Code style

Formatting and linting

The project uses Ruff for both linting and formatting:

# Check
ruff check src tests
ruff format --check src tests

# Auto-fix
ruff check --fix src tests
ruff format src tests

Configuration is in pyproject.toml:

[tool.ruff]
line-length = 120
target-version = "py311"

[tool.ruff.lint]
select = ["E", "F", "I", "UP", "B", "C90", "SIM", "RUF"]

Type annotations

All public functions and classes use type annotations. The codebase uses from __future__ import annotations for forward-reference support.

Import conventions

Standard library imports first, then third-party, then local.
In the config, CLI, and validation layers, heavy dependencies (Meridian, TensorFlow, ArviZ) are imported lazily inside functions rather than at module level.
Ruff rule I enforces import sorting.
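
The lazy-import convention above can be sketched as follows. This is a minimal illustration, not code from the repository: load_summary is a hypothetical helper, and the standard-library statistics module stands in for a heavy dependency such as Meridian, TensorFlow, or ArviZ.

```python
from __future__ import annotations


def load_summary(values: list[float]) -> float:
    """Hypothetical helper that needs a heavy dependency only when called."""
    # Deferred import: the cost is paid on the first call, not when this
    # module is imported. `statistics` is a stand-in for a heavy package.
    import statistics

    return statistics.mean(values)


# Importing this module stays cheap; the dependency loads on first use.
print(load_summary([1.0, 2.0, 3.0]))  # 2.0
```

Python caches imported modules, so repeated calls do not pay the import cost again.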

Configuration models

All Pydantic models use ConfigDict(extra="forbid"). New config fields must be added with appropriate types, defaults, and validators.
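
A minimal sketch of what that policy looks like for a new config section, assuming Pydantic v2. ExampleCurveConfig and its fields are hypothetical and are not models from meridian_tools.config; the point is only to show a typed field, a default, a validator, and how extra="forbid" reacts to a misspelt key.

```python
from pydantic import BaseModel, ConfigDict, ValidationError, field_validator


class ExampleCurveConfig(BaseModel):
    """Hypothetical config section; not a real meridian-tools model."""

    model_config = ConfigDict(extra="forbid")

    enabled: bool = True
    confidence_level: float = 0.9

    @field_validator("confidence_level")
    @classmethod
    def _check_confidence_level(cls, value: float) -> float:
        # Reject values outside the open interval (0, 1).
        if not 0.0 < value < 1.0:
            raise ValueError("confidence_level must be strictly between 0 and 1")
        return value


try:
    # The key is deliberately misspelt; extra="forbid" rejects it
    # instead of silently falling back to the default.
    ExampleCurveConfig(enabled=True, confidense_level=0.8)
except ValidationError as exc:
    print(exc.errors()[0]["type"])  # extra_forbidden
```

Without extra="forbid", the misspelt key would be ignored and the run would quietly use the default value.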

Testing

Running tests

# Full verification gate (includes the full test suite)
python scripts/verify_release.py

# Specific file
pytest tests/test_runner.py -v

# Specific test
pytest tests/test_runner.py::test_run_pipeline_writes_manifest -v

Test conventions

Tests use pytest with tmp_path for temporary directories.
monkeypatch is used extensively to mock Meridian internals and isolate unit tests from real MCMC sampling.
Module-scoped fixtures (scope="module") are used for expensive model construction in test_log_likelihood.py and test_model_selection.py.
Shared test infrastructure is defined inline in individual test modules. There is no top-level conftest.py.

Live Meridian verification

One opt-in command exercises the bounded real Meridian seam:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v

This is not part of the default suite. It proves one reduced real pipeline run over bundled demo data, one stored-run refresh after the original YAML is removed, and the lower-level live log-likelihood seam. Run it after Meridian version upgrades and before release-candidate handoff when you want extra confidence beyond the fast suite.
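
One way such an opt-in gate can be expressed is sketched below. The helper name real_fit_enabled is hypothetical and the repository's actual gating code may differ; only the environment variable name and the real_fit marker come from the command above.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def real_fit_enabled(environ: Mapping[str, str] | None = None) -> bool:
    """Return True only when the opt-in variable is set to exactly "1"."""
    env = os.environ if environ is None else environ
    return env.get("MERIDIAN_TOOLS_ENABLE_REAL_FIT") == "1"


# In a pytest module, the helper could guard a slow test, for example:
#
#   @pytest.mark.real_fit
#   @pytest.mark.skipif(not real_fit_enabled(), reason="real fit not enabled")
#   def test_real_pipeline_smoke(): ...
```

Requiring both the marker selection and the environment variable keeps an accidental pytest -m real_fit from starting real sampling.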

Writing new tests

Place tests in the appropriate tests/test_<module>.py file.
Use monkeypatch to avoid real MCMC sampling in unit tests.
Test both success paths and error conditions.
Verify artefact file contents, not just their existence.
Use tmp_path for all filesystem operations.
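
The conventions above can be sketched in one small test module. Everything named here (engine, write_fit_summary, fit_summary.json) is hypothetical and stands in for real package code. The error-path test uses try/except only to keep the sketch free of third-party imports; pytest.raises is the idiomatic choice in the real suite.

```python
import json
import types
from pathlib import Path

# Hypothetical stand-in for a module whose `fit` would start real MCMC sampling.
engine = types.SimpleNamespace(fit=None)


def write_fit_summary(out_dir: Path, n_draws: int) -> Path:
    """Hypothetical function under test: fit, then write a JSON artefact."""
    result = engine.fit(n_draws)
    path = out_dir / "fit_summary.json"
    path.write_text(json.dumps(result), encoding="utf-8")
    return path


def test_write_fit_summary_records_result(tmp_path, monkeypatch):
    # Replace the expensive call so the unit test never samples.
    monkeypatch.setattr(engine, "fit", lambda n_draws: {"n_draws": n_draws, "status": "ok"})

    path = write_fit_summary(tmp_path, n_draws=10)

    # Check the artefact contents, not just that the file exists.
    assert json.loads(path.read_text(encoding="utf-8")) == {"n_draws": 10, "status": "ok"}


def test_write_fit_summary_propagates_fit_errors(tmp_path, monkeypatch):
    def failing_fit(n_draws):
        raise RuntimeError("sampler failed")

    monkeypatch.setattr(engine, "fit", failing_fit)

    try:
        write_fit_summary(tmp_path, n_draws=10)
    except RuntimeError:
        # The error path must not leave a partial artefact behind.
        assert not (tmp_path / "fit_summary.json").exists()
    else:
        raise AssertionError("expected RuntimeError")
```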

Project structure

meridian-tools/
├── src/meridian_tools/       # Package source
│   ├── __init__.py           # Lazy-loading exports
│   ├── artifacts.py          # Manifest helpers
│   ├── cli.py                # CLI entry point
│   ├── config.py             # Pydantic models
│   ├── cv.py                 # Validation splits
│   ├── demo.py               # Demo discovery
│   ├── diagnostics.py        # Diagnostics export
│   ├── exports.py            # Meridian export wrappers
│   ├── launcher.py           # Run execution wrapper
│   ├── lifecycle.py          # Post-run management
│   ├── log_likelihood.py     # Log-likelihood adapter
│   ├── model_selection.py    # LOO/WAIC wrappers
│   ├── terminal.py           # CLI presentation
│   └── version.py            # Static version
├── tests/                    # Test suite
│   ├── _demo_data/           # Bundled demo data (packaged)
├── docs/                     # Documentation
├── runme.py                  # Source-tree demo launcher
└── pyproject.toml            # Build and dependency config

Versioning

The version is defined in src/meridian_tools/version.py:

__version__ = "0.4.0"

Version bumps are manual edits. Update this file when preparing a release.

Documentation

Documentation lives in docs/. When adding new features:

Update relevant guide or reference pages.
Add API documentation for new public functions or classes.
Update the YAML schema reference if config fields changed.
Update the output schema if new artefacts are produced.

Common pitfalls

Do not import Meridian at module level in config, CLI, or validation modules. Module-level imports of heavy dependencies slow CLI start-up.
Do not add extra="allow" to Pydantic models. The extra="forbid" policy prevents silent misconfiguration.
Do not modify source run directories in lifecycle operations. Always create new sibling directories.
Do not weaken or delete existing tests without explicit direction.

Acceptance checklist

Use this page as the canonical local acceptance checklist for the current repository state. The acceptance gate is local and command-driven, and it uses the same script as CI.

Acceptance gate

Run the following commands from the repository root:

python -m pip install -c constraints/dev.txt -e ".[dev]"
python scripts/verify_release.py

The canonical acceptance-gate result for the full test step in the current constrained environment is:

485 passed, 2 skipped

That result is the pass or fail line for the default local acceptance gate. The recorded warning profile belongs to the release baseline, not to the acceptance-gate definition itself.

What each command proves

python -m pip check proves that the active constrained environment has no broken package requirements.

python -m compileall src tests scripts runme.py proves that the checked-in Python files compile cleanly. compileall only byte-compiles the files and does not execute them, so a failure here means a syntax error. Fix it before running anything else.

python -m meridian_tools.cli --help and python runme.py --help prove that the installed CLI and source-tree launcher surfaces still import cleanly.

ruff check src tests scripts runme.py proves that the repository still satisfies the pinned lint rules. If this step fails, fix the reported lint violations before moving on.

ruff format --check src tests scripts runme.py proves that the checked-in files still match the agreed formatting contract. If this step fails, run the formatter and then rerun the verification sequence.

mypy src proves that the configured static typing baseline still runs cleanly. If this step fails, either fix the reported type issue or update the documented ratchet intentionally.

python docs-site/build_content.py, git diff --exit-code -- docs-site/content, and hugo --source docs-site prove that canonical Markdown, generated Hugo content, and the static site build stay aligned.

pytest --cov=src/meridian_tools --cov-report=term-missing proves the behavioural contract of the repository. This is the broadest local validation step. If it fails, use the failing test names to identify which package contract regressed.
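
The command sequence described above can be pictured as a stop-at-first-failure loop. The sketch below is an illustration only and is not the contents of scripts/verify_release.py: the step names, the run_steps helper, and the ordering (taken from this page) are assumptions.

```python
from __future__ import annotations

import subprocess
import sys
from collections.abc import Sequence

PY = sys.executable
PATHS = ["src", "tests", "scripts", "runme.py"]

# Commands as listed on this page; the real script may invoke them differently.
GATE_STEPS: list[tuple[str, list[str]]] = [
    ("pip check", [PY, "-m", "pip", "check"]),
    ("compile", [PY, "-m", "compileall", *PATHS]),
    ("cli help", [PY, "-m", "meridian_tools.cli", "--help"]),
    ("launcher help", [PY, "runme.py", "--help"]),
    ("ruff check", ["ruff", "check", *PATHS]),
    ("ruff format", ["ruff", "format", "--check", *PATHS]),
    ("mypy", ["mypy", "src"]),
    ("docs content", [PY, "docs-site/build_content.py"]),
    ("docs drift", ["git", "diff", "--exit-code", "--", "docs-site/content"]),
    ("hugo build", ["hugo", "--source", "docs-site"]),
    ("tests", ["pytest", "--cov=src/meridian_tools", "--cov-report=term-missing"]),
]


def run_steps(steps: Sequence[tuple[str, Sequence[str]]]) -> str | None:
    """Run each step in order; return the first failing step name, or None."""
    for name, command in steps:
        try:
            completed = subprocess.run(command, check=False)
        except OSError:
            # The executable is missing or cannot be started.
            return name
        if completed.returncode != 0:
            return name
    return None


# From the repository root: failed = run_steps(GATE_STEPS)
```

Stopping at the first failure matches the guidance on this page that later steps give no useful signal until earlier ones pass.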

How to interpret failure

If the compile step fails, fix syntax or parse problems first. The later steps will not give you useful signal until that is resolved.

If lint, format, or type checks fail, treat that as a source-tree quality issue, not as an optional clean-up item. Bring the tree back to the pinned Ruff and mypy state before trusting the rest of the loop.

If CLI help fails, assume the published command surface is broken even if the Python modules still import manually.

If the docs drift check fails, update the canonical files under docs/ first, then regenerate docs-site/content.

If the pytest coverage step fails, the acceptance gate is not met. A partial pass is not enough. Fix the failing behavioural contract and rerun the full command sequence.

Optional extra confidence

The repository also carries one opt-in live Meridian verification command for extra technical confidence:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v

This command is not part of the default blocking acceptance gate. It exists to provide one bounded live Meridian route that proves:

real pipeline execution over bundled demo data
manifest-backed stored-run refresh after the original YAML is removed
the lower-level live log-likelihood reconstruction seam

On the reference development environment, the recorded run finished in 135.42 seconds (0:02:15), with a command elapsed time of 2:24.03. Budget up to roughly six minutes for this extra-confidence command.

Release baseline

This page records the 0.4.0 release-candidate baseline for the repository. Treat it as a validated project state, not as an automated release system. The baseline uses the canonical constrained verification script and records the observed warning profile, the direct runtime dependency bounds, and the accepted trade-offs that still shape the package.

Release-ready definition in this repository

The repository is release-ready only when all of the following hold:

The documented local acceptance command set passes.
The coverage test step returns the recorded pass/skip count below.
The same validated run is recorded with the observed warning profile.
The warning categories match the accepted ones below.
The accepted trade-offs remain explicit rather than hidden.

Validated baseline record

The current verified local baseline is:

python -m pip install -c constraints/dev.txt -e ".[dev]"
python scripts/verify_release.py
-> 485 passed, 2 skipped, 83 warnings, 91% coverage
-> pytest runtime 65.27s (0:01:05); command elapsed 1:12.27

The optional extra-confidence live path remains separate:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v
-> 2 passed, 161 warnings in 135.42s (0:02:15); command elapsed 2:24.03

That command remains opt-in local confidence, not the default developer loop or silent CI policy. The timing above was recorded on the reference development environment; budget up to roughly six minutes for ordinary local execution.

Certified environment

Python 3.12.11
meridian-tools==0.4.0
google-meridian==1.5.3
arviz==0.19.0
matplotlib==3.9.4
protobuf==6.33.6
ruff==0.14.0
mypy==1.18.2
pytest==8.4.2
Hugo 0.164.0

Runtime dependency boundary

The current runtime boundary recorded from pyproject.toml is:

requires-python >=3.11
google-meridian==1.5.3
arviz>=0.18.0,<0.20.0 (the constrained release gate uses arviz==0.19.0 and matplotlib==3.9.4)
protobuf>=5.28.0,<7 (the constrained release gate uses protobuf==6.33.6)
pandas>=2.2.0,<3
pydantic>=2.8.0,<3
PyYAML>=6.0.0,<7

These are the direct runtime dependency bounds for the milestone baseline. This page does not imply broader environment reproducibility than the constrained gate currently implements.
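
As a quick sanity check, the pinned gate versions can be compared against the declared bounds. The sketch below is a simplified illustration with hypothetical helper names. It handles plain dotted releases only; for real tooling, packaging.specifiers.SpecifierSet is the more robust option.

```python
from __future__ import annotations


def parse_release(version: str, width: int = 4) -> tuple[int, ...]:
    """Parse a plain dotted release such as "6.33.6", padded with zeros.

    Pre-release or local version tags are not handled in this sketch.
    """
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def within_bounds(version: str, lower: str, upper: str) -> bool:
    """Return True when lower <= version < upper."""
    return parse_release(lower) <= parse_release(version) < parse_release(upper)


# Bounds and pinned versions copied from this page.
print(within_bounds("0.19.0", "0.18.0", "0.20.0"))  # True
print(within_bounds("6.33.6", "5.28.0", "7"))  # True
print(within_bounds("7.0.0", "5.28.0", "7"))  # False
```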

Accepted warning profile

The recorded warning profile is accepted in the current milestone baseline. The warnings fall into these pinned categories:

Meridian model / prior warnings
ArviZ model-selection warnings
Matplotlib/Pyparsing deprecation warnings from the constrained ArviZ compatibility line
Meridian schema/protobuf deprecation warnings caused by Meridian 1.5.3 serialising boolean TensorFlow Probability parameters through an int proto path; protobuf 7 rejects that path, so this release candidate constrains protobuf below 7
Hugo Relearn theme deprecation warnings during static-site build

This baseline does not pretend the repository is warning-free. It records the current observed warning profile honestly and treats those warning categories as accepted for the present milestone.

Accepted trade-offs

The current release baseline also depends on several explicit trade-offs.

The package takes a no-fork Meridian approach. We keep Meridian as the modelling engine and add workflow and compatibility tooling around it rather than modifying Meridian source.

Bayesian model selection remains intentionally limited to fitted Meridian models where holdout_id is None. Validation-fit and authored-holdout runs are not treated as compatible LOO or WAIC candidates.

Lifecycle tooling remains Python-first. The repository does not currently ship a broader lifecycle CLI.

Version bumping remains a manual edit rather than a fully automated release pipeline.

Boundary of this record

This page records one validated milestone state. It does not introduce CI as the source of truth. It does not define publish automation. It does not promise zero warnings. It does not claim a broader release process than the repository actually supports today.

Changelog

All notable changes to meridian-tools are documented in this file.

The format is based on Keep a Changelog.

[Unreleased]

No unreleased changes.

[0.4.0] — 2026-07-29

Added

YAML media priors — YAML-driven media prior configuration for roi_m, mroi_m, and alpha_m, including scalar priors, per-channel overrides, and support for Normal, LogNormal, TruncatedNormal, and Beta distributions.
Prior distribution export — prior_distributions.json is written in the model-fit stage so authored and resolved Meridian prior distributions are captured with run artefacts.
Config templates — Commented starter configs under templates/ cover minimal, standard, media-prior, and blocked-tail validation workflows.
Geo panel demo artefacts — Stored geo_panel demo run outputs now live under runs/demos/, so reference artefacts cover both bundled pipeline shapes.
Meridian compatibility inventory — Project documentation now records the Meridian 1.5.3 files and private sampler seams that must be reviewed before changing the dependency pin.
Validation-spec v2 bindings — Stored validation specs now bind coordinate and data-shape metadata so refresh and rolling-origin execution can fail closed when input data no longer matches the stored contract.
Ephemeral input provenance — Manifest v4 runs record provenance for the exact temporary snapshot loaded by Meridian.
Canonical verification gate — scripts/verify_release.py now runs the constrained release checks for dependency consistency, compile/import health, CLI help, Ruff, mypy, Hugo docs, and the coverage suite.

Changed

Strict run names — Pipeline run names must now match ^[A-Za-z0-9][A-Za-z0-9._-]{0,127}$; path separators, whitespace, Unicode, and relative path segments are rejected before run directories are created.
CSV loader YAML validation — data.coord_to_columns and channel mapping fields now fail earlier with wrapper-owned errors when shapes, keys, or channel ordering are inconsistent with Meridian loader contracts.
ModelSpec array validation — Known boolean array kwargs forwarded through model_spec.kwargs now reject scalars, ragged lists, strings, and numeric stand-ins before Meridian model construction; their ranks and loaded-data shapes are also checked before model construction.
Model-selection resilience — Optional LOO/WAIC exports now capture ArviZ warnings and degrade to explicit status artefacts for unexpected ArviZ or artefact-write failures instead of failing the whole run.
Bounded rolling-origin execution — Each split fits only through its test boundary and records the exact validation window used for execution.
Manifest v4 completion integrity — Completed runs validate required manifest artefacts as existing, contained regular files before the final manifest is written.
Release documentation — Architecture, schema, release-baseline, and generated Hugo documentation now describe manifest v4, validation-spec v2, bounded reproducibility, and the canonical constrained gate.
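
For reference, the strict run-name pattern quoted above can be exercised directly. The helper name is_valid_run_name is hypothetical; only the pattern comes from the entry above.

```python
import re

# Pattern quoted from the "Strict run names" entry.
RUN_NAME_PATTERN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]{0,127}$")


def is_valid_run_name(name: str) -> bool:
    # fullmatch avoids the case where `$` alone would accept a trailing newline.
    return RUN_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_run_name("baseline_2026.v1"))  # True
print(is_valid_run_name("../escape"))  # False: path separator and leading dot
print(is_valid_run_name("my run"))  # False: whitespace
print(is_valid_run_name("a" * 129))  # False: longer than 128 characters
```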

Fixed

Launcher empty-result handling — The source-checkout launcher now raises a clear runtime error if the pipeline unexpectedly returns no run result.
Run directory collisions — Same-second runs with the same logical name now use deterministic suffixes such as _001 instead of failing on an existing directory.
Private destination JSON writes — Newly created destination JSON files now use private file modes.
Temporary snapshot cleanup — Temporary Meridian input snapshots are removed after handoff while hashed provenance for the loaded bytes is kept.
Finite numeric configuration — Non-finite numeric, analysis, and nested Meridian configuration values are rejected before they reach Meridian.
Repo-owned Hugo configuration — The repository-owned Hugo language setting no longer emits a deprecation warning on the supported Hugo line.
Protobuf compatibility bound — The supported dependency boundary now constrains protobuf<7 because Meridian 1.5.3 serialises boolean TensorFlow Probability parameters through an int proto path that protobuf 7 rejects.

[0.3.0] — 2026-04-24

Changed

CLI single source of truth — runme.py now delegates directly to meridian_tools.cli, removing duplicate root-level argument parsing.
Typed runner state — Pipeline orchestration now uses PipelineContext for shared stage state.
Shared posterior sampling — Runner posterior sampling keyword mapping is centralized in one helper.
Lifecycle comparison schema — Run comparison rows are generated from declarative comparison field descriptors.
Meridian compatibility pin — The package pins google-meridian[schema]==1.5.3, and log-likelihood reconstruction refuses unvalidated Meridian versions.
Static analysis tooling — Development extras now include mypy, and Ruff enables additional complexity, simplification, and Ruff-specific rule families.

Fixed

Optimized Python safety — Validation helpers now raise explicit exceptions instead of using assert for runtime invariants, so the checks still run when Python strips assert statements under -O.
Shared confidence validation — Response curve and optimisation configs share one confidence_level validator.
Export coercion documentation — NetCDF attribute coercion now documents its input-to-output type mapping.

[0.2.0] — 2026-04-07

Added

Docs site build — Hugo-based website documentation under docs-site/, generated from the repository Markdown set by docs-site/build_content.py.
Manifest v3 provenance — Explicit input_data_provenance capture for stored runs and lifecycle refresh or compare workflows.
Typed failure boundaries — ConfigPreflightError, ValidationExecutionContractError, and PipelineRunFailure distinguish wrapper-owned preflight, validation contract misuse, and post-directory runtime failures.
Bounded live verification — An opt-in Meridian real-fit smoke route gated behind MERIDIAN_TOOLS_ENABLE_REAL_FIT=1.
Module-path CLI contract — Explicit support and regression coverage for python -m meridian_tools.cli ....

Changed

Shared launch flow — meridian-tools and the repo-root runme.py launcher now share one launch flow for config loading, preflight checks, progress reporting, and terminal success or failure output.
Packaged demo assets — Bundled demo configs and datasets are resolved from packaged _demo_data, so demo runs work from installed wheels as well as source checkouts.
Default demo fit mode — Bundled demos now default to full-sample fits (validation.strategy: none), so loo_summary.json and waic_summary.json are generated by default and 10_validation is recorded as skipped.
Refresh contract — Stored-run refresh now reloads from the saved resolved config while preserving the original source config copy in run metadata.
Lifecycle compare semantics — Compare now distinguishes legacy runs without dataset provenance from real dataset changes.
Documentation layout — Public documentation is reorganised under docs/ into getting-started, guides, reference, concepts, and project sections.

Fixed

Structured public entrypoint failures — Missing or invalid config paths in public entrypoints now produce structured failure output instead of raw Python tracebacks unless --traceback is used.
Relative-path refresh — Refreshing a stored run with relative data.path input no longer depends on the original source config location remaining present on disk.
Partial-run failure reporting — Failed runs that already created an output directory now report the concrete run directory, manifest path, and failing stage through the CLI and runme.py.
Docs-site theme resolution — Hugo builds resolve the Relearn theme through a pinned module dependency instead of requiring a local theme checkout.

[0.1.0] — 2026-04-02

Added

Typed YAML configuration — Pydantic-validated config with extra="forbid" strictness for all sections: project, data, model_spec, fit, validation, exports, response_curves, optimisation.
Staged pipeline runner — Sequential execution through 00_run_metadata, 10_validation, 20_model_fit, 30_model_assessment, 40_decomposition, 60_response_curves, 70_optimisation with manifest persistence after each stage.
Validation orchestration — blocked_tail and rolling_origin time-series validation strategies with auto-generated holdout masks. Authored holdout passthrough through model_spec.kwargs.holdout_id.
Diagnostics bundling — diagnostics_bundle.json manifest with optional predictive_accuracy.csv and review_summary.json exports.
Bayesian model selection — Compatibility-aware LOO and WAIC computation through ArviZ, with automatic log-likelihood reconstruction for fitted Meridian models. Graceful degradation for incompatible runs through structured ModelSelectionError with reason codes.
Response curves export — Configurable spend multiplier grid with NetCDF and CSV outputs.
Optimisation export — Fixed-budget and relative-budget optimisation with full artefact set including allocation charts.
Plot exports — PNG plot artefacts through Altair/vl-convert for model fit, diagnostics, decomposition, response curves, and optimisation stages.
Lifecycle management — load_run_record, list_run_records, build_refresh_run_config, compare_run_records for post-run analysis and reproducible refresh workflows.
CLI — meridian-tools run and meridian-tools demo subcommands with lightweight imports for fast startup.
Bundled demos — timeseries and geo_panel reference workflows with packaged data and configs.
Manifest versioning — Support for manifest versions 0, 1, and 2 with backward-compatible deserialisation.
Comprehensive test suite — 218 tests across 15 test files covering configuration, validation, pipeline execution, exports, diagnostics, model selection, lifecycle, and demos.

Meridian Compatibility Inventory

This project currently targets google-meridian[schema]==1.5.3. The local reference checkout used for compatibility review is:

/home/user/Documents/GITHUB/tandpds/meridian

The 0.4.0 release-candidate environment also constrains protobuf>=5.28.0,<7. Meridian 1.5.3 serialises boolean TensorFlow Probability distribution parameters through an int proto path. Protobuf 7 rejects that path, while the constrained protobuf==6.33.6 line preserves the validated Meridian schema save/load smoke route with deprecation warnings.

Reviewed Reference Files

meridian/version.py
meridian/data/load.py
meridian/model/spec.py
meridian/model/context.py
meridian/model/model.py
meridian/model/posterior_sampler.py

Required Symbols And Contracts

meridian.version.__version__ is 1.5.3.
protobuf remains below 7 for Meridian schema serialisation.
meridian.data.load.CoordToColumns defines the CSV coordinate mapping surface.
meridian.data.load.CsvDataLoader is the CSV input loader used by meridian-tools.
meridian.model.spec.ModelSpec accepts wrapper-authored kwargs, including holdout_id and calibration/scaling arrays.
ModelContext.holdout_id validates national holdout masks as (n_times,) and geo holdout masks as (n_geos, n_times).
Meridian.posterior_sampler_callable returns a posterior sampler with the private reconstruction seams required by the log-likelihood adapter.
PosteriorMCMCSampler._get_joint_dist_unpinned exists.
PosteriorMCMCSampler._prepare_latents_for_reconstruction exists.
PosteriorMCMCSampler._reconstruct_posteriors exists.
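
To make the two holdout mask shapes above concrete, here is a small sketch using plain lists. The helper names are hypothetical, True marks a held-out observation, and repeating the same tail mask for every geo is an illustrative assumption rather than a requirement. Real code would normally build boolean arrays instead.

```python
from __future__ import annotations


def national_tail_mask(n_times: int, holdout_size: int) -> list[bool]:
    """Shape (n_times,): True marks held-out time points at the end."""
    if not 0 < holdout_size < n_times:
        raise ValueError("holdout_size must be positive and smaller than the time axis")
    return [False] * (n_times - holdout_size) + [True] * holdout_size


def geo_tail_mask(n_geos: int, n_times: int, holdout_size: int) -> list[list[bool]]:
    """Shape (n_geos, n_times): the same tail mask repeated for each geo."""
    return [national_tail_mask(n_times, holdout_size) for _ in range(n_geos)]


national = national_tail_mask(n_times=6, holdout_size=2)
geo = geo_tail_mask(n_geos=3, n_times=6, holdout_size=2)
print(len(national))  # 6
print(len(geo), len(geo[0]))  # 3 6
```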

Upgrade Checklist

Before changing the pinned Meridian dependency:

Compare the reviewed files above against the new Meridian version.
Confirm the private posterior sampler seam methods still exist and preserve compatible behaviour.
Run python scripts/verify_release.py.
Run MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 python -m pytest -q -m real_fit.
Recheck the protobuf compatibility bound if Meridian schema serialisation changes upstream.
Update this inventory and release notes with any compatibility changes.
