Meridian Tools documentation

Companion tooling for Google Meridian MMM workflows. This documentation covers installation, configuration, validation strategies, model selection, lifecycle management, and the full API reference.

Getting started

Guides

Task-oriented how-to guides for common workflows.

Reference

Lookup-oriented documentation for precise details.

Python API

Concepts

Background explanations for architecture and design choices.

Project

Contributor and governance documentation.

Subsections of Meridian Tools documentation

Getting started

Install meridian-tools, run a demo, and get to the first staged output quickly.

Pages

  • Installation — meridian-tools requires Python 3.11 or later and a working installation of Google Meridian with schema support.
  • Quickstart — This guide takes you from a fresh install to your first completed run in under five minutes using the bundled demo data.

Subsections of Getting started

Installation

Prerequisites

meridian-tools requires Python 3.11 or later and a working installation of Google Meridian with schema support.

Install Meridian first

Meridian is the upstream modelling engine. Install it before meridian-tools:

pip install "google-meridian[schema]==1.5.3"

If you are working from a local Meridian checkout:

pip install -e "/path/to/meridian[schema]"

Verify the install:

from meridian import version
print(version.__version__)

Install meridian-tools

cd /path/to/meridian-tools
pip install -e ".[dev]"

The [dev] extra installs pytest, ruff, and mypy for running the test suite and linter.

Editable install without dev extras

pip install -e .

Verify the install

meridian-tools --help

You should see the CLI help output listing the run and demo subcommands. This command is deliberately lightweight — it does not import TensorFlow, NumPy, or Meridian.

You can also verify in Python:

import meridian_tools
print(meridian_tools.__version__)

Runtime dependencies

The following are declared in pyproject.toml and installed automatically:

Package                   Version bound
google-meridian[schema]   ==1.5.3
arviz                     >=0.18.0, <0.20.0
pandas                    >=2.2.0, <3
pydantic                  >=2.8.0, <3
PyYAML                    >=6.0.0, <7
vl-convert-python         >=1.7.0, <2

TensorFlow is not a direct dependency of meridian-tools. It comes transitively through google-meridian.

Development extras

pip install -e ".[dev]"

This adds:

Package   Purpose
pytest    Test runner
ruff      Linter and formatter

Troubleshooting

If meridian-tools --help fails with an import error, check that:

  1. You are in the correct virtual environment.
  2. Meridian is installed with the [schema] extra.
  3. Python version is 3.11 or later: python --version.

If pip install -e . fails, ensure setuptools>=68.0.0 is available:

pip install --upgrade setuptools

See the troubleshooting guide for more common issues.
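
The checks above can also be scripted. The following is a minimal sketch using only the standard library. The helper names are illustrative and are not part of meridian-tools; the module names meridian and meridian_tools are taken from the import statements earlier on this page.

```python
from __future__ import annotations

import importlib.util
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in the prerequisites


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the interpreter meets the documented minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be located (nothing is imported)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Missing modules:", missing_modules(["meridian", "meridian_tools"]))
```

An empty "Missing modules" list only shows that the packages are discoverable in the active environment. It does not confirm that Meridian was installed with the [schema] extra.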

Quickstart

This guide takes you from a fresh install to your first completed run in under five minutes using the bundled demo data.

1. Run a bundled demo

List the available demos:

meridian-tools demo --list

Output:

timeseries
geo_panel

Run the timeseries demo:

meridian-tools demo timeseries

When run from the source checkout, this creates a dated run directory under runs/demos/. When run from an installed package, the default output root is ./runs/demos/ relative to your current working directory. Each demo produces a full staged output layout.

2. Inspect the run directory

After the demo completes, find the created run directory:

ls runs/demos/

You will see a directory like demo-timeseries_20260402_073500/. The name comes from the demo’s project.name (demo-timeseries) plus a timestamp. The bundled demos now default to full-sample fits, so LOO and WAIC outputs are available in the assessment stage by default. Inside:

demo-timeseries_20260402_073500/
  run_manifest.json
  00_run_metadata/
    config.source.yaml
    config.resolved.yaml
  20_model_fit/
    meridian_model.binpb
    fit_metadata.json
  30_model_assessment/
    diagnostics_bundle.json
    model_results_summary.html
    loo_summary.json
    waic_summary.json
  40_decomposition/
    summary_metrics.csv
    summary_metrics.nc
  60_response_curves/
    response_curves.csv
    response_curves.nc
  70_optimisation/
    optimisation_summary.html
    optimised_data.csv
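
If you prefer to locate the run directory from Python rather than with ls, the sketch below may help. It assumes only the naming pattern shown above (a name followed by a YYYYMMDD_HHMMSS timestamp) and the two-digit stage prefixes. The function names are hypothetical and not part of meridian-tools.

```python
from __future__ import annotations

import re
from pathlib import Path

STAGE_PATTERN = re.compile(r"^\d{2}_")  # e.g. 00_run_metadata, 20_model_fit


def latest_run_dir(root: Path, prefix: str) -> Path | None:
    """Return the newest directory under root whose name starts with prefix.

    The YYYYMMDD_HHMMSS suffix sorts lexicographically, so the last name is the newest.
    """
    candidates = sorted(p for p in root.glob(f"{prefix}_*") if p.is_dir())
    return candidates[-1] if candidates else None


def stage_dirs(run_dir: Path) -> list[str]:
    """List the numbered stage directories inside a run directory, in order."""
    return sorted(
        p.name for p in run_dir.iterdir() if p.is_dir() and STAGE_PATTERN.match(p.name)
    )


if __name__ == "__main__":
    run_dir = latest_run_dir(Path("runs/demos"), "demo-timeseries")
    if run_dir is None:
        print("No demo run found under runs/demos/")
    else:
        print(run_dir.name, stage_dirs(run_dir))
```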

3. Read the key outputs

Start with the manifest:

cat runs/demos/demo-timeseries_*/run_manifest.json | python -m json.tool | head -20

Check the diagnostics bundle:

cat runs/demos/demo-timeseries_*/30_model_assessment/diagnostics_bundle.json | python -m json.tool

View the model results summary by opening the HTML file in your browser:

# Linux
xdg-open runs/demos/demo-timeseries_*/30_model_assessment/model_results_summary.html

# macOS
open runs/demos/demo-timeseries_*/30_model_assessment/model_results_summary.html

Inspect the decomposition CSV:

head runs/demos/demo-timeseries_*/40_decomposition/summary_metrics.csv

4. Run your own config

Create a YAML config file (e.g. project.yml):

project:
  name: my-first-run

data:
  path: ./my_data.csv
  kpi_type: revenue
  coord_to_columns:
    time: week
    geo: market
    kpi: revenue
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]

fit:
  n_chains: 4
  n_adapt: 500
  n_burnin: 500
  n_keep: 1000
  seed: 42

validation:
  strategy: blocked_tail
  holdout_size: 8

exports:
  export_predictive_accuracy: true
  export_review_summary: true
  export_model_selection: true

Run it:

meridian-tools run --config project.yml --output-dir runs

5. Next steps

Guides

Task-oriented workflow documentation for configuration, validation, demos, lifecycle, and troubleshooting.

Pages

  • Configuration guide — meridian-tools is driven by one YAML configuration file. This guide explains every section, its purpose, and its constraints. For a field-level schema reference, see yaml-schema.md.
  • Validation guide — This guide explains how to choose and configure validation strategies in meridian-tools. Validation is the process of evaluating a candidate model specification on held-out data before committing to a final production fit.
  • Model selection guide — This guide explains how meridian-tools supports Bayesian model selection using Leave-One-Out (LOO) cross-validation and the Watanabe-Akaike Information Criterion (WAIC). It covers when model selection is available, how to interpret the outputs, and how to compare multiple candidate models.
  • Lifecycle management guide — meridian-tools treats completed runs as immutable artefacts. The lifecycle module provides tools to load, compare, and refresh past runs without mutating them. This guide explains each lifecycle operation and when to use it.
  • Meridian Tools workflow guide — This guide shows the supported end-to-end agency workflow for meridian-tools. It starts with one YAML config, moves through candidate validation, separates the final full-sample fit from the validation runs, and ends with the artefacts you should hand over or inspect later. The examples in this guide stay inside the implemented package surface. They do not assume notebooks, dashboards, or unpublished helper scripts.
  • Meridian Tools demo guide — This is the canonical guide to the bundled meridian-tools demos. Use it when you want one safe, reproducible, end-to-end example without client data.
  • Troubleshooting — Common issues and solutions when working with meridian-tools.

Subsections of Guides

Configuration guide

meridian-tools is driven by one YAML configuration file. This guide explains every section, its purpose, and its constraints. For a field-level schema reference, see yaml-schema.md.

Configuration philosophy

The YAML file owns the authored project definition: project metadata, data paths, model specification, fit settings, validation strategy, and export switches. Runtime-only values — output_dir, run_name, and concrete validation_spec — belong in PipelineRunConfig or CLI flags, not in the YAML file. This separation ensures that the same YAML file can drive multiple runs with different runtime options while remaining reproducible.

Minimal valid config

project:
  name: my-project

data:
  path: ./data.csv
  coord_to_columns:
    time: week

This is the smallest config that will pass validation. It uses defaults for everything else: no validation, all exports enabled, no response curves, no optimisation.

Section reference

project

Top-level project metadata.

project:
  name: client-mmm        # Default: "meridian-project"

  • name — Human-readable project name. Used as the base for run directory names unless overridden by --run-name at runtime.

data

CSV data loader configuration. Maps directly to Meridian’s CsvDataLoader.

data:
  path: ./client_dataset.csv
  kpi_type: revenue                    # "revenue" (default) or "non-revenue"
  coord_to_columns:
    time: week
    geo: market                        # optional for national models
    kpi: revenue
    population: population
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]
    controls: [promo_flag, price_index]
  media_to_channel: null               # optional channel mapping overrides
  media_spend_to_channel: null
  reach_to_channel: null
  frequency_to_channel: null
  rf_spend_to_channel: null
  organic_reach_to_channel: null
  organic_frequency_to_channel: null

  • path — Path to the CSV data file. Relative paths are resolved against the directory containing the YAML config file, not the current working directory.
  • kpi_type — Either "revenue" or "non-revenue". Controls how Meridian interprets the KPI column.
  • coord_to_columns — Maps Meridian coordinate names to CSV column names. time is required. geo is optional (omit for national models).

model_spec

Raw keyword arguments forwarded to Meridian’s ModelSpec.

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi

  • kwargs — Dictionary passed through to ModelSpec(**kwargs). Supports any argument that Meridian’s ModelSpec accepts.
  • Special handling for holdout_id: if present in kwargs, the run is treated as an “authored holdout” validation run. See the validation guide for details.

fit

Sampling configuration for Meridian posterior fitting.

fit:
  sample_prior_draws: null     # Optional prior-only sampling
  n_chains: 4                  # Number of MCMC chains
  n_adapt: 500                 # Adaptation steps per chain
  n_burnin: 500                # Burn-in steps per chain
  n_keep: 1000                 # Posterior samples to keep per chain
  seed: 20260331               # Reproducibility seed (int, list[int], or null)
  max_tree_depth: 10           # NUTS max tree depth
  max_energy_diff: 500.0       # NUTS max energy difference
  unrolled_leapfrog_steps: 1   # NUTS leapfrog steps
  parallel_iterations: 10      # TF parallel iterations

All fields have sensible defaults. Override only what you need.

  • seed — Accepts a single integer, a list of integers (one per chain), or null for non-deterministic sampling.
  • sample_prior_draws — If set, prior predictive samples are drawn before posterior sampling. This is optional and primarily for model diagnostics.

validation

Validation and holdout orchestration settings. See the validation guide for strategy selection advice.

# Option 1: No validation (default)
validation:
  strategy: none

# Option 2: Blocked tail
validation:
  strategy: blocked_tail
  holdout_size: 8

# Option 3: Rolling origin
validation:
  strategy: rolling_origin
  initial_train_size: 52
  test_size: 4
  step_size: 4          # Must equal test_size
  max_splits: 3         # At least 2

  • strategy — One of "none", "blocked_tail", or "rolling_origin".
  • holdout_size — Required for blocked_tail. Number of time periods to hold out from the end of the series.
  • initial_train_size, test_size — Required for rolling_origin.
  • step_size — Optional for rolling_origin. Must equal test_size if set. Defaults to test_size.
  • max_splits — Optional for rolling_origin. Must be at least 2.

Validation rules:

  • blocked_tail rejects rolling-origin parameters.
  • rolling_origin rejects holdout_size.
  • none rejects all holdout and rolling-origin parameters.
  • Legacy holdout_size without explicit strategy is rejected.
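
The rules above can be expressed as a small standalone check. This is a sketch of the documented constraints only, written against a plain dictionary; it is not the Pydantic model that meridian-tools uses, and the function name and error wording are illustrative.

```python
from __future__ import annotations

ROLLING_KEYS = ("initial_train_size", "test_size", "step_size", "max_splits")


def check_validation_section(section: dict) -> list[str]:
    """Return the rule violations found in an authored validation section."""
    present = {key for key, value in section.items() if value is not None}
    rolling = [key for key in ROLLING_KEYS if key in present]
    strategy = section.get("strategy")

    if strategy is None:
        # Legacy form: holdout_size without an explicit strategy is rejected.
        if "holdout_size" in present:
            return ["holdout_size requires an explicit strategy"]
        strategy = "none"

    errors: list[str] = []
    if strategy == "none":
        if "holdout_size" in present or rolling:
            errors.append("none rejects holdout and rolling-origin parameters")
    elif strategy == "blocked_tail":
        if "holdout_size" not in present:
            errors.append("blocked_tail requires holdout_size")
        if rolling:
            errors.append("blocked_tail rejects rolling-origin parameters")
    elif strategy == "rolling_origin":
        if "holdout_size" in present:
            errors.append("rolling_origin rejects holdout_size")
        for key in ("initial_train_size", "test_size"):
            if key not in present:
                errors.append(f"rolling_origin requires {key}")
        step, test = section.get("step_size"), section.get("test_size")
        if step is not None and test is not None and step != test:
            errors.append("step_size must equal test_size")
        max_splits = section.get("max_splits")
        if max_splits is not None and max_splits < 2:
            errors.append("max_splits must be at least 2")
    else:
        errors.append(f"unknown strategy: {strategy!r}")
    return errors


print(check_validation_section({"strategy": "blocked_tail", "holdout_size": 8}))
# []
```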

exports

Output switches for diagnostics and model-selection artefacts.

exports:
  use_kpi: false                       # Use KPI-based metrics
  batch_size: 1000                     # Batch size for Meridian analysis
  export_predictive_accuracy: true     # Write predictive_accuracy.csv
  export_review_summary: true          # Write review_summary.json
  export_model_selection: true         # Write LOO/WAIC outputs
  export_plots: true                   # Write PNG plot artefacts

All fields have defaults. If the entire exports section is omitted, all exports are enabled with default settings.

response_curves

Optional. If omitted, the response curves stage is skipped.

response_curves:
  spend_multipliers: [0.0, 0.5, 1.0, 1.5, 2.0]
  use_posterior: true
  by_reach: true
  use_optimal_frequency: false
  confidence_level: 0.9

  • spend_multipliers — Required. Non-empty list of non-negative floats.
  • confidence_level — Must be strictly between 0 and 1.

optimisation

Optional. If omitted, the optimisation stage is skipped.

optimisation:
  start_date: "2025-01-01"
  end_date: "2025-12-31"
  budget:
    mode: fixed_total                  # or "relative_reference_window_total"
    value: 1000000.0
  use_posterior: true
  use_optimal_frequency: true
  confidence_level: 0.9

  • start_date, end_date — ISO format YYYY-MM-DD. end_date must be on or after start_date.
  • budget.mode — Either "fixed_total" (absolute budget) or "relative_reference_window_total" (multiplier against the reference window’s total spend).
  • budget.value — Positive float. For fixed_total, this is the absolute budget. For relative_reference_window_total, this is a multiplier (e.g. 1.1 means 110% of the reference window total).
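
To make the two budget modes concrete, here is a minimal sketch of how the effective total could be derived from budget.mode and budget.value. The reference_window_total argument is a hypothetical input standing in for the reference window's total spend; how meridian-tools obtains that figure is not covered here.

```python
from __future__ import annotations


def resolve_total_budget(
    mode: str, value: float, reference_window_total: float | None = None
) -> float:
    """Turn a (mode, value) pair into an absolute budget figure."""
    if value <= 0:
        raise ValueError("budget.value must be positive")
    if mode == "fixed_total":
        # The value is already the absolute budget.
        return float(value)
    if mode == "relative_reference_window_total":
        if reference_window_total is None:
            raise ValueError("the relative mode needs the reference window total")
        # The value is a multiplier, e.g. 1.1 means 110% of the reference total.
        return float(value) * reference_window_total
    raise ValueError(f"unknown budget mode: {mode!r}")
```

For example, a multiplier of 1.1 against a reference window total of 500,000 gives a budget of about 550,000, while fixed_total passes the value through unchanged.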

Validation strictness

All configuration models use Pydantic’s extra="forbid" mode. Any unexpected key in the YAML file will produce a clear validation error. This prevents silent misconfiguration from typos or outdated keys.

$ meridian-tools run --config bad.yml
# pydantic.ValidationError: 1 validation error for MeridianToolsConfig
# exports -> export_pridictive_accuracy
#   Extra inputs are not permitted

Path resolution

Relative paths in data.path are resolved against the directory containing the YAML config file, not the current working directory. This means:

# If config is at /workspace/configs/project.yml
data:
  path: ../inputs/weekly.csv
# Resolves to /workspace/inputs/weekly.csv

The resolved path is written to config.resolved.yaml in the run directory. The original authored path is preserved in config.source.yaml.
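
The resolution rule can be reproduced with the standard library. The sketch below uses POSIX paths and purely lexical normalisation so that the result does not depend on the local filesystem. The function name is illustrative, and the package itself may resolve paths differently (for example by following symlinks).

```python
import posixpath
from pathlib import PurePosixPath


def resolve_data_path(config_path: str, data_path: str) -> PurePosixPath:
    """Resolve data_path against the directory that contains the config file."""
    candidate = PurePosixPath(data_path)
    if candidate.is_absolute():
        return candidate  # absolute paths are used as authored
    joined = PurePosixPath(config_path).parent / candidate
    # normpath collapses ".." lexically; it does not touch the filesystem.
    return PurePosixPath(posixpath.normpath(str(joined)))


print(resolve_data_path("/workspace/configs/project.yml", "../inputs/weekly.csv"))
# /workspace/inputs/weekly.csv
```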

Wrapper-owned preflight

Before meridian-tools creates a dated run directory, it performs one narrow wrapper-owned preflight check on the authored config and the resolved input CSV. Phase 10 keeps this boundary intentionally small so the wrapper does not become a second Meridian schema layer.

The wrapper checks exactly:

  • the resolved data.path exists and is a regular file
  • the CSV header row can be read
  • the parsed header is non-empty
  • no parsed header cell is blank after trimming whitespace
  • every authored scalar entry in data.coord_to_columns exists in the header
  • every authored list member in data.coord_to_columns exists in the header
  • every authored key in data.media_to_channel exists in the header
  • every authored key in data.media_spend_to_channel exists in the header
  • every authored key in data.reach_to_channel exists in the header
  • every authored key in data.frequency_to_channel exists in the header
  • every authored key in data.rf_spend_to_channel exists in the header
  • every authored key in data.organic_reach_to_channel exists in the header
  • every authored key in data.organic_frequency_to_channel exists in the header
  • authored list-valued coord families are non-empty
  • authored mapping fields above are non-empty
  • coord_to_columns.media and media_to_channel must be authored together
  • coord_to_columns.media_spend and media_spend_to_channel must be authored together
  • coord_to_columns.reach, coord_to_columns.frequency, reach_to_channel, and frequency_to_channel must be authored together
  • coord_to_columns.rf_spend and rf_spend_to_channel must be authored together
  • coord_to_columns.organic_reach and organic_reach_to_channel must be authored together
  • coord_to_columns.organic_frequency and organic_frequency_to_channel must be authored together

Matching is exact and case-sensitive. The wrapper does not normalise headers, apply aliases, or use fuzzy matching.
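
As an illustration of the header checks, the following sketch reads a CSV header and reports authored column names that do not appear in it. It covers only the coord_to_columns checks, takes the CSV content as a string for simplicity, and uses hypothetical function names; it is not the wrapper's own implementation.

```python
from __future__ import annotations

import csv
import io


def read_header(csv_text: str) -> list[str]:
    """Return the first CSV row; raise if it is empty or contains a blank cell."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    if not header:
        raise ValueError("CSV header is empty")
    if any(not cell.strip() for cell in header):
        raise ValueError("CSV header contains a blank cell")
    return header


def missing_columns(header: list[str], coord_to_columns: dict) -> list[str]:
    """Return authored column names absent from the header (exact, case-sensitive)."""
    known = set(header)
    missing: list[str] = []
    for value in coord_to_columns.values():
        # Scalar entries (e.g. time) and list entries (e.g. media) are both checked.
        names = value if isinstance(value, list) else [value]
        missing.extend(name for name in names if name not in known)
    return missing


header = read_header("week,market,revenue,spend_tv\n")
print(missing_columns(header, {"time": "week", "media_spend": ["spend_tv", "spend_search"]}))
# ['spend_search']
```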

What remains Meridian-owned:

  • deep ModelSpec semantics
  • fit-dependent tensor or shape constraints
  • statistical validity checks that depend on model construction or sampling

So Phase 10 moves obvious wrapper-detectable mistakes earlier, but it does not promise to catch everything Meridian may reject later.

Full example

project:
  name: client-mmm

data:
  path: ./client_dataset.csv
  kpi_type: revenue
  coord_to_columns:
    time: week
    geo: market
    kpi: revenue
    population: population
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]
    controls: [promo_flag, price_index]

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi

fit:
  n_chains: 4
  n_adapt: 500
  n_burnin: 500
  n_keep: 1000
  seed: 20260331

validation:
  strategy: blocked_tail
  holdout_size: 8

exports:
  export_predictive_accuracy: true
  export_review_summary: true
  export_model_selection: true

response_curves:
  spend_multipliers: [0.0, 0.5, 1.0, 1.5, 2.0]
  use_posterior: true
  by_reach: true
  use_optimal_frequency: false
  confidence_level: 0.9

optimisation:
  start_date: "2025-01-01"
  end_date: "2025-12-31"
  budget:
    mode: fixed_total
    value: 1000000.0
  use_posterior: true
  use_optimal_frequency: true
  confidence_level: 0.9

Validation guide

This guide explains how to choose and configure validation strategies in meridian-tools. Validation is the process of evaluating a candidate model specification on held-out data before committing to a final production fit.

Why validation matters for MMM

Marketing Mix Models are fitted to time series data. Unlike standard supervised learning, the temporal structure of the data means that naive IID cross-validation (random train/test splits) is statistically inappropriate. meridian-tools does not implement random shuffling or naive k-fold splits. Instead, it provides two time-respecting validation strategies and a clear separation between validation runs and the final production fit.

Validation strategies

none — No validation

validation:
  strategy: none

The model is fitted on the full dataset with no holdout. Use this when you do not need candidate evaluation — for example, when rerunning a previously validated specification.

blocked_tail — Single contiguous tail holdout

validation:
  strategy: blocked_tail
  holdout_size: 8

Reserves the last holdout_size time periods as a test block. The model is fitted on all preceding periods. This is the recommended default for short MMM time series where you want one simple candidate evaluation.

When to use: Most standard MMM projects with fewer than 150 weekly observations.

How it works:

Time axis: [t1, t2, t3, t4, t5, t6, t7, t8, t9, t10]
holdout_size: 3

Train: [t1, t2, t3, t4, t5, t6, t7]
Test:  [t8, t9, t10]

The holdout mask is generated automatically and injected into Meridian’s holdout_id parameter. For geo-panel models, the mask is broadcast across all geos.
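
A blocked-tail mask of this kind can be sketched in a few lines. The helper names are illustrative, and the (n_geos, n_times) layout used for the broadcast is an assumption about how the mask is shaped for geo-panel models, not a statement of the package's internals.

```python
from __future__ import annotations


def blocked_tail_mask(n_times: int, holdout_size: int) -> list[bool]:
    """Return a mask that is True for the last holdout_size periods."""
    if not 0 < holdout_size < n_times:
        raise ValueError("holdout_size must be positive and smaller than the series length")
    return [False] * (n_times - holdout_size) + [True] * holdout_size


def broadcast_over_geos(mask: list[bool], n_geos: int) -> list[list[bool]]:
    """Repeat the time mask once per geo (assumed layout: n_geos rows, n_times columns)."""
    return [list(mask) for _ in range(n_geos)]


mask = blocked_tail_mask(n_times=10, holdout_size=3)
print(mask)
# [False, False, False, False, False, False, False, True, True, True]
```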

rolling_origin — Expanding-window validation

validation:
  strategy: rolling_origin
  initial_train_size: 52
  test_size: 4
  step_size: 4
  max_splits: 3

Creates multiple expanding-window splits where each successive split adds more training data. This provides a more robust evaluation signal than a single blocked tail, but requires enough history to support multiple splits.

When to use: Projects with longer time series (typically 100+ weekly observations) where you want multiple evaluation windows.

How it works:

Time axis: [t1, t2, ..., t52, t53, ..., t56, t57, ..., t60]

Split 1: Train [t1..t52], Test [t53..t56]
Split 2: Train [t1..t56], Test [t57..t60]

Constraints:

  • step_size must equal test_size (non-overlapping test windows).
  • max_splits must be at least 2.
  • initial_train_size + test_size must not exceed the number of observations.
  • The plan must yield at least two splits.
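
The expanding-window plan can be sketched as follows. This is an illustrative stand-in for the planning logic, not the code behind build_validation_plan; it uses zero-based indices and fixes step_size to test_size, as the constraints require.

```python
from __future__ import annotations


def rolling_origin_splits(
    n_obs: int, initial_train_size: int, test_size: int, max_splits: int | None = None
) -> list[tuple[range, range]]:
    """Return (train, test) index ranges for expanding-window validation."""
    splits: list[tuple[range, range]] = []
    train_end = initial_train_size
    while train_end + test_size <= n_obs:
        if max_splits is not None and len(splits) >= max_splits:
            break
        splits.append((range(0, train_end), range(train_end, train_end + test_size)))
        train_end += test_size  # step_size equals test_size, so test windows never overlap
    if len(splits) < 2:
        raise ValueError("the plan must yield at least two splits")
    return splits


for train, test in rolling_origin_splits(60, initial_train_size=52, test_size=4, max_splits=3):
    # Printed with one-based labels to match the diagram above.
    print(f"Train [t1..t{train.stop}], Test [t{test.start + 1}..t{test.stop}]")
# Train [t1..t52], Test [t53..t56]
# Train [t1..t56], Test [t57..t60]
```

With 60 observations only two splits fit, even though max_splits is 3: the cap is an upper bound, not a target.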

authored_holdout — User-provided holdout mask

This is not a YAML strategy setting. Instead, you provide holdout_id directly in model_spec.kwargs:

model_spec:
  kwargs:
    holdout_id: [false, false, false, true, true]

When the runner detects an authored holdout_id in the YAML, it treats the run as an authored_holdout validation run. The mask is passed through to Meridian verbatim and recorded in the validation spec artefact.

When to use: When you need a specific holdout pattern that does not follow blocked-tail or rolling-origin conventions.

CLI vs Python API

Blocked tail from the CLI

A blocked_tail validation runs directly from the CLI because it produces a single run:

meridian-tools run --config project.yml --output-dir runs

Rolling origin requires the Python API

rolling_origin is a Python-first planning surface because it produces multiple runs — one per split plus a final fit. The CLI will reject direct rolling_origin execution:

# This will fail:
meridian-tools run --config project.yml  # with strategy: rolling_origin
# ValueError: cannot execute `rolling_origin` directly

Instead, use the Python API:

from pathlib import Path

import pandas as pd

from meridian_tools.config import PipelineRunConfig, load_yaml_config
from meridian_tools.cv import build_validation_plan
from meridian_tools.runner import run_pipeline

config_path = Path("project.yml")
config = load_yaml_config(config_path)

# Read the time index from your data
data_path = config.data.path
if not data_path.is_absolute():
    data_path = (config_path.parent / data_path).resolve()

frame = pd.read_csv(data_path)
time_column = config.data.coord_to_columns["time"]
geo_column = config.data.coord_to_columns.get("geo")

time_index = frame[time_column].drop_duplicates().tolist()
geo_index = None
if geo_column is not None:
    geo_index = frame[geo_column].drop_duplicates().tolist()

# Build the validation plan
validation_plan = build_validation_plan(
    config.validation,
    time_index=time_index,
    geo_index=geo_index,
)

# Execute each validation split
for run_spec in validation_plan.validation_runs:
    run_pipeline(
        PipelineRunConfig(
            config_path=config_path,
            output_dir=Path("runs"),
            validation_spec=run_spec,
        )
    )

Separating validation from the final fit

Validation runs and the final production fit are different jobs. First you evaluate candidate specifications on held-out splits. Then, once you have chosen the specification, you run a separate full-sample fit with no holdout.

Do not reuse a validation fit as the production artefact. The validation fit was trained on a subset of the data and its posterior reflects that subset.

Final fit after blocked tail

For blocked_tail, build_validation_plan provides a final_fit_run spec:

validation_plan = build_validation_plan(config.validation, time_index=time_index, geo_index=geo_index)

# Run the final fit on all data
final_result = run_pipeline(
    PipelineRunConfig(
        config_path=config_path,
        output_dir=Path("runs"),
        validation_spec=validation_plan.final_fit_run,
    )
)

Final fit after rolling origin

The same pattern works for rolling origin:

# After running all validation splits...
final_result = run_pipeline(
    PipelineRunConfig(
        config_path=config_path,
        output_dir=Path("runs"),
        validation_spec=validation_plan.final_fit_run,
    )
)

The final_fit_run spec has mode="final_fit", strategy="none", and holdout_id=None. It trains on the full time axis with no holdout.

Run directory naming

The runner automatically appends a validation-aware suffix to the run name:

Scenario                 Run name pattern
No validation            <project_name>_<timestamp>
Blocked tail             <project_name>_blocked_tail_<timestamp>
Rolling origin split 1   <project_name>_split_01_<timestamp>
Final fit                <project_name>_final_fit_<timestamp>
Authored holdout         <project_name>_authored_holdout_<timestamp>

Override the name with --run-name or PipelineRunConfig(run_name=...).
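
The naming pattern can be mimicked with a short helper, which is useful when you need to predict or parse run directory names in your own scripts. The timestamp format is inferred from example names such as demo-timeseries_20260402_073500, and the function is hypothetical; the runner's own logic, including which clock it uses, may differ.

```python
from __future__ import annotations

from datetime import datetime


def build_run_name(
    project_name: str, suffix: str | None = None, now: datetime | None = None
) -> str:
    """Join project name, optional validation suffix, and a YYYYMMDD_HHMMSS timestamp."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    parts = [project_name] + ([suffix] if suffix else []) + [stamp]
    return "_".join(parts)


print(build_run_name("my-project", "blocked_tail", datetime(2026, 4, 2, 7, 35, 0)))
# my-project_blocked_tail_20260402_073500
```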

Validation spec artefact

Every validation-aware run writes a validation_spec.json artefact in the 10_validation/ stage directory. This JSON records:

  • mode — "validation" or "final_fit"
  • strategy — the validation strategy used
  • split_label — human-readable split identifier
  • holdout_source — "generated_validation", "authored_model_spec", or "none"
  • generated_holdout — whether the holdout mask was auto-generated
  • holdout_shape — shape of the holdout mask (without the actual data)
  • train_indices / test_indices — integer indices into the time axis
  • train_dates / test_dates — corresponding date values

The actual holdout mask is not stored in the JSON artefact (it can be large). It is injected into the model at runtime.
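
Because the artefact is plain JSON, it can be inspected without importing meridian-tools. The sketch below summarises a spec using only the field names listed above. The example values are invented for illustration, and the exact value types in a real artefact may differ.

```python
import json


def summarise_validation_spec(spec: dict) -> str:
    """Build a one-line summary from the documented validation_spec.json fields."""
    n_train = len(spec.get("train_indices") or [])
    n_test = len(spec.get("test_indices") or [])
    return (
        f"{spec.get('mode')} / {spec.get('strategy')}: "
        f"{n_train} train, {n_test} test periods "
        f"(holdout source: {spec.get('holdout_source')})"
    )


# Hypothetical artefact content; in practice, read 10_validation/validation_spec.json.
example = json.loads(
    '{"mode": "validation", "strategy": "blocked_tail", '
    '"holdout_source": "generated_validation", '
    '"train_indices": [0, 1, 2, 3, 4, 5, 6], "test_indices": [7, 8, 9]}'
)
print(summarise_validation_spec(example))
# validation / blocked_tail: 7 train, 3 test periods (holdout source: generated_validation)
```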

Interaction with model selection

Bayesian model selection (LOO/WAIC) is only available for runs where holdout_id is None — meaning full-sample fitted models and final-fit runs. Validation fits and authored-holdout runs write a model_selection_status.json artefact instead of LOO/WAIC outputs. See the model selection guide for details.

Model selection guide

This guide explains how meridian-tools supports Bayesian model selection using Leave-One-Out (LOO) cross-validation and the Watanabe-Akaike Information Criterion (WAIC). It covers when model selection is available, how to interpret the outputs, and how to compare multiple candidate models.

What model selection provides

Bayesian model selection uses information criteria computed from pointwise log-likelihood values to compare model specifications. Unlike predictive accuracy on a held-out set, LOO and WAIC evaluate the model’s expected predictive performance using the full posterior without requiring a separate validation split.

meridian-tools wraps ArviZ’s az.loo and az.waic with:

  • Automatic log-likelihood reconstruction for fitted Meridian models
  • Structured error handling when model selection is not possible
  • A compare_models surface for ranking multiple candidates
  • Artefact-level compatibility status in every run directory

Compatibility boundary

Model selection is only available for models where holdout_id is None. This means:

Run type                                    Model selection available
Full-sample fit (no validation)             Yes
Final-fit run (mode: final_fit)             Yes
Blocked-tail validation run                 No
Rolling-origin validation split             No
Authored-holdout run                        No
Bare InferenceData without log_likelihood   No

This restriction exists because LOO and WAIC require the full observed likelihood surface. A holdout fit has a modified likelihood that does not represent the full data-generating process, so comparing a holdout fit’s ELPD against a full fit’s ELPD would be statistically meaningless.

How it works in the pipeline

When exports.export_model_selection: true in the YAML config, the runner’s 30_model_assessment stage attempts model selection after writing diagnostics.

Compatible runs

For compatible models, the stage writes:

  • loo_summary.json — LOO summary statistics (ELPD, p_loo, SE, etc.)
  • waic_summary.json — WAIC summary statistics
  • loo_pointwise.csv — Per-observation LOO values and Pareto k diagnostics
  • waic_pointwise.csv — Per-observation WAIC values
  • model_comparison.csv — Ranked comparison table (single-model for individual runs)

Incompatible runs

For incompatible models, the stage writes a single status artefact:

  • model_selection_status.json

{
  "status": "unavailable",
  "reason_code": "holdout_fit_unsupported",
  "reason": "Model selection requires holdout_id is None ..."
}

Known reason codes:

Code                                  Meaning
holdout_fit_unsupported               The model was fitted with a holdout mask
requires_fitted_meridian_model        Missing posterior samples or ArviZ InferenceData
missing_log_likelihood_group          Bare InferenceData without reconstructable likelihood
meridian_internal_seam_incompatible   Meridian version lacks required internal reconstruction methods

Incompatibility is non-fatal. The pipeline completes successfully and records the reason in the artefact.
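
When post-processing many runs, it can help to turn the status artefact into a readable message. The sketch below maps the reason codes listed above to their meanings. The function name is illustrative, and any status value other than "unavailable" is simply echoed back, since no other values are described here.

```python
REASON_MEANINGS = {
    "holdout_fit_unsupported": "the model was fitted with a holdout mask",
    "requires_fitted_meridian_model": "missing posterior samples or ArviZ InferenceData",
    "missing_log_likelihood_group": "bare InferenceData without reconstructable likelihood",
    "meridian_internal_seam_incompatible": (
        "Meridian version lacks required internal reconstruction methods"
    ),
}


def describe_status(status: dict) -> str:
    """Explain a parsed model_selection_status.json payload in one line."""
    if status.get("status") != "unavailable":
        return f"status: {status.get('status')}"
    code = status.get("reason_code", "")
    meaning = REASON_MEANINGS.get(code, "unrecognised reason code")
    return f"model selection unavailable ({code}): {meaning}"


print(describe_status({"status": "unavailable", "reason_code": "holdout_fit_unsupported"}))
# model selection unavailable (holdout_fit_unsupported): the model was fitted with a holdout mask
```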

Using the Python API directly

Compute LOO for a single model

from meridian_tools.model_selection import compute_loo

result = compute_loo(fitted_model, pointwise=True)

print(result.kind)          # "loo"
print(result.summary)       # {"kind": "loo", "elpd_loo": -123.4, ...}
print(result.pointwise)     # DataFrame with loo_i, pareto_k per observation

Compute WAIC for a single model

from meridian_tools.model_selection import compute_waic

result = compute_waic(fitted_model, pointwise=True)

print(result.kind)          # "waic"
print(result.summary)       # {"kind": "waic", "elpd_waic": -125.1, ...}

Compare multiple models

from meridian_tools.model_selection import compare_models

comparison = compare_models(
    {
        "model_a": fitted_model_a,
        "model_b": fitted_model_b,
    },
    ic="loo",   # or "waic"
)

print(comparison)
# DataFrame with columns: model, rank, elpd_loo, p_loo, elpd_diff, weight, se, dse, warning, scale

The comparison table is ranked by ELPD. The best model has rank 0 and elpd_diff == 0. The weight column gives stacking weights.

Check log-likelihood availability

from meridian_tools.model_selection import has_log_likelihood

if has_log_likelihood(fitted_model):
    result = compute_loo(fitted_model)

Log-likelihood reconstruction

Meridian does not store pointwise log-likelihood in its InferenceData by default. meridian-tools reconstructs it automatically when you pass a fitted Meridian model to compute_loo, compute_waic, or compare_models.

The reconstruction:

  1. Recovers unsaved posterior parameters (e.g. geo deviations, tau_g)
  2. Rebuilds the joint distribution from the posterior samples
  3. Computes observation-level log-likelihood
  4. Returns a new InferenceData with the log_likelihood group attached

The original model is never mutated. The reconstruction produces a temporary copy used only for the ArviZ computation.

You can also control this explicitly:

from meridian_tools.log_likelihood import attach_log_likelihood

# Returns new InferenceData with log_likelihood group (original unchanged)
idata_with_ll = attach_log_likelihood(fitted_model, in_place=False)

# Mutates the model's inference_data in place
attach_log_likelihood(fitted_model, in_place=True)

Interpreting the outputs

LOO summary

Field      Meaning
elpd_loo   Expected log pointwise predictive density (higher is better)
p_loo      Effective number of parameters
se         Standard error of elpd_loo
warning    Whether Pareto k diagnostics indicate unreliable estimates

WAIC summary

Field       Meaning
elpd_waic   Expected log pointwise predictive density (WAIC estimate)
p_waic      Effective number of parameters (WAIC estimate)
se          Standard error of elpd_waic
warning     Whether posterior variance diagnostics indicate unreliable estimates

Pareto k diagnostics

The pointwise LOO output includes a pareto_k column. Values above 0.7 indicate that the LOO approximation is unreliable for those observations. ArviZ will emit a warning if any Pareto k values exceed the threshold.
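
To find the affected observations in a pointwise export, you can filter on the pareto_k column. The sketch below works on CSV text and returns zero-based row positions. The sample rows are invented, and any columns beyond loo_i and pareto_k in the real loo_pointwise.csv are not assumed.

```python
from __future__ import annotations

import csv
import io


def high_pareto_k_rows(csv_text: str, threshold: float = 0.7) -> list[int]:
    """Return zero-based row positions whose pareto_k exceeds the threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [i for i, row in enumerate(reader) if float(row["pareto_k"]) > threshold]


# Hypothetical pointwise rows for illustration only.
sample = "loo_i,pareto_k\n-1.2,0.10\n-3.4,0.85\n-0.9,0.70\n"
print(high_pareto_k_rows(sample))
# [1]
```

The comparison is strict, so a value of exactly 0.7 is not flagged.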

Model comparison

When comparing two or more models:

  • elpd_diff — Difference in ELPD from the best model (0 for the best)
  • dse — Standard error of the ELPD difference
  • weight — Stacking weight (how much to trust each model)
  • Models are ranked by ELPD (rank 0 is best)

A single-model comparison returns a one-row table with rank=0, elpd_diff=0, and weight=1.0.
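
The ranking and elpd_diff columns follow directly from the ELPD values. The sketch below reproduces just those two columns for illustration, taking elpd_diff as the best model's ELPD minus each candidate's, which is an assumption about the sign convention. It does not compute stacking weights, se, or dse, which come from ArviZ.

```python
from __future__ import annotations


def rank_by_elpd(elpd: dict[str, float]) -> list[dict]:
    """Rank models by ELPD (higher is better) and compute the gap to the best model."""
    best = max(elpd.values())
    ordered = sorted(elpd.items(), key=lambda item: item[1], reverse=True)
    return [
        {"model": name, "rank": rank, "elpd": value, "elpd_diff": best - value}
        for rank, (name, value) in enumerate(ordered)
    ]


# Illustrative ELPD values, not results from a real fit.
for row in rank_by_elpd({"model_b": -125.0, "model_a": -123.5}):
    print(row)
# {'model': 'model_a', 'rank': 0, 'elpd': -123.5, 'elpd_diff': 0.0}
# {'model': 'model_b', 'rank': 1, 'elpd': -125.0, 'elpd_diff': 1.5}
```

A gap in elpd_diff is only meaningful relative to its standard error (dse), so treat small differences with caution.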

Error handling

All model-selection errors are raised as ModelSelectionError with a structured reason_code:

from meridian_tools.model_selection import ModelSelectionError, compute_loo

try:
    result = compute_loo(candidate)
except ModelSelectionError as exc:
    print(exc.reason_code)  # e.g. "holdout_fit_unsupported"
    print(str(exc))         # Human-readable explanation

In the pipeline, these errors are caught and written to model_selection_status.json rather than failing the run.

Lifecycle management guide

meridian-tools treats completed runs as immutable artefacts. The lifecycle module provides tools to load, compare, and refresh past runs without mutating them. This guide explains each lifecycle operation and when to use it.

Core concepts

Run records

A RunRecord encapsulates a run’s metadata and artefact paths. It is loaded from a run directory by reading run_manifest.json and resolving all artefact paths against the directory.

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

print(record.run_dir)                    # Path to the run directory
print(record.manifest)                   # RunManifest with stages, timestamps, versions
print(record.config_source_path)         # Path to config.source.yaml
print(record.config_resolved_path)       # Path to config.resolved.yaml
print(record.input_data_provenance_path) # Path to input_data_provenance.json (or None for older runs)
print(record.diagnostics_bundle_path)    # Path to diagnostics_bundle.json (or None)
print(record.validation_spec_path)       # Path to validation_spec.json (or None)
print(record.model_selection_status_path)  # Path to model_selection_status.json (or None)

All paths in the record are absolute. Required artefacts (config_source, config_resolved) are validated at load time and always present. input_data_provenance is also required for manifest version 3 runs. Optional artefacts (diagnostics_bundle, validation_spec, model_selection_status) are None if not present in the manifest.

Immutability

Lifecycle operations never modify a source run directory. When you refresh a run, the output goes to a new sibling directory. When you compare runs, both source directories remain untouched.

All lifecycle functions raise LifecycleError (a RuntimeError subclass) when they encounter invalid state.

Loading a run record

From a run directory

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

From a manifest path

record = load_run_record("runs/my-project_blocked_tail_20260402_073500/run_manifest.json")

Both forms are accepted. The function detects whether the argument is a directory or a manifest file.

Validation at load time

load_run_record validates:

  • The manifest JSON is well-formed and has a supported version (0, 1, 2, or 3).
  • Required config artefact entries (config_source, config_resolved) exist in the manifest.
  • Manifest version 3 runs also include input_data_provenance.
  • Required artefact files actually exist on disk.
  • No artefact path escapes the run directory (path traversal protection).
  • Claimed optional artefacts exist on disk (a manifest that references a missing file is rejected).

If any check fails, a LifecycleError is raised with a descriptive message.
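The path traversal check in the list above can be sketched with pathlib. This is an illustration of the idea only, not the package's code: the function name is hypothetical, and it raises ValueError where meridian-tools raises LifecycleError.

```python
from pathlib import Path


def resolve_artefact(run_dir: Path, relative_path: str) -> Path:
    """Resolve a manifest-relative path and reject anything outside run_dir."""
    root = run_dir.resolve()
    candidate = (root / relative_path).resolve()
    # After resolution, ".." segments and absolute paths can no longer hide.
    if not candidate.is_relative_to(root):
        raise ValueError(f"artefact path escapes the run directory: {relative_path}")
    return candidate


run_dir = Path("runs/example-run")  # hypothetical run directory
inside = resolve_artefact(run_dir, "00_run_metadata/config.source.yaml")

try:
    resolve_artefact(run_dir, "../other-run/config.source.yaml")
    escaped = False
except ValueError:
    escaped = True

print(inside.name, escaped)
```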

Listing run records

from meridian_tools.lifecycle import list_run_records

records = list_run_records("runs/")
for record in records:
    print(record.manifest.started_at, record.run_dir.name)

list_run_records discovers all direct child directories that contain a run_manifest.json and returns them sorted by started_at timestamp (most recent first), with run directory name as a secondary sort key.

The function requires a directory path (not a file). It will raise an error if any discovered run directory contains an invalid manifest — it does not silently skip broken runs.
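The discovery and ordering rules described above can be sketched with the standard library. This is not the package's implementation. It assumes that run_manifest.json stores started_at as an ISO 8601 string under a key of that name, and that the secondary key sorts directory names in ascending order.

```python
import json
import tempfile
from pathlib import Path


def discover_runs(root: Path):
    """Return (started_at, run_dir) pairs for direct children holding a manifest."""
    found = []
    for child in root.iterdir():
        manifest_path = child / "run_manifest.json"
        if child.is_dir() and manifest_path.is_file():
            # json.loads raises on a malformed manifest instead of skipping it.
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
            found.append((manifest["started_at"], child))
    found.sort(key=lambda pair: pair[1].name)           # secondary key: directory name
    found.sort(key=lambda pair: pair[0], reverse=True)  # primary key: newest first
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name, started in [("run_a", "2026-04-02T07:35:00"), ("run_b", "2026-04-15T09:00:00")]:
        (root / name).mkdir()
        (root / name / "run_manifest.json").write_text(json.dumps({"started_at": started}))
    (root / "not_a_run").mkdir()  # no manifest, so it is ignored
    ordered_names = [run_dir.name for _, run_dir in discover_runs(root)]

print(ordered_names)
```

The two-pass sort relies on Python's sort being stable, so ties on started_at keep the name order from the first pass.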

Refreshing a run

Refreshing re-executes a run using its stored configuration but writes the output to a new directory. The source run is never modified.

When to refresh

  • After a Meridian upgrade — to check whether the new version produces comparable results with the same specification.
  • After a code change — to verify that refactoring did not change model outputs.
  • After extending the dataset — to refit the model with additional observations using the same validated specification.

How to refresh

from meridian_tools.lifecycle import build_refresh_run_config
from meridian_tools.runner import run_pipeline

refresh_config = build_refresh_run_config("runs/my-project_blocked_tail_20260402_073500")
result = run_pipeline(refresh_config)

build_refresh_run_config reconstructs a PipelineRunConfig from the source run’s stored configuration:

  • The execution config path points to the source run’s config.resolved.yaml.
  • The source config path points to the source run’s config.source.yaml, so the refreshed run preserves the original authored YAML in its own metadata.
  • The output directory is set to the source run’s parent directory (creating a sibling).
  • The run name suffix is stripped to produce a clean refresh name.
  • For validation runs, the validation spec is reconstructed from the stored validation_spec.json.

Refresh with overrides

You can override specific settings:

from pathlib import Path

from meridian_tools.lifecycle import build_refresh_run_config

refresh_config = build_refresh_run_config(
    "runs/my-project_blocked_tail_20260402_073500",
    output_dir=Path("runs/refreshed"),
    run_name="my-project-refresh",
)

Validation-aware refresh

If the source run was a validation run (blocked tail or rolling origin), build_refresh_run_config reconstructs the validation spec from the stored artefact, including the holdout mask geometry. For authored-holdout runs, it reuses the YAML-owned holdout from the copied config.

For final-fit runs, the refresh produces another final-fit run with the same full-sample training specification.

Comparing runs

from meridian_tools.lifecycle import compare_run_records

comparison = compare_run_records(
    "runs/my-project_blocked_tail_20260402_073500",
    "runs/my-project_blocked_tail_20260415_090000",
)
print(comparison)

compare_run_records accepts run directory paths (not RunRecord objects) and returns a pandas DataFrame with columns field, left, right, status, and changed. The compared fields include:

  • run_name and status — basic identity.
  • meridian_tools_version and meridian_version — version drift.
  • has_validation_spec and has_diagnostics_bundle — artefact presence.
  • predictive_accuracy_status and review_summary_status — diagnostics.
  • has_model_selection_outputs and model_selection_reason_code — model selection.
  • input_authored_path, input_resolved_path, input_sha256, input_size_bytes, input_mtime_utc, input_row_count, input_column_count, and input_ordered_columns — dataset identity and shape.

This is useful for auditing whether a refresh or a specification change produced materially different results.

If either run predates manifest version 3, provenance rows are reported with status set to "legacy_unknown" and changed set to None. That distinguishes “no stored provenance exists” from “the dataset definitely changed”.
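Because changed can be True, False, or None, a summary needs three buckets rather than two. The sketch below works on plain dictionaries, for example the result of comparison.to_dict("records") on the returned DataFrame. The sample rows are invented, and the "ok" status value is a placeholder; only "legacy_unknown" is taken from this guide.

```python
def summarise_changes(rows):
    """Split comparison rows into changed, unchanged, and unknown field names."""
    summary = {"changed": [], "unchanged": [], "unknown": []}
    for row in rows:
        if row["changed"] is None:
            # No stored provenance: the comparison cannot say either way.
            summary["unknown"].append(row["field"])
        elif row["changed"]:
            summary["changed"].append(row["field"])
        else:
            summary["unchanged"].append(row["field"])
    return summary


# Invented rows using the documented columns: field, left, right, status, changed.
rows = [
    {"field": "meridian_version", "left": "1.5.3", "right": "1.5.3", "status": "ok", "changed": False},
    {"field": "run_name", "left": "run-a", "right": "run-b", "status": "ok", "changed": True},
    {"field": "input_sha256", "left": None, "right": None, "status": "legacy_unknown", "changed": None},
]
summary = summarise_changes(rows)
print(summary)
```

Testing with `is None` first matters: a plain truthiness check would file the unknown rows under "unchanged".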

Lifecycle workflow example

A typical lifecycle workflow for a quarterly model refresh:

from pathlib import Path
from meridian_tools.lifecycle import (
    build_refresh_run_config,
    compare_run_records,
    list_run_records,
    load_run_record,
)
from meridian_tools.runner import run_pipeline

# 1. Find the most recent production run
records = list_run_records("runs/")
production_run = records[0]  # Most recent by started_at

# 2. Refresh with the updated dataset
refresh_config = build_refresh_run_config(
    production_run.run_dir,
    output_dir=Path("runs/quarterly-refresh"),
)
refresh_result = run_pipeline(refresh_config)

# 3. Compare the results
comparison = compare_run_records(production_run.run_dir, refresh_result.run_dir)
print(comparison)

Manifest versioning

The lifecycle layer supports manifest versions 0, 1, 2, and 3. Older manifests are handled gracefully with default values for fields that were added in later versions. The current version is 3.

This means you can load run directories created by earlier versions of meridian-tools without issues. The loaded RunRecord keeps the same shape, but input_data_provenance_path is None for pre-v3 runs because those manifests predate provenance capture.

Meridian Tools workflow guide

This guide shows the supported end-to-end agency workflow for meridian-tools. It starts with one YAML config, moves through candidate validation, separates the final full-sample fit from the validation runs, and ends with the artefacts you should hand over or inspect later. The examples in this guide stay inside the implemented package surface. They do not assume notebooks, dashboards, or unpublished helper scripts.

Before you start

Install Meridian first, then install meridian-tools in the same environment:

pip install -e "/path/to/meridian[schema]"
pip install -e ".[dev]"

Use the CLI for ordinary run execution. Use the Python API when you need rolling-origin planning, an explicit final-fit run, or lifecycle compare and refresh operations. Phase 07 does not provide a lifecycle CLI.

If you want packaged reference examples before authoring your own YAML, use the bundled demo guide in demos.md. The packaged demo launcher is meridian-tools demo .... The repo-root python runme.py ... wrapper remains available when you are working from a source checkout.

Author one YAML config

Keep the authored project definition in YAML. Keep runtime-only choices out of the YAML file. In practice, that means your source file owns the project metadata, data path, model specification, fit settings, validation settings, and export switches. Runtime-only values such as output_dir, run_name, and one concrete validation_spec belong in PipelineRunConfig or the CLI call, not in config.resolved.yaml.

Here is one exact blocked-tail config:

project:
  name: client-mmm

data:
  path: ./client_dataset.csv
  kpi_type: revenue
  coord_to_columns:
    time: week
    geo: market
    kpi: revenue
    population: population
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]
    controls: [promo_flag, price_index]

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi

fit:
  n_chains: 4
  n_adapt: 500
  n_burnin: 500
  n_keep: 1000
  seed: 20260331

validation:
  strategy: blocked_tail
  holdout_size: 8

exports:
  export_predictive_accuracy: true
  export_review_summary: true
  export_model_selection: true

Choose the right validation path

Use blocked_tail when you want one contiguous future block for candidate evaluation. This is often the right default for short MMM time series. Use rolling_origin when you have enough history to evaluate more than one expanding-window split. Do not treat rolling_origin as ordinary k-fold cross-validation. The package does not implement naive IID folds or random shuffling because that is not the right statistical workflow for MMM time series.

Validation runs and the final production fit are different jobs. First, you evaluate candidate specifications on blocked time splits. Then, once you have chosen the specification, you run a separate full-sample fit with no holdout.
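As a rough picture of what blocked_tail means, the sketch below splits an ordered time axis into a training block and one contiguous trailing holdout block. It is an illustration under assumptions, not the package's split logic: the function name and the week labels are hypothetical.

```python
def blocked_tail_split(time_index, holdout_size):
    """Split an ordered time axis into a training block and a trailing holdout block."""
    if not 0 < holdout_size < len(time_index):
        raise ValueError("holdout_size must be positive and smaller than the time axis")
    cut = len(time_index) - holdout_size
    return time_index[:cut], time_index[cut:]


# Twenty hypothetical weekly periods, with the last eight held out.
weeks = [f"2025-W{week:02d}" for week in range(1, 21)]
train, holdout = blocked_tail_split(weeks, holdout_size=8)
print(len(train), holdout[0], holdout[-1])
```

The holdout is always the most recent block, so the model is never evaluated on periods that precede its training data.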

Run one blocked-tail candidate from the CLI

Once the YAML file is authored, you can execute a blocked-tail candidate run directly through the CLI:

meridian-tools run --config project.yml --output-dir runs

The same packaged runner surface is available through the thin repo-root wrapper:

python runme.py run --config project.yml --output-dir runs

This command creates a dated run directory under runs/. If you need to change the output location or the visible run name, pass --output-dir or --run-name at execution time. Those are runtime-only overrides. They affect the run directory and manifest, but they do not become part of the authored YAML contract.
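The example directory names in this documentation, such as my-project_blocked_tail_20260402_073500, suggest a project, strategy, and timestamp pattern. The sketch below reproduces that pattern. It is inferred from the examples only; the function name is hypothetical and the real naming rules may differ.

```python
from datetime import datetime


def dated_run_name(project: str, strategy: str, started_at: datetime) -> str:
    """Build a run directory name of the form <project>_<strategy>_<YYYYMMDD>_<HHMMSS>."""
    return f"{project}_{strategy}_{started_at:%Y%m%d_%H%M%S}"


name = dated_run_name("my-project", "blocked_tail", datetime(2026, 4, 2, 7, 35, 0))
print(name)
```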

Plan and run rolling-origin validation through the Python API

rolling_origin is a Python-first planning surface because you need one concrete split at a time. Start with an explicit YAML definition:

validation:
  strategy: rolling_origin
  initial_train_size: 52
  test_size: 4
  step_size: 4
  max_splits: 3

Then materialise and execute the validation runs:

from pathlib import Path

import pandas as pd

from meridian_tools.config import PipelineRunConfig, load_yaml_config
from meridian_tools.cv import build_validation_plan
from meridian_tools.runner import run_pipeline

config_path = Path("project.yml")
config = load_yaml_config(config_path)

# Resolve a relative data path against the YAML file's directory.
data_path = config.data.path
if not data_path.is_absolute():
    data_path = (config_path.parent / data_path).resolve()

frame = pd.read_csv(data_path)
time_column = config.data.coord_to_columns["time"]
geo_column = config.data.coord_to_columns.get("geo")

# The plan needs the deduplicated time axis (and geo axis, if any),
# not one entry per geo and time combination.
time_index = frame[time_column].drop_duplicates().tolist()
geo_index = None
if geo_column is not None:
    geo_index = frame[geo_column].drop_duplicates().tolist()

validation_plan = build_validation_plan(
    config.validation,
    time_index=time_index,
    geo_index=geo_index,
)

for run_spec in validation_plan.validation_runs:
    run_pipeline(
        PipelineRunConfig(
            config_path=config_path,
            output_dir=Path("runs"),
            validation_spec=run_spec,
        )
    )

build_validation_plan(...) gives you one concrete ValidationRunSpec per split. run_pipeline(...) remains the primitive that executes one actual run.
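To see how the four rolling_origin settings interact, the sketch below enumerates expanding-window splits as index pairs. It is not build_validation_plan: the function name is hypothetical, and keeping the earliest splits when max_splits truncates the list is an assumption about behaviour this guide does not specify.

```python
def rolling_origin_splits(n_periods, initial_train_size, test_size, step_size, max_splits=None):
    """Return (train_end, test_end) index pairs for expanding-window splits.

    Training covers periods [0, train_end); testing covers [train_end, test_end).
    """
    if step_size <= 0 or test_size <= 0:
        raise ValueError("step_size and test_size must be positive")
    splits = []
    train_end = initial_train_size
    while train_end + test_size <= n_periods:
        splits.append((train_end, train_end + test_size))
        if max_splits is not None and len(splits) == max_splits:
            break
        train_end += step_size  # the training window grows; it never slides
    return splits


# 68 weekly periods with the YAML settings shown above.
splits = rolling_origin_splits(68, initial_train_size=52, test_size=4, step_size=4, max_splits=3)
print(splits)
```

With 68 periods, four splits would fit, so max_splits=3 is the binding limit. With fewer than 60 periods, only one split would fit.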

Run the final full-sample fit separately

After you have chosen the winning specification, run the final fit on the full sample. Do not reuse a validation fit as the production artefact.

from pathlib import Path

from meridian_tools.config import PipelineRunConfig
from meridian_tools.runner import run_pipeline

# validation_plan is the object returned by build_validation_plan(...) in the previous section.
final_result = run_pipeline(
    PipelineRunConfig(
        config_path=Path("project.yml"),
        output_dir=Path("runs"),
        validation_spec=validation_plan.final_fit_run,
    )
)

print(final_result.run_dir)
print(final_result.manifest_path)

For rolling_origin and blocked_tail workflows, validation_plan.final_fit_run is the explicit no-holdout runtime spec. It keeps the boundary clear. Candidate validation and final production fitting are separate steps.

Know which artefacts matter for handoff

Each successful run directory is the handoff unit. The important files are:

  • run_manifest.json for stage status, versions, timestamps, and top-level artefact links
  • 00_run_metadata/config.source.yaml for the authored source config
  • 00_run_metadata/config.resolved.yaml for the YAML-owned config after path resolution
  • 00_run_metadata/input_data_provenance.json for the exact dataset identity used by the run
  • 10_validation/validation_spec.json when the run is validation-aware
  • 30_model_assessment/diagnostics_bundle.json for stable diagnostics metadata
  • 30_model_assessment/model_results_summary.html for the wrapped Meridian assessment summary
  • 30_model_assessment/plots/ for assessment PNG plots such as model fit and rhat review
  • 40_decomposition/summary_metrics.csv and summary_metrics.nc for decomposition exports
  • 40_decomposition/plots/ for decomposition PNG plots
  • 60_response_curves/plots/response_curves_plot.png when response-curve export is enabled
  • 70_optimisation/plots/ when optimisation export is enabled
  • 30_model_assessment model-selection outputs when the run is compatible, or 30_model_assessment/model_selection_status.json when it is not

Read those artefacts together. 30_model_assessment/diagnostics_bundle.json tells you whether predictive accuracy and review summary were exported or disabled. The assessment stage either contains the real Bayesian model-selection outputs or one explicit compatibility status artefact.

The supported Bayesian model-selection boundary is narrow and deliberate. The package supports fitted Meridian models where holdout_id is None. That means full-sample fitted models and explicit final-fit runs are compatible. Validation fits and authored holdout fits are not.

Use lifecycle helpers after a run exists

Once you have stored run directories, the lifecycle API lets you reload, compare, and refresh them without going back to notebook state.

from pathlib import Path

from meridian_tools.lifecycle import compare_run_records, load_run_record, refresh_run

validation_run_dir = Path("runs/client-mmm_blocked_tail_20260401_101500")
final_fit_run_dir = Path("runs/client-mmm_final_fit_20260401_114200")

final_fit_record = load_run_record(final_fit_run_dir)
comparison = compare_run_records(validation_run_dir, final_fit_run_dir)
refreshed = refresh_run(final_fit_run_dir, run_name="client-mmm_final_fit_refresh")

print(final_fit_record.manifest.run_name)
print(comparison)
print(refreshed.run_dir)

compare_run_records(...) gives you a metadata-level comparison. It does not attempt a raw-file diff across every output. refresh_run(...) rebuilds a new sibling run from the stored run-local artefacts. It does not overwrite the source run. Phase 07 does not provide lifecycle CLI commands, so use the Python API for these operations.

Know the staged output schema

The current run layout is:

<run_dir>/
  run_manifest.json
  00_run_metadata/
  10_validation/
  20_model_fit/
  30_model_assessment/
    plots/
  40_decomposition/
    plots/
  60_response_curves/
    plots/
  70_optimisation/
    plots/

The runner always writes:

  • 00_run_metadata
  • 20_model_fit
  • 30_model_assessment
  • 40_decomposition

The runner writes these only when applicable:

  • 10_validation
  • 60_response_curves
  • 70_optimisation
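The two lists above can be turned into a quick structural check of a run directory. This is a hypothetical helper, not part of meridian-tools. It only tests that stage directories exist and says nothing about their contents.

```python
import tempfile
from pathlib import Path

# Stage names taken from the two lists above.
ALWAYS_WRITTEN = ("00_run_metadata", "20_model_fit", "30_model_assessment", "40_decomposition")
WHEN_APPLICABLE = ("10_validation", "60_response_curves", "70_optimisation")


def stage_report(run_dir: Path):
    """Report missing always-written stages and the optional stages that are present."""
    missing = [stage for stage in ALWAYS_WRITTEN if not (run_dir / stage).is_dir()]
    optional = [stage for stage in WHEN_APPLICABLE if (run_dir / stage).is_dir()]
    return {"missing_required": missing, "optional_present": optional}


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    # Simulate a full-sample run with response curves but no optimisation stage.
    for stage in ALWAYS_WRITTEN + ("60_response_curves",):
        (run_dir / stage).mkdir()
    report = stage_report(run_dir)

print(report)
```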

For the bundled reference examples and the exact stage-level file set, see demos.md.

A practical analyst sequence

If you want one concrete operating pattern, use this one. Author a YAML file. Run a blocked-tail candidate through the CLI when you need one held-out tail block. Use rolling_origin through build_validation_plan(...) when you need multiple expanding-window validation splits. Choose the modelling specification. Run the final full-sample fit as its own job. Review the run directory artefacts. Then use compare_run_records(...) and refresh_run(...) when you need to inspect or rerun stored work later.

Meridian Tools demo guide

This is the canonical guide to the bundled meridian-tools demos. Use it when you want one safe, reproducible, end-to-end example without client data.

The public story is simple:

  • Meridian is the modelling engine.
  • meridian-tools is the workflow wrapper.
  • The bundled demos are launched through meridian-tools surfaces, not by calling Meridian directly.

What the bundled demos are for

Phase 08 adds two bundled reference workflows:

  • timeseries
    • a national timeseries demo shipped as packaged demo data
  • geo_panel
    • a geo-panel demo shipped as packaged demo data

Both datasets are bundled non-client reference data. They exist so analysts and stakeholders can inspect the workflow, run structure, and review artefacts without using client material.

What the package adds on top of Meridian

Meridian remains responsible for the modelling and analysis primitives. meridian-tools adds the operational surface that agencies usually need around it:

  • typed YAML configuration
  • blocked-tail and rolling-origin validation workflow
  • manifest-backed run directories
  • diagnostics bundling
  • compatibility-aware Bayesian model-selection outputs
  • lifecycle compare and refresh helpers
  • a thin demo launcher for bundled reference workflows

This is why the demos are useful. They show the wrapper workflow directly, rather than asking users to reconstruct it from notebooks or internal scripts.

Demo entrypoints

List the supported demos:

meridian-tools demo --list

Run the bundled timeseries demo:

meridian-tools demo timeseries

Run the bundled geo-panel demo:

meridian-tools demo geo_panel

By default, demo runs are written under runs/demos/. If you want a different root, pass --output-dir. If you want a custom visible run name, pass --run-name.

Example:

meridian-tools demo timeseries --output-dir sandbox/demo-runs --run-name demo-timeseries-review

The repo-root checkout wrapper remains available when you are working from the source tree:

python runme.py demo --list
python runme.py demo timeseries

The same package can also run an explicit authored config:

meridian-tools run --config /path/to/project.yml --output-dir runs

The repo-root wrapper can run an explicit authored config too:

python runme.py run --config /path/to/project.yml --output-dir runs

Bundled YAML surface

The bundled demo YAML files are real meridian-tools configs. They are not legacy Abacus-style placeholders.

The authored sections used in Phase 08 are:

  • project
  • data
  • model_spec
  • fit
  • validation
  • exports
  • response_curves
  • optimisation

The Phase 08 additions are:

  • response_curves
    • required if you want the response-curve export stage to run
  • optimisation
    • required if you want the optimisation export stage to run

The bundled demos include both sections so that the full staged schema is exercised.

The default demo configs use validation.strategy: none. That keeps the reference runs model-selection compatible, so LOO and WAIC outputs are written by default.

Output schema

Each successful demo run writes one manifest-backed staged directory layout:

<run_dir>/
  run_manifest.json
  00_run_metadata/
    config.source.yaml
    config.resolved.yaml
  20_model_fit/
    meridian_model.binpb
    fit_metadata.json
  30_model_assessment/
    diagnostics_bundle.json
    predictive_accuracy.csv
    review_summary.json
    model_results_summary.html
    plots/
      model_fit.png
      rhat_boxplot.png
    loo_summary.json
    waic_summary.json
    loo_pointwise.csv
    waic_pointwise.csv
    model_comparison.csv
    # or model_selection_status.json when unavailable
  40_decomposition/
    summary_metrics.nc
    summary_metrics.csv
    plots/
      channel_contribution_area_chart.png
      contribution_waterfall_chart.png
      spend_vs_contribution_chart.png
      roi_bar_chart.png
  60_response_curves/
    response_curves.nc
    response_curves.csv
    plots/
      response_curves_plot.png
  70_optimisation/
    optimisation_summary.html
    optimised_data.nc
    optimised_data.csv
    nonoptimised_data.nc
    nonoptimised_data.csv
    optimisation_grid.csv
    plots/
      incremental_outcome_delta_plot.png
      budget_allocation_optimised_plot.png
      budget_allocation_nonoptimised_plot.png
      spend_delta_plot.png
      optimisation_response_curves_plot.png

run_manifest.json stays top-level and remains the source of truth for artefact discovery, stage status, version metadata, and relative file paths.

Always exported versus config-gated outputs

For the current Phase 08 contract:

  • always exported for successful runs:
    • 00_run_metadata
    • 20_model_fit
    • 30_model_assessment
    • 40_decomposition
  • exported only when applicable:
    • 10_validation
      • written for validation-aware runs
      • skipped for runs with no validation metadata
    • 60_response_curves
      • requires the authored response_curves section
    • 70_optimisation
      • requires the authored optimisation section

Within 30_model_assessment, model selection remains compatibility-aware:

  • compatible runs write loo, waic, and comparison outputs
  • incompatible runs write model_selection_status.json
  • an incompatible run is not a failure: the status is recorded and the run continues

How to read the important outputs

Start with these artefacts:

  • run_manifest.json
    • run identity, versions, timestamps, stage status, and top-level artefact links
  • 00_run_metadata/config.source.yaml
    • the authored YAML
  • 00_run_metadata/config.resolved.yaml
    • the same YAML after runtime path resolution
  • 10_validation/validation_spec.json
    • validation provenance for validation-aware runs only
    • not present in the default bundled demos because they run as full-sample fits
  • 30_model_assessment/diagnostics_bundle.json
    • the stable machine-readable record of diagnostics export state
  • 30_model_assessment/model_results_summary.html
    • the wrapped Meridian assessment summary
  • 40_decomposition/summary_metrics.csv
    • the easiest tabular decomposition output to inspect first

For model selection, keep the boundary honest:

  • LOO and WAIC are only available for compatible fitted Meridian models
  • validation fits and other incompatible cases will record model_selection_status.json instead
  • the package does not pretend unsupported runs have valid Bayesian comparison outputs
  • the bundled demos are configured as full-sample fits, so they should write loo_summary.json and waic_summary.json by default

For response curves and optimisation:

  • these outputs are useful for scenario and allocation review
  • they are not a substitute for checking diagnostics, validation provenance, or model-selection compatibility first

For visual review, each stage now keeps its PNG exports inside a local plots/ subdirectory rather than mixing image files into the stage root. That keeps the machine-readable exports separate from the human-review plots, with both in predictable locations.

If you are new to the repository, use this order:

  1. run meridian-tools demo --list
  2. run one of the bundled demos
  3. open run_manifest.json
  4. inspect 00_run_metadata/config.source.yaml
  5. inspect 30_model_assessment/diagnostics_bundle.json
  6. inspect 40_decomposition/summary_metrics.csv
  7. inspect 60_response_curves/ and 70_optimisation/ if those stages ran

If you are working from a source checkout, python runme.py demo --list and python runme.py demo ... remain equivalent convenience wrappers.

That sequence shows the wrapper value quickly: one YAML config in, one structured run directory out, with the Meridian and meridian-tools artefacts kept in one predictable place.

Troubleshooting

Common issues and solutions when working with meridian-tools.

Installation issues

meridian-tools --help fails with ImportError

Cause: The package is not installed in the active environment, or Meridian is missing.

Fix:

pip install -e ".[dev]"

If Meridian is not installed:

pip install "google-meridian[schema]==1.5.3"

RuntimeError: Saving meridian_model.binpb requires Meridian schema support

Cause: Meridian was installed without the [schema] extra.

Fix:

pip install "google-meridian[schema]==1.5.3"

RuntimeError: Saving PNG plots requires vl-convert-python

Cause: The vl-convert-python package is not installed or not importable.

Fix:

pip install "vl-convert-python>=1.7.0,<2"

Configuration errors

pydantic.ValidationError: Extra inputs are not permitted

Cause: The YAML file contains a key that is not part of the schema. This is often a typo.

Fix: Check the key name against the YAML schema reference. All config models use extra="forbid", so unexpected keys are always rejected.

Legacy holdout_size shorthand is no longer supported

Cause: The YAML has validation.holdout_size without an explicit validation.strategy.

Fix: Add strategy: blocked_tail:

validation:
  strategy: blocked_tail
  holdout_size: 8

validation.strategy: blocked_tail does not accept rolling-origin parameters

Cause: The YAML mixes blocked_tail strategy with initial_train_size, test_size, or other rolling-origin fields.

Fix: Choose one strategy. Use blocked_tail with holdout_size only, or rolling_origin with its own parameters.

optimisation.end_date must be on or after optimisation.start_date

Cause: The dates in the optimisation section are reversed.

Fix: Ensure start_date is on or before end_date:

optimisation:
  start_date: "2025-01-01"
  end_date: "2025-12-31"

response_curves.spend_multipliers must not be empty

Cause: The spend_multipliers list is empty or missing.

Fix: Provide at least one non-negative value:

response_curves:
  spend_multipliers: [0.0, 0.5, 1.0, 1.5, 2.0]

Pipeline execution errors

Dependency preflight failure

Cause: A required wrapper dependency check failed before config/data preflight or run-directory creation.

Common triggers:

  • google-meridian[schema] support is unavailable
  • exports.export_plots: true is set but vl-convert-python PNG support is unavailable

Fix: Install or repair the missing runtime dependency first, then rerun.

ConfigPreflightError

Cause: meridian-tools found a wrapper-owned config or input-data issue before run-directory creation.

Common triggers:

  • data.path resolves to a missing file or a directory
  • the CSV header row cannot be read
  • the header is empty or contains blank cells
  • an authored column name does not appear in the header exactly
  • a supported media/RF family is only half-authored

Fix: Correct the authored YAML or the input CSV first, then rerun. Header matching is exact and case-sensitive in Phase 10.

ValidationExecutionContractError

Cause: The requested single-run execution path is incompatible with the authored validation setup.

Common triggers:

  • you tried to run a rolling_origin config directly from the CLI or run_pipeline(...)
  • you passed PipelineRunConfig.validation_spec while the YAML already authors model_spec.kwargs.holdout_id

Fix: For rolling_origin, build a validation plan and execute one concrete split at a time through the Python API. For authored holdouts, either keep the YAML-authored holdout_id path or remove it before supplying a runtime validation_spec. See the validation guide for the full workflow.

ModelSelectionError with reason_code: holdout_fit_unsupported

Cause: LOO/WAIC was requested for a model fitted with a holdout mask.

Not a bug. Model selection is only available for full-sample fits. The pipeline records the incompatibility in model_selection_status.json and continues. See the model selection guide.

ModelSelectionError with reason_code: meridian_internal_seam_incompatible

Cause: The installed Meridian version does not expose the internal reconstruction methods needed for log-likelihood computation.

Fix: Check the Meridian version. This package requires google-meridian[schema]==1.5.3. If you recently upgraded Meridian, the private reconstruction seams may have changed. Check the Meridian integration notes.

Run fails mid-pipeline

If a run fails after the dated run directory already exists, meridian-tools raises PipelineRunFailure. The CLI and runme.py print the concrete failed run directory, manifest path, and stage name when available. The original exception is preserved as __cause__, so --traceback still shows the underlying failure.

The manifest is written to disk after each stage. If a run fails, the run_manifest.json is left on disk and marked failed. You can inspect it to determine which stage failed:

python -m json.tool runs/my-project_blocked_tail_20260402_073500/run_manifest.json

Look at the stages array. A failed stage is recorded with status: "failed" and an error message.
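The same inspection can be scripted. The sketch below filters a manifest for failed stages. The stages array and the "failed" status come from this guide; the name and error keys, the "completed" status, and the sample content are assumptions about the manifest layout.

```python
import json


def failed_stages(manifest: dict):
    """Return the stage entries whose status is "failed"."""
    return [stage for stage in manifest.get("stages", []) if stage.get("status") == "failed"]


# Invented manifest fragment; key names other than "stages" and "status" are assumptions.
manifest_text = """
{
  "status": "failed",
  "stages": [
    {"name": "00_run_metadata", "status": "completed"},
    {"name": "20_model_fit", "status": "failed", "error": "example error message"}
  ]
}
"""
failed = failed_stages(json.loads(manifest_text))
for stage in failed:
    print(stage.get("name"), stage.get("error"))
```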

Validation errors

time_index must be strictly increasing with no duplicate values

Cause: The time column in your data contains duplicates or is not sorted.

Fix: Ensure your CSV data has unique, monotonically increasing time values. For geo-panel data, pass the deduplicated time axis: each time period should appear once, not once per geo.
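A quick pre-check can catch this before a run. The sketch below tests for a strictly increasing axis and builds a deduplicated axis from a geo-panel column. Both helpers are hypothetical, and sorting the labels as strings assumes ISO-style dates (YYYY-MM-DD), which sort chronologically.

```python
def is_strictly_increasing(values):
    """True when every value is strictly greater than the one before it."""
    return all(earlier < later for earlier, later in zip(values, values[1:]))


def deduplicated_time_axis(time_column):
    """Collapse a geo x time column into a sorted, unique time axis."""
    return sorted(set(time_column))


# Two geos stacked in one column, so every week appears twice.
raw_column = ["2025-01-06", "2025-01-13", "2025-01-20", "2025-01-06", "2025-01-13", "2025-01-20"]
time_index = deduplicated_time_axis(raw_column)

print(is_strictly_increasing(raw_column), is_strictly_increasing(time_index))
```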

rolling_origin must yield at least two splits

Cause: The combination of initial_train_size, test_size, and data length does not produce enough splits.

Fix: Either reduce initial_train_size, reduce test_size, or use blocked_tail instead for shorter series.

holdout_size must be smaller than the time axis

Cause: The holdout size is greater than or equal to the number of time periods.

Fix: Reduce holdout_size to leave at least one training period.

Lifecycle errors

LifecycleError when loading a run record

Cause: The run manifest is missing required entries, references a file that does not exist, or has a malformed JSON structure.

Fix: Check that the run directory was not manually modified. The required artefacts are config.source.yaml and config.resolved.yaml, plus input_data_provenance.json for manifest version 3 runs. diagnostics_bundle.json is optional when loading a run, but new runs must produce it.

Path traversal rejection

Cause: An artefact path in the manifest resolves outside the run directory.

Fix: This cannot be fixed by editing the manifest. The rejection is a security check, and it usually means the manifest was corrupted or manually edited with an invalid path.
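For readers who want to understand what such a check looks like, here is a minimal sketch of a containment test using pathlib. The function name resolve_inside is hypothetical, and the actual lifecycle code may be structured differently.

```python
from pathlib import Path


def resolve_inside(run_dir, relative_path):
    """Resolve a manifest path and reject it if it escapes the run directory."""
    root = Path(run_dir).resolve()
    candidate = (root / relative_path).resolve()
    # Path.is_relative_to is available from Python 3.9 onwards.
    if not candidate.is_relative_to(root):
        raise ValueError(f"artefact path escapes the run directory: {relative_path}")
    return candidate
```

A path such as ../other_run/config.source.yaml, or an absolute path pointing elsewhere, resolves outside the root and is rejected.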

Performance issues

Pipeline takes very long

MCMC sampling (the 20_model_fit stage) dominates wall-clock time. The meridian-tools orchestration layer adds negligible overhead.

To speed up exploratory runs:

fit:
  n_chains: 2       # Fewer chains (minimum 1)
  n_adapt: 200      # Fewer adaptation steps
  n_burnin: 200     # Fewer burn-in steps
  n_keep: 500       # Fewer kept samples

For production runs, use the defaults or increase these values for better posterior quality.

Out-of-memory during model selection

Log-likelihood reconstruction loads the full posterior into memory and creates a temporary copy of the InferenceData. For large models, this can double memory usage temporarily.

Mitigation: Reduce n_keep or n_chains if memory is constrained.

Warnings

ArviZ Pareto k warnings

Estimated shape parameter of Pareto distribution is greater than 0.7 ...

This means the LOO approximation is unreliable for some observations. Check the pointwise pareto_k values in loo_pointwise.csv. Values above 0.7 indicate influential observations.
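A short script can list the affected observations. This sketch assumes that loo_pointwise.csv has a header row and a column named pareto_k, as the text above suggests; any other column names are unknown here, so the rows are returned as plain dicts. The helper name is hypothetical.

```python
import csv


def high_pareto_k_rows(csv_path, threshold=0.7, column="pareto_k"):
    """Return the CSV rows whose Pareto k value exceeds the threshold."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [
            row
            for row in csv.DictReader(handle)
            if float(row[column]) > threshold
        ]
```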

Meridian national model auto-zeroing warnings

Hierarchical distribution parameters must be deterministically zero for
national models. eta_orf has been automatically set to Deterministic(0).

This is expected for national (non-geo) models. Meridian automatically zeros out geo-level hierarchical parameters. The warning is informational.

TensorFlow deprecation warnings

These come from TensorFlow and Meridian internals. meridian-tools groups and deduplicates them in the terminal output to reduce noise. They do not indicate a problem with your run.

Reference

Lookup documentation for the CLI, YAML schema, manifest schema, output layout, and related contracts.

Pages

  • CLI reference — meridian-tools provides a command-line interface with two subcommands: run and demo.
  • YAML configuration schema reference — This is the complete field-level reference for meridian-tools YAML configuration files. For usage guidance, see the configuration guide.
  • Manifest schema reference — The run_manifest.json file is the source of truth for every meridian-tools run. It lives at the root of the run directory and records identity, timing, versions, overall status, top-level artefact index, and per-stage records.
  • Output schema reference — This page documents the complete run directory layout produced by meridian-tools. Every successful pipeline run creates a timestamped directory containing the artefacts described below.
  • Validation spec schema reference — The validation_spec.json artefact is written to 10_validation/ for every validation-aware pipeline run. It records the concrete validation provenance for that specific run, including the holdout strategy, split geometry, and date windows.

Subsections of Reference

CLI reference

meridian-tools provides a command-line interface with two subcommands: run and demo.

Global usage

meridian-tools <subcommand> [options]

meridian-tools run

Execute a meridian-tools pipeline run from an authored YAML config.

meridian-tools run --config <path> [--output-dir <dir>] [--run-name <name>] [--traceback]

Arguments

Argument      Required  Default                 Description
--config      Yes       -                       Path to the meridian-tools YAML configuration file.
--output-dir  No        runs                    Directory where dated run folders will be created.
--run-name    No        project.name from YAML  Optional run name override.
--traceback   No        false                   Show the full Python traceback on failure.

Examples

# Basic run
meridian-tools run --config project.yml

# Custom output directory
meridian-tools run --config project.yml --output-dir output/model_runs

# Named run with traceback on failure
meridian-tools run --config project.yml --run-name client-q1-review --traceback

Exit codes

Code Meaning
0 Pipeline completed successfully.
1 Pipeline failed. Error details are printed to stderr. Use --traceback for the full stack trace.
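When the CLI is driven from another script, the exit code is the signal to check. The sketch below builds the documented command line and treats exit code 0 as success. The helper names are hypothetical; only the flags listed in the arguments table are used.

```python
import subprocess


def build_run_command(config, output_dir=None, run_name=None, traceback=False):
    """Assemble the argument list for `meridian-tools run`."""
    command = ["meridian-tools", "run", "--config", str(config)]
    if output_dir is not None:
        command += ["--output-dir", str(output_dir)]
    if run_name is not None:
        command += ["--run-name", str(run_name)]
    if traceback:
        command.append("--traceback")
    return command


def run_pipeline(config, **options):
    """Run the CLI and return (succeeded, stderr_text)."""
    completed = subprocess.run(
        build_run_command(config, **options),
        capture_output=True,
        text=True,
    )
    # Exit code 0 means the pipeline completed; 1 means it failed.
    return completed.returncode == 0, completed.stderr
```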

Failure reporting

The CLI distinguishes five broad failure classes:

  • config loading or Pydantic validation failures before wrapper preflight
  • dependency preflight failures before run-directory creation
  • validation-execution contract failures before run-directory creation
  • wrapper-owned ConfigPreflightError failures before run-directory creation
  • PipelineRunFailure after the dated run directory already exists

Dependency preflight covers google-meridian[schema] support and optional plot-export support. Validation-execution contract failures cover incompatible single-run validation requests such as direct rolling_origin execution. Wrapper preflight covers only the closed config/data matrix documented in the configuration guide.

For PipelineRunFailure, the CLI prints the concrete failed run directory, manifest path, and stage name when available so the partial run can be inspected immediately. --traceback still shows the original underlying exception because it is preserved through __cause__.

Validation strategy restrictions

The CLI executes a single pipeline run. Configs with validation.strategy: rolling_origin cannot be run directly from the CLI because they require multiple sequential runs. Use the Python API for rolling-origin workflows.

Configs with strategy: none or strategy: blocked_tail work directly from the CLI.

meridian-tools demo

Run one of the bundled reference demos or list available demos.

meridian-tools demo [<name>] [--list] [--output-dir <dir>] [--run-name <name>] [--traceback]

Arguments

Argument      Required             Default                                                     Description
<name>        Yes (unless --list)  -                                                           Bundled demo name to execute. One of: timeseries, geo_panel.
--list        No                   false                                                       List supported demos and exit. Cannot combine with a demo name.
--output-dir  No                   runs/demos/ (source checkout) or ./runs/demos/ (installed)  Override the output root directory.
--run-name    No                   None (uses project.name from the demo config)               Optional run name override.
--traceback   No                   false                                                       Show the full Python traceback on failure.

Examples

# List available demos
meridian-tools demo --list

# Run the timeseries demo
meridian-tools demo timeseries

# Run with a custom output directory
meridian-tools demo geo_panel --output-dir sandbox/demo-output

# Run with a custom name
meridian-tools demo timeseries --run-name demo-review-q2

Available demos

Name Description
timeseries National timeseries demo using bundled reference data.
geo_panel Geo-panel demo using bundled reference data.

Both demos exercise the full staged pipeline including response curves and optimisation.

Lightweight import

The CLI is designed for fast startup. Running meridian-tools --help or meridian-tools demo --list does not import TensorFlow, NumPy, Meridian, or ArviZ. Heavy imports are deferred until pipeline execution begins.

Entrypoints

The primary CLI entrypoint is the console script registered in pyproject.toml:

[project.scripts]
meridian-tools = "meridian_tools.cli:main"

The supported module-path equivalent is:

python -m meridian_tools.cli run --config project.yml

The package-level form below is not part of the supported contract in this milestone:

python -m meridian_tools run --config project.yml

Source-tree wrapper

When working from the source checkout, runme.py provides equivalent functionality:

python runme.py run --config project.yml --output-dir runs
python runme.py demo timeseries
python runme.py demo --list

See the demo guide for more details on the runme.py wrapper.

YAML configuration schema reference

This is the complete field-level reference for meridian-tools YAML configuration files. For usage guidance, see the configuration guide.

All configuration models use Pydantic extra="forbid" — any key not listed here will produce a validation error.

Top-level structure

project: ProjectConfig         # optional, has defaults
data: CsvDataConfig            # required
model_spec: ModelSpecConfig    # optional, has defaults
fit: FitConfig                 # optional, has defaults
validation: ValidationConfig   # optional, has defaults
exports: ExportsConfig         # optional, has defaults
response_curves: ResponseCurvesConfig | null   # optional
optimisation: OptimisationConfig | null         # optional

project

Field Type Default Description
name str "meridian-project" Human-readable project name. Used as the base for run directory names.

data

Field Type Default Description
path Path required Path to CSV data file. Relative paths resolve against the YAML file’s directory.
kpi_type "revenue" | "non-revenue" "revenue" KPI type for Meridian’s data loader.
coord_to_columns dict[str, Any] required Maps Meridian coordinate names to CSV column names. Must include time.
media_to_channel dict[str, str] | null null Optional media-to-channel mapping override.
media_spend_to_channel dict[str, str] | null null Optional media-spend-to-channel mapping override.
reach_to_channel dict[str, str] | null null Optional reach-to-channel mapping override.
frequency_to_channel dict[str, str] | null null Optional frequency-to-channel mapping override.
rf_spend_to_channel dict[str, str] | null null Optional RF-spend-to-channel mapping override.
organic_reach_to_channel dict[str, str] | null null Optional organic-reach-to-channel mapping override.
organic_frequency_to_channel dict[str, str] | null null Optional organic-frequency-to-channel mapping override.

model_spec

Field Type Default Description
kwargs dict[str, Any] {} Keyword arguments forwarded directly to Meridian ModelSpec(**kwargs).

Supported kwargs keys include any argument accepted by Meridian’s ModelSpec constructor: max_lag, media_prior_type, holdout_id, etc. If holdout_id is present, the run is treated as an authored-holdout validation run.

Array-valued keys (holdout_id, control_population_scaling_id, non_media_population_scaling_id, rf_roi_calibration_period, roi_calibration_period) are converted to NumPy arrays at runtime.


fit

Field                    Type                             Default  Constraint  Description
sample_prior_draws       PositiveInt | null               null     >0 if set   Number of prior predictive draws. null skips prior sampling.
n_chains                 PositiveInt | list[PositiveInt]  4        >0          Number of MCMC chains.
n_adapt                  PositiveInt                      500      >0          Adaptation steps per chain.
n_burnin                 PositiveInt                      500      >0          Burn-in steps per chain.
n_keep                   PositiveInt                      1000     >0          Posterior samples to retain per chain.
seed                     int | list[int] | null           null     -           RNG seed for reproducibility.
max_tree_depth           PositiveInt                      10       >0          NUTS maximum tree depth.
max_energy_diff          float                            500.0    -           NUTS maximum energy difference.
unrolled_leapfrog_steps  PositiveInt                      1        >0          NUTS unrolled leapfrog steps.
parallel_iterations      PositiveInt                      10       >0          TensorFlow parallel iterations.

validation

Field               Type                                        Default  Constraint                   Description
strategy            "none" | "blocked_tail" | "rolling_origin"  "none"   -                            Validation strategy.
holdout_size        PositiveInt | null                          null     Required for blocked_tail    Number of tail time periods to hold out.
initial_train_size  PositiveInt | null                          null     Required for rolling_origin  Initial training window size.
test_size           PositiveInt | null                          null     Required for rolling_origin  Test window size per split.
step_size           PositiveInt | null                          null     Must equal test_size         Step between rolling splits. Defaults to test_size.
max_splits          PositiveInt | null                          null     >=2 if set                   Maximum number of rolling splits.

Cross-field validation rules

  • strategy: none rejects all holdout and rolling-origin parameters.
  • strategy: blocked_tail requires holdout_size, rejects rolling-origin parameters.
  • strategy: rolling_origin requires initial_train_size and test_size, rejects holdout_size.
  • holdout_size without an explicit strategy is rejected (legacy shorthand removed).
  • Rolling-origin parameters without strategy: rolling_origin are rejected.
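The rules above can be expressed as a small plain-Python check. This is an illustrative sketch only: the real validation lives in the package's Pydantic models, and the function name and error wording here are hypothetical. The section argument is the validation mapping as authored in YAML, with absent keys simply missing.

```python
ROLLING_KEYS = ("initial_train_size", "test_size", "step_size", "max_splits")
STRATEGIES = ("none", "blocked_tail", "rolling_origin")


def validation_errors(section):
    """Return a list of cross-field rule violations (empty when valid)."""
    strategy = section.get("strategy")  # None means the key was not authored
    has_holdout = section.get("holdout_size") is not None
    rolling_set = [key for key in ROLLING_KEYS if section.get(key) is not None]
    errors = []

    if strategy is not None and strategy not in STRATEGIES:
        errors.append(f"unknown strategy: {strategy}")
    if has_holdout and strategy is None:
        errors.append("holdout_size requires an explicit strategy")
    elif has_holdout and strategy != "blocked_tail":
        errors.append("holdout_size is only accepted with strategy: blocked_tail")
    if rolling_set and strategy != "rolling_origin":
        errors.append(f"{rolling_set} require strategy: rolling_origin")
    if strategy == "blocked_tail" and not has_holdout:
        errors.append("strategy: blocked_tail requires holdout_size")
    if strategy == "rolling_origin":
        for key in ("initial_train_size", "test_size"):
            if section.get(key) is None:
                errors.append(f"strategy: rolling_origin requires {key}")
    return errors
```

Returning a list rather than raising on the first problem makes it easier to report every violation in a config at once.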

exports

Field Type Default Description
use_kpi bool false Use KPI-based metrics in Meridian analysis surfaces.
batch_size PositiveInt 1000 Batch size for Meridian Analyzer computations.
export_predictive_accuracy bool true Write predictive_accuracy.csv.
export_review_summary bool true Write review_summary.json.
export_model_selection bool true Write LOO/WAIC outputs (when compatible).
export_plots bool true Write PNG plot artefacts in each stage.

response_curves

Optional section. If omitted or null, the response curves stage is skipped.

Field                  Type         Default   Constraint          Description
spend_multipliers      list[float]  required  Non-empty, all >=0  Spend multiplier grid for response curve computation.
use_posterior          bool         true      -                   Use posterior (vs prior) for response curves.
by_reach               bool         true      -                   Compute reach-based response curves.
use_optimal_frequency  bool         false     -                   Use optimal frequency in computation.
confidence_level       float        0.9       0 < x < 1           Confidence level for credible intervals.

optimisation

Optional section. If omitted or null, the optimisation stage is skipped.

Field                  Type                      Default   Constraint                     Description
start_date             str                       required  ISO YYYY-MM-DD                 Start of the optimisation window.
end_date               str                       required  ISO YYYY-MM-DD, >= start_date  End of the optimisation window.
budget                 OptimisationBudgetConfig  required  -                              Budget specification (see below).
use_posterior          bool                      true      -                              Use posterior (vs prior) for optimisation.
use_optimal_frequency  bool                      true      -                              Use optimal frequency in optimisation.
confidence_level       float                     0.9       0 < x < 1                      Confidence level for credible intervals.

optimisation.budget

Field  Type                                               Default   Constraint  Description
mode   "fixed_total" | "relative_reference_window_total"  required  -           Budget mode.
value  PositiveFloat                                      required  >0          Budget value. Absolute for fixed_total, multiplier for relative_reference_window_total.

When mode: relative_reference_window_total, the effective budget is value × total_spend_in_reference_window. The reference window is defined by start_date and end_date.
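As a worked illustration of the two budget modes, the sketch below computes the effective budget from a list of spend values observed in the reference window. The function name and the reference_window_spend argument are hypothetical; how meridian-tools aggregates spend internally is not specified here.

```python
def effective_budget(mode, value, reference_window_spend=None):
    """Return the budget implied by an optimisation.budget section."""
    if value <= 0:
        raise ValueError("value must be positive")
    if mode == "fixed_total":
        # The value is already an absolute budget.
        return float(value)
    if mode == "relative_reference_window_total":
        if reference_window_spend is None:
            raise ValueError("reference window spend is required for relative mode")
        # The value acts as a multiplier on total spend in the window.
        return value * sum(reference_window_spend)
    raise ValueError(f"unknown budget mode: {mode}")
```

For example, with a total reference-window spend of 600 and value: 1.5, the effective budget is 900.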

Manifest schema reference

The run_manifest.json file is the source of truth for every meridian-tools run. It lives at the root of the run directory and records identity, timing, versions, overall status, top-level artefact index, and per-stage records.

Current version

The current manifest version is 3. Versions 0, 1, and 2 are supported for backward compatibility when loading older run directories.

Top-level fields

Field Type Description
manifest_version int Schema version (0, 1, 2, or 3).
run_name str Human-readable run name.
config_path str Path to the source YAML used for this run. For refresh runs this points to the source run’s archived config.source.yaml.
output_dir str Path to the run directory.
started_at str UTC ISO-8601 timestamp when the run began.
status str Overall run status: "running", "completed", or "failed".
finished_at str | null UTC ISO-8601 timestamp when the run finished. null while running.
meridian_tools_version str Version of meridian-tools that produced the run.
meridian_version str | null Version of Google Meridian used. null if not yet recorded.
artifacts dict[str, str] Top-level artefact index. Key artefacts from stages are promoted here for quick lookup.
stages list[StageRecord] Ordered list of pipeline stage records (including skipped and failed stages).

Top-level artifacts index

The runner promotes key artefacts into the top-level artifacts dictionary after each stage completes. Promoted artefact names include:

  • config_source, config_resolved, input_data_provenance (from 00_run_metadata)
  • validation_spec (from 10_validation)
  • meridian_model (from 20_model_fit)
  • diagnostics_bundle, model_results_summary (from 30_model_assessment)
  • summary_metrics_csv, summary_metrics_nc (from 40_decomposition)

This index provides flat access to important artefacts without walking the stages array.
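Not every artefact is promoted, so a lookup helper may want to fall back to the stage records. A minimal sketch over a parsed manifest dict; the helper name find_artifact is hypothetical:

```python
def find_artifact(manifest, name):
    """Return the relative path recorded for an artefact, or None if absent.

    Checks the top-level index first, then walks the stages array.
    """
    top_level = manifest.get("artifacts", {})
    if name in top_level:
        return top_level[name]
    for stage in manifest.get("stages", []):
        path = stage.get("artifacts", {}).get(name)
        if path is not None:
            return path
    return None
```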

StageRecord fields

Each entry in the stages array represents one pipeline stage. Stages can have any of four statuses: "running", "completed", "skipped", or "failed".

Field Type Description
name str Stage identifier (for example, "00_run_metadata", "20_model_fit").
status str Stage status: "running", "completed", "skipped", or "failed".
started_at str | null UTC ISO-8601 timestamp when the stage began.
finished_at str | null UTC ISO-8601 timestamp when the stage finished.
elapsed_seconds float | null Wall-clock seconds for stage execution.
message str | null Human-readable message. Present for skipped stages (reason) and failed stages (error).
artifacts dict[str, str] Map of artefact names to relative file paths. Empty for skipped stages.

Artefact path convention

All artefact paths in the manifest are relative to the run directory. This makes run directories portable across machines and file systems. When you load a run record through load_run_record, the lifecycle layer resolves relative paths to absolute paths against the run directory.

Example stage record:

{
  "name": "30_model_assessment",
  "status": "completed",
  "started_at": "2026-04-02T07:40:30+00:00",
  "finished_at": "2026-04-02T07:41:00+00:00",
  "elapsed_seconds": 30.1,
  "message": null,
  "artifacts": {
    "diagnostics_bundle": "30_model_assessment/diagnostics_bundle.json",
    "review_summary": "30_model_assessment/review_summary.json",
    "model_results_summary": "30_model_assessment/model_results_summary.html"
  }
}

Stage names and ordering

All seven stages are always recorded in execution order. Stages that do not apply to a given run are recorded with status: "skipped".

Stage name Number Skippable Description
00_run_metadata 00 No Config archival and input-data provenance capture.
10_validation 10 Yes Validation spec (skipped when no validation applies).
20_model_fit 20 No Meridian model fitting.
30_model_assessment 30 No Diagnostics, model selection.
40_decomposition 40 No Media decomposition metrics.
60_response_curves 60 Yes Response curves (skipped when the config section is absent).
70_optimisation 70 Yes Budget optimisation (skipped when the config section is absent).

The numbering gap at 50 is intentional, reserving space for future stages.
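The ordering and skippability rules in the table can be checked against a parsed manifest. The sketch below is an assumption-laden illustration for current-version manifests of finished runs; the constant and function names are hypothetical, and older manifest versions may record fewer stages.

```python
EXPECTED_STAGES = (
    "00_run_metadata",
    "10_validation",
    "20_model_fit",
    "30_model_assessment",
    "40_decomposition",
    "60_response_curves",
    "70_optimisation",
)
SKIPPABLE_STAGES = {"10_validation", "60_response_curves", "70_optimisation"}


def stage_record_problems(manifest):
    """Return a list of deviations from the documented stage layout."""
    stages = manifest.get("stages", [])
    problems = []
    names = tuple(stage["name"] for stage in stages)
    if names != EXPECTED_STAGES:
        problems.append(f"unexpected stage sequence: {list(names)}")
    for stage in stages:
        if stage.get("status") == "skipped" and stage["name"] not in SKIPPABLE_STAGES:
            problems.append(f"stage {stage['name']} should not be skipped")
    return problems
```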

Required artefacts

The lifecycle layer requires the following top-level artefacts to be present in the manifest for a run to be loadable:

  • config_source (promoted from 00_run_metadata)
  • config_resolved (promoted from 00_run_metadata)
  • input_data_provenance (promoted from 00_run_metadata) for manifest version 3 runs

These are enforced by _require_manifest_artifact in load_run_record. If a required entry is missing, a LifecycleError is raised.
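The requirement can be pre-checked before calling the loader. This sketch mirrors the rules listed above for manifests that carry a top-level artifacts index (version 2 and later). It is not the package's _require_manifest_artifact, and the helper name is hypothetical.

```python
def missing_required_artifacts(manifest):
    """List the required top-level artefact names absent from a manifest."""
    required = ["config_source", "config_resolved"]
    if manifest.get("manifest_version") == 3:
        # Provenance is only required for manifest version 3 runs.
        required.append("input_data_provenance")
    artifacts = manifest.get("artifacts", {})
    return [name for name in required if name not in artifacts]
```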

The diagnostics_bundle artefact is treated as optional by the lifecycle loader. If it is absent from the manifest, RunRecord.diagnostics_bundle_path is None. However, diagnostics_bundle is listed in REQUIRED_MANIFEST_ARTIFACTS and validated at run completion time — so new runs always produce it, but older or partial runs can still be loaded without it.

Input-data provenance payload

Manifest version 3 introduces 00_run_metadata/input_data_provenance.json. This file records the pinned Phase 09 input-data contract:

  • provenance_version
  • authored_path
  • resolved_path
  • sha256
  • size_bytes
  • mtime_utc
  • row_count
  • column_count
  • ordered_columns

The lifecycle compare surface uses these fields to distinguish real dataset changes from older runs whose manifests predate provenance capture.
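To make the field list concrete, here is a sketch of how such a payload could be computed for a CSV file with the standard library. The exact semantics used by meridian-tools are assumptions here: row_count is taken to exclude the header row, mtime_utc is formatted as ISO-8601, and provenance_version is omitted because its value is not documented in this section. The function name is hypothetical.

```python
import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def describe_input_file(authored_path, base_dir):
    """Collect identity and shape metadata for a CSV input file."""
    resolved = (Path(base_dir) / authored_path).resolve()
    raw = resolved.read_bytes()
    with resolved.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    header = rows[0] if rows else []
    stat = resolved.stat()
    return {
        "authored_path": str(authored_path),
        "resolved_path": str(resolved),
        "sha256": hashlib.sha256(raw).hexdigest(),
        "size_bytes": stat.st_size,
        "mtime_utc": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
        "row_count": max(len(rows) - 1, 0),  # assumed to exclude the header
        "column_count": len(header),
        "ordered_columns": header,
    }
```

Comparing sha256 between two runs is the most direct way to tell whether the underlying dataset changed; the shape fields help explain how it changed.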

Example manifest

{
  "manifest_version": 3,
  "run_name": "my-project_blocked_tail",
  "config_path": "/workspace/configs/project.yml",
  "output_dir": "/workspace/runs/my-project_blocked_tail_20260402_073500",
  "started_at": "2026-04-02T07:35:00+00:00",
  "status": "completed",
  "finished_at": "2026-04-02T07:42:15+00:00",
  "meridian_tools_version": "0.3.0",
  "meridian_version": "1.5.3",
  "artifacts": {
    "config_source": "00_run_metadata/config.source.yaml",
    "config_resolved": "00_run_metadata/config.resolved.yaml",
    "input_data_provenance": "00_run_metadata/input_data_provenance.json",
    "validation_spec": "10_validation/validation_spec.json",
    "meridian_model": "20_model_fit/meridian_model.binpb",
    "diagnostics_bundle": "30_model_assessment/diagnostics_bundle.json",
    "model_results_summary": "30_model_assessment/model_results_summary.html",
    "summary_metrics_csv": "40_decomposition/summary_metrics.csv",
    "summary_metrics_nc": "40_decomposition/summary_metrics.nc"
  },
  "stages": [
    {
      "name": "00_run_metadata",
      "status": "completed",
      "started_at": "2026-04-02T07:35:00+00:00",
      "finished_at": "2026-04-02T07:35:01+00:00",
      "elapsed_seconds": 0.5,
      "message": null,
      "artifacts": {
        "config_source": "00_run_metadata/config.source.yaml",
        "config_resolved": "00_run_metadata/config.resolved.yaml",
        "input_data_provenance": "00_run_metadata/input_data_provenance.json"
      }
    },
    {
      "name": "10_validation",
      "status": "completed",
      "started_at": "2026-04-02T07:35:01+00:00",
      "finished_at": "2026-04-02T07:35:01+00:00",
      "elapsed_seconds": 0.1,
      "message": null,
      "artifacts": {
        "validation_spec": "10_validation/validation_spec.json"
      }
    },
    {
      "name": "20_model_fit",
      "status": "completed",
      "started_at": "2026-04-02T07:35:01+00:00",
      "finished_at": "2026-04-02T07:40:30+00:00",
      "elapsed_seconds": 329.0,
      "message": null,
      "artifacts": {
        "meridian_model": "20_model_fit/meridian_model.binpb",
        "fit_metadata": "20_model_fit/fit_metadata.json"
      }
    },
    {
      "name": "30_model_assessment",
      "status": "completed",
      "started_at": "2026-04-02T07:40:30+00:00",
      "finished_at": "2026-04-02T07:41:00+00:00",
      "elapsed_seconds": 30.1,
      "message": null,
      "artifacts": {
        "diagnostics_bundle": "30_model_assessment/diagnostics_bundle.json",
        "review_summary": "30_model_assessment/review_summary.json",
        "model_results_summary": "30_model_assessment/model_results_summary.html",
        "model_selection_status": "30_model_assessment/model_selection_status.json"
      }
    },
    {
      "name": "40_decomposition",
      "status": "completed",
      "started_at": "2026-04-02T07:41:00+00:00",
      "finished_at": "2026-04-02T07:42:00+00:00",
      "elapsed_seconds": 60.0,
      "message": null,
      "artifacts": {
        "summary_metrics_nc": "40_decomposition/summary_metrics.nc",
        "summary_metrics_csv": "40_decomposition/summary_metrics.csv"
      }
    },
    {
      "name": "60_response_curves",
      "status": "skipped",
      "started_at": "2026-04-02T07:42:00+00:00",
      "finished_at": "2026-04-02T07:42:00+00:00",
      "elapsed_seconds": 0.0,
      "message": "No `response_curves` section was authored in the YAML config.",
      "artifacts": {}
    },
    {
      "name": "70_optimisation",
      "status": "skipped",
      "started_at": "2026-04-02T07:42:00+00:00",
      "finished_at": "2026-04-02T07:42:00+00:00",
      "elapsed_seconds": 0.0,
      "message": "No `optimisation` section was authored in the YAML config.",
      "artifacts": {}
    }
  ]
}

Version history

Version 3 (current)

Added input_data_provenance.json and made provenance available to lifecycle loading and compare surfaces.

Version 2

Added export_plots support, top-level artifacts index, status field, config_path, output_dir, and per-stage status, elapsed_seconds, and message fields.

Version 1

Added meridian_version field and response_curves / optimisation stages.

Version 0

Initial manifest schema with core stages and artefact tracking.

All four versions are supported by RunManifest.from_dict. Missing fields in older versions are filled with defaults.

Output schema reference

This page documents the complete run directory layout produced by meridian-tools. Every successful pipeline run creates a timestamped directory containing the artefacts described below.

Run directory structure

<run_name>_<YYYYMMDD_HHMMSS>/
├── run_manifest.json                        # Source of truth for the run
├── 00_run_metadata/
│   ├── config.source.yaml                   # Verbatim copy of the authored YAML
│   ├── config.resolved.yaml                 # YAML after path resolution
│   └── input_data_provenance.json           # Pinned source/resolution/hash metadata
├── 10_validation/                           # Only for validation-aware runs
│   └── validation_spec.json                 # Validation provenance record
├── 20_model_fit/
│   ├── meridian_model.binpb                 # Serialised Meridian model
│   └── fit_metadata.json                    # Fit settings and Meridian version
├── 30_model_assessment/
│   ├── diagnostics_bundle.json              # Diagnostics export manifest
│   ├── predictive_accuracy.csv              # Per-observation accuracy metrics
│   ├── review_summary.json                  # Meridian review battery results
│   ├── model_results_summary.html           # Meridian HTML summary report
│   ├── plots/                               # When export_plots: true
│   │   ├── model_fit.png
│   │   └── rhat_boxplot.png
│   │
│   │  # Model selection outputs (compatible runs):
│   ├── loo_summary.json                     # LOO summary statistics
│   ├── waic_summary.json                    # WAIC summary statistics
│   ├── loo_pointwise.csv                    # Per-observation LOO + Pareto k
│   ├── waic_pointwise.csv                   # Per-observation WAIC
│   └── model_comparison.csv                 # Ranked comparison table
│   │
│   │  # Model selection status (incompatible runs):
│   └── model_selection_status.json          # Reason code for unavailability
├── 40_decomposition/
│   ├── summary_metrics.nc                   # NetCDF decomposition dataset
│   ├── summary_metrics.csv                  # Tabular decomposition
│   └── plots/                               # When export_plots: true
│       ├── channel_contribution_area_chart.png
│       ├── contribution_waterfall_chart.png
│       ├── spend_vs_contribution_chart.png
│       └── roi_bar_chart.png
├── 60_response_curves/                      # Only when response_curves configured
│   ├── response_curves.nc                   # NetCDF response curve dataset
│   ├── response_curves.csv                  # Tabular response curves
│   └── plots/                               # When export_plots: true
│       └── response_curves_plot.png
└── 70_optimisation/                         # Only when optimisation configured
    ├── optimisation_summary.html            # Meridian optimisation HTML report
    ├── optimised_data.nc                    # Optimised allocation (NetCDF)
    ├── optimised_data.csv                   # Optimised allocation (CSV)
    ├── nonoptimised_data.nc                 # Baseline allocation (NetCDF)
    ├── nonoptimised_data.csv                # Baseline allocation (CSV)
    ├── optimisation_grid.csv                # Full optimisation grid
    └── plots/                               # When export_plots: true
        ├── incremental_outcome_delta_plot.png
        ├── budget_allocation_optimised_plot.png
        ├── budget_allocation_nonoptimised_plot.png
        ├── spend_delta_plot.png
        └── optimisation_response_curves_plot.png
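The directory name itself carries the run name and creation timestamp. A minimal sketch that splits the two, assuming the <run_name>_<YYYYMMDD_HHMMSS> pattern shown at the top of the tree; the helper name is hypothetical, and the timestamp is returned as a naive datetime because the time zone of the suffix is not stated here.

```python
import re
from datetime import datetime

# The run name may itself contain underscores, so anchor on the timestamp suffix.
RUN_DIR_PATTERN = re.compile(r"^(?P<run_name>.+)_(?P<stamp>\d{8}_\d{6})$")


def parse_run_dir_name(dir_name):
    """Split a run directory name into (run_name, timestamp)."""
    match = RUN_DIR_PATTERN.match(dir_name)
    if match is None:
        raise ValueError(f"not a dated run directory name: {dir_name}")
    created = datetime.strptime(match["stamp"], "%Y%m%d_%H%M%S")
    return match["run_name"], created
```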

Stage details

00_run_metadata

Always present. Created first.

Artefact Format Description
config.source.yaml YAML Verbatim copy of the source config for this run. On refresh, this is copied from the source run’s archived config.source.yaml.
config.resolved.yaml YAML Config after relative path resolution. Does not include runtime-only fields (output_dir, run_name).
input_data_provenance.json JSON Pinned input-data provenance: authored path, resolved path, SHA-256, file size, mtime, row count, column count, and ordered columns.

10_validation

Present only for validation-aware runs (blocked tail, rolling origin, authored holdout, or final fit with validation provenance).

Artefact Format Description
validation_spec.json JSON Full validation provenance. See validation-spec-schema.md.

20_model_fit

Always present.

Artefact Format Description
meridian_model.binpb Protocol Buffers Serialised Meridian model (requires google-meridian[schema]).
fit_metadata.json JSON Records FitConfig values and Meridian version.

30_model_assessment

Always present. Content varies by compatibility.

Artefact Format Condition Description
diagnostics_bundle.json JSON Always Diagnostics export manifest with status of each sub-export.
predictive_accuracy.csv CSV export_predictive_accuracy: true Predictive accuracy per observation.
review_summary.json JSON export_review_summary: true Meridian review battery results.
model_results_summary.html HTML Always Meridian HTML model summary.
plots/model_fit.png PNG export_plots: true Model fit visualisation.
plots/rhat_boxplot.png PNG export_plots: true R-hat convergence diagnostic boxplot.
loo_summary.json JSON Compatible + export_model_selection: true LOO summary.
waic_summary.json JSON Compatible + export_model_selection: true WAIC summary.
loo_pointwise.csv CSV Compatible + export_model_selection: true Per-observation LOO values.
waic_pointwise.csv CSV Compatible + export_model_selection: true Per-observation WAIC values.
model_comparison.csv CSV Compatible + export_model_selection: true Ranked model comparison.
model_selection_status.json JSON Incompatible + export_model_selection: true Reason for unavailability.

40_decomposition

Always present.

Artefact Format Description
summary_metrics.nc NetCDF Full decomposition dataset with coordinates.
summary_metrics.csv CSV Flattened tabular decomposition.
plots/channel_contribution_area_chart.png PNG Channel contribution over time.
plots/contribution_waterfall_chart.png PNG Contribution waterfall breakdown.
plots/spend_vs_contribution_chart.png PNG Spend vs. contribution scatter.
plots/roi_bar_chart.png PNG ROI by channel bar chart.

60_response_curves

Present only when the response_curves YAML section is configured.

Artefact Format Description
response_curves.nc NetCDF Response curve dataset across spend multipliers.
response_curves.csv CSV Flattened tabular response curves.
plots/response_curves_plot.png PNG Response curve visualisation.

70_optimisation

Present only when the optimisation YAML section is configured.

Artefact Format Description
optimisation_summary.html HTML Meridian optimisation summary report.
optimised_data.nc NetCDF Optimised budget allocation.
optimised_data.csv CSV Tabular optimised allocation.
nonoptimised_data.nc NetCDF Baseline (non-optimised) allocation.
nonoptimised_data.csv CSV Tabular baseline allocation.
optimisation_grid.csv CSV Full optimisation grid dataset.
plots/incremental_outcome_delta_plot.png PNG Incremental outcome delta.
plots/budget_allocation_optimised_plot.png PNG Optimised allocation chart.
plots/budget_allocation_nonoptimised_plot.png PNG Baseline allocation chart.
plots/spend_delta_plot.png PNG Spend delta between optimised and baseline.
plots/optimisation_response_curves_plot.png PNG Optimisation response curves.

Reading order for analysts

For a quick assessment of a completed run:

  1. run_manifest.json — run identity, timing, stage completion
  2. 00_run_metadata/config.source.yaml — what was authored
  3. 00_run_metadata/input_data_provenance.json — dataset identity and shape
  4. 30_model_assessment/diagnostics_bundle.json — diagnostics export state
  5. 30_model_assessment/model_results_summary.html — visual model summary
  6. 40_decomposition/summary_metrics.csv — easiest tabular output to inspect

For model selection:

  1. 30_model_assessment/loo_summary.json or model_selection_status.json

For scenario analysis:

  1. 60_response_curves/response_curves.csv
  2. 70_optimisation/optimisation_summary.html

Validation spec schema reference

The validation_spec.json artefact is written to 10_validation/ for every validation-aware pipeline run. It records the concrete validation provenance for that specific run, including the holdout strategy, split geometry, and date windows.

Fields

Field Type Description
mode "validation" | "final_fit" Whether this is a validation split or the final production fit.
strategy "none" | "blocked_tail" | "rolling_origin" | "authored_holdout" Validation strategy that produced this run.
split_label str Human-readable identifier for the split (e.g. "blocked_tail", "split_01", "final_fit").
holdout_source "generated_validation" | "authored_model_spec" | "none" How the holdout mask was produced.
generated_holdout bool Whether the holdout mask was auto-generated by meridian-tools.
run_name_suffix str Suffix appended to the run name for this split.
holdout_shape list[int] | null Shape of the holdout mask array. null for final-fit runs.
train_indices list[int] Integer indices into the time axis used for training.
test_indices list[int] Integer indices into the time axis used for testing. Empty for final-fit runs.
train_dates list[str] Date values corresponding to train_indices.
test_dates list[str] Date values corresponding to test_indices. Empty for final-fit runs.

Mode and strategy combinations

Mode Strategy Holdout source Description
validation blocked_tail generated_validation Auto-generated contiguous tail holdout.
validation rolling_origin generated_validation One split from an expanding-window plan.
validation authored_holdout authored_model_spec User-provided holdout mask from YAML.
final_fit none none Full-sample production fit after validation.

Invariants

  • Validation-mode specs always have a non-null holdout_shape.
  • Final-fit specs always have holdout_shape: null, empty test_indices, and empty test_dates.
  • train_indices and train_dates always have matching lengths.
  • test_indices and test_dates always have matching lengths.
  • Authored-holdout specs have empty train_indices, test_indices, train_dates, and test_dates.
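
The invariants above are simple enough to check mechanically on a loaded validation_spec.json payload. The following is a sketch only: `check_validation_spec` is a hypothetical function, not a meridian-tools API, and it assumes the payload is a plain dictionary with the fields listed in the table.

```python
def check_validation_spec(spec: dict) -> list[str]:
    """Return invariant violations; an empty list means the spec passes."""
    problems = []

    # Index and date lists must pair up one-to-one.
    if len(spec["train_indices"]) != len(spec["train_dates"]):
        problems.append("train_indices and train_dates differ in length")
    if len(spec["test_indices"]) != len(spec["test_dates"]):
        problems.append("test_indices and test_dates differ in length")

    if spec["mode"] == "validation" and spec["holdout_shape"] is None:
        problems.append("validation-mode spec has a null holdout_shape")

    if spec["mode"] == "final_fit":
        if spec["holdout_shape"] is not None:
            problems.append("final-fit spec has a non-null holdout_shape")
        if spec["test_indices"] or spec["test_dates"]:
            problems.append("final-fit spec has non-empty test indices or dates")

    if spec["strategy"] == "authored_holdout":
        for key in ("train_indices", "test_indices", "train_dates", "test_dates"):
            if spec[key]:
                problems.append(f"authored-holdout spec has non-empty {key}")

    return problems
```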

Example: blocked tail validation

{
  "mode": "validation",
  "strategy": "blocked_tail",
  "split_label": "blocked_tail",
  "holdout_source": "generated_validation",
  "generated_holdout": true,
  "run_name_suffix": "blocked_tail",
  "holdout_shape": [20],
  "train_indices": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
  "test_indices": [12, 13, 14, 15, 16, 17, 18, 19],
  "train_dates": ["2024-01-01", "2024-01-08", "..."],
  "test_dates": ["2024-03-25", "2024-04-01", "..."]
}

Example: rolling origin split

{
  "mode": "validation",
  "strategy": "rolling_origin",
  "split_label": "split_01",
  "holdout_source": "generated_validation",
  "generated_holdout": true,
  "run_name_suffix": "split_01",
  "holdout_shape": [60],
  "train_indices": [0, 1, 2, "...", 51],
  "test_indices": [52, 53, 54, 55],
  "train_dates": ["2024-01-01", "..."],
  "test_dates": ["2024-12-30", "2025-01-06", "2025-01-13", "2025-01-20"]
}

Example: final fit

{
  "mode": "final_fit",
  "strategy": "none",
  "split_label": "final_fit",
  "holdout_source": "none",
  "generated_holdout": false,
  "run_name_suffix": "final_fit",
  "holdout_shape": null,
  "train_indices": [0, 1, 2, "...", 59],
  "test_indices": [],
  "train_dates": ["2024-01-01", "...", "2025-02-17"],
  "test_dates": []
}

Note on holdout mask storage

The actual holdout mask array (boolean NumPy array) is not stored in validation_spec.json because it can be large for geo-panel models (n_geos × n_times). Only its holdout_shape is recorded. The mask is injected into the Meridian model at runtime and can be reconstructed from train_indices, test_indices, and the data geometry.
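
As a rough illustration of that reconstruction, the sketch below rebuilds a mask from holdout_shape and test_indices. It uses nested Python lists to stay dependency-free (the real mask is a NumPy array), and it assumes that time is the last axis and that the same periods are held out in every geo. `rebuild_holdout_mask` is a hypothetical name, not a meridian-tools function.

```python
def rebuild_holdout_mask(holdout_shape, test_indices):
    """Rebuild a boolean holdout mask; True marks held-out time periods."""
    n_times = holdout_shape[-1]  # assumption: time is the last axis
    held_out = set(test_indices)
    row = [t in held_out for t in range(n_times)]

    if len(holdout_shape) == 1:
        # National data: a single 1-D mask over time.
        return row
    if len(holdout_shape) == 2:
        # Geo-panel data: repeat the time mask for each geo.
        return [list(row) for _ in range(holdout_shape[0])]
    raise ValueError(f"unsupported holdout_shape: {holdout_shape!r}")
```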

Python API

Public Python APIs exposed by meridian-tools.

Subsections of Python API

meridian_tools.config

Configuration models and YAML loading for meridian-tools.

Module: meridian_tools.config

Functions

load_yaml_config

def load_yaml_config(path: str | Path) -> MeridianToolsConfig

Load and validate a meridian-tools YAML file.

Parameters:

  • path — Path to the YAML configuration file.

Returns: A validated MeridianToolsConfig instance.

Raises: pydantic.ValidationError if the YAML content does not match the schema.

Example:

from meridian_tools.config import load_yaml_config

config = load_yaml_config("project.yml")
print(config.project.name)
print(config.data.path)
print(config.validation.strategy)

Classes

MeridianToolsConfig

class MeridianToolsConfig(BaseModel)

Full YAML configuration for one meridian-tools run. This is the top-level model returned by load_yaml_config.

Attribute Type Default
project ProjectConfig ProjectConfig()
data CsvDataConfig required
model_spec ModelSpecConfig ModelSpecConfig()
fit FitConfig FitConfig()
validation ValidationConfig ValidationConfig()
exports ExportsConfig ExportsConfig()
response_curves ResponseCurvesConfig | None None
optimisation OptimisationConfig | None None

PipelineRunConfig

@dataclass(frozen=True)
class PipelineRunConfig

Runtime options that sit outside the YAML file. Passed to run_pipeline.

Attribute Type Default Description
config_path Path required Path to the YAML config file.
output_dir Path Path("runs") Directory for run output.
run_name str | None None
validation_spec ValidationRunSpec | None None
apply_run_name_suffix bool True Whether to append validation-aware suffixes to the run name.
source_config_path Path | None None

ProjectConfig

class ProjectConfig(BaseModel)

Attribute Type Default
name str "meridian-project"

CsvDataConfig

class CsvDataConfig(BaseModel)

CSV loader configuration compatible with Meridian’s CsvDataLoader.

Attribute Type Default
path Path required
kpi_type Literal["revenue", "non-revenue"] "revenue"
coord_to_columns dict[str, Any] required
media_to_channel dict[str, str] | None None
media_spend_to_channel dict[str, str] | None None
reach_to_channel dict[str, str] | None None
frequency_to_channel dict[str, str] | None None
rf_spend_to_channel dict[str, str] | None None
organic_reach_to_channel dict[str, str] | None None
organic_frequency_to_channel dict[str, str] | None None

ModelSpecConfig

class ModelSpecConfig(BaseModel)

Attribute Type Default
kwargs dict[str, Any] {}

FitConfig

class FitConfig(BaseModel)

Sampling configuration for Meridian posterior fitting.

Attribute Type Default
sample_prior_draws PositiveInt | None
n_chains PositiveInt | list[PositiveInt]
n_adapt PositiveInt 500
n_burnin PositiveInt 500
n_keep PositiveInt 1000
seed int | list[int] | ...
max_tree_depth PositiveInt 10
max_energy_diff float 500.0
unrolled_leapfrog_steps PositiveInt 1
parallel_iterations PositiveInt 10

ValidationConfig

class ValidationConfig(BaseModel)

Validation and holdout orchestration settings.

Attribute Type Default
strategy Literal["none", "blocked_tail", "rolling_origin"] "none"
holdout_size PositiveInt | None None
initial_train_size PositiveInt | None None
test_size PositiveInt | None None
step_size PositiveInt | None None
max_splits PositiveInt | None None

See the validation guide for cross-field validation rules.


ExportsConfig

class ExportsConfig(BaseModel)

Attribute Type Default
use_kpi bool False
batch_size PositiveInt 1000
export_predictive_accuracy bool True
export_review_summary bool True
export_model_selection bool True
export_plots bool True

ResponseCurvesConfig

class ResponseCurvesConfig(BaseModel)

Attribute Type Default Constraint
spend_multipliers list[float] required Non-empty, all >= 0
use_posterior bool True
by_reach bool True
use_optimal_frequency bool False
confidence_level float 0.9 0 < x < 1

OptimisationConfig

class OptimisationConfig(BaseModel)

Attribute Type Default Constraint
start_date str required ISO YYYY-MM-DD
end_date str required ISO YYYY-MM-DD, >= start_date
budget OptimisationBudgetConfig required
use_posterior bool True
use_optimal_frequency bool True
confidence_level float 0.9 0 < x < 1

OptimisationBudgetConfig

class OptimisationBudgetConfig(BaseModel)

Attribute Type Default
mode Literal["fixed_total", "relative_reference_window_total"] required
value PositiveFloat required

meridian_tools.runner

Pipeline orchestration for meridian-tools.

Module: meridian_tools.runner

Functions

run_pipeline

def run_pipeline(
    run_config: PipelineRunConfig,
    *,
    progress_callback: Callable | None = None,
) -> PipelineRunResult

Execute the full meridian-tools staged pipeline.

The pipeline proceeds through the following stages in order:

  1. 00_run_metadata — Archive source and resolved configs and write input_data_provenance.json.
  2. 10_validation — Write validation spec (if validation-aware).
  3. 20_model_fit — Build input data, construct the Meridian model, sample prior and posterior.
  4. 30_model_assessment — Export diagnostics, model summary, and model selection outputs.
  5. 40_decomposition — Export summary metrics.
  6. 60_response_curves — Export response curves (if configured).
  7. 70_optimisation — Export optimisation results (if configured).

The manifest is written to disk after each stage, so a failure mid-pipeline leaves a readable partial manifest.

Before creating the dated run directory, the runner enforces three separate pre-run checks:

  1. dependency preflight (google-meridian[schema], optional plot support)
  2. validation-execution contract checks for incompatible single-run validation combinations
  3. a narrow wrapper-owned config/data preflight over the resolved input file and authored column mapping

The wrapper-owned preflight checks exactly:

  • resolved data.path exists and is a regular file
  • the CSV header row can be read
  • the parsed header is non-empty
  • no parsed header cell is blank after trimming whitespace
  • every authored scalar entry in data.coord_to_columns exists in the header
  • every authored list member in data.coord_to_columns exists in the header
  • every authored key in media_to_channel, media_spend_to_channel, reach_to_channel, frequency_to_channel, rf_spend_to_channel, organic_reach_to_channel, and organic_frequency_to_channel exists in the header
  • authored list-valued coord families are non-empty
  • authored mapping fields above are non-empty
  • supported media/RF family groups are complete when authored

Header matching is exact and case-sensitive. Anything outside this closed matrix remains Meridian-owned validation.
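
To make the header rules concrete, here is a minimal sketch of the header and coord_to_columns checks only. It is not the runner's implementation: `preflight_header` and `PreflightSketchError` are hypothetical stand-ins (the real error type is ConfigPreflightError), the channel-mapping and family-completeness checks are omitted, and the coord keys in the usage example are illustrative.

```python
import csv
from pathlib import Path


class PreflightSketchError(ValueError):
    """Stand-in for ConfigPreflightError in this sketch."""


def preflight_header(data_path, coord_to_columns):
    """Check the CSV header against authored coord_to_columns entries."""
    path = Path(data_path)
    if not path.is_file():
        raise PreflightSketchError(f"data.path is not a regular file: {path}")

    with path.open(newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    if not header:
        raise PreflightSketchError("CSV header is empty")
    if any(not cell.strip() for cell in header):
        raise PreflightSketchError("CSV header contains a blank cell")

    known = set(header)  # exact, case-sensitive matching
    for coord, value in coord_to_columns.items():
        if isinstance(value, list) and not value:
            raise PreflightSketchError(f"coord_to_columns.{coord} is an empty list")
        columns = value if isinstance(value, list) else [value]
        for column in columns:
            if column not in known:
                raise PreflightSketchError(
                    f"coord_to_columns.{coord}: column {column!r} not in header"
                )
    return header
```

Because matching is case-sensitive, an authored entry of "Time" fails against a header cell of "time".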

Parameters:

  • run_config — A PipelineRunConfig specifying the execution config path, output directory, run name, optional validation spec, and optional source_config_path for metadata archival.
  • progress_callback — Optional callable invoked on stage lifecycle events. The callback receives keyword arguments:
    • stage_name (str) — stage identifier.
    • event (str) — one of "started", "completed", "skipped", or "failed".
    • stage_index (int) — 1-based position in the pipeline.
    • stage_count (int) — total number of stages.
    • elapsed_seconds (float) — wall-clock time (present for "completed" and "failed" events).
    • message (str) — human-readable detail (present for "skipped" and "failed" events).
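
A callback matching the keyword arguments above might look like the following sketch. The function name `log_stage_event` and the output format are illustrative choices; the optional arguments default to None because they are only present for some events.

```python
def log_stage_event(
    *,
    stage_name,
    event,
    stage_index,
    stage_count,
    elapsed_seconds=None,
    message=None,
    **_ignored,
):
    """Print one line per stage lifecycle event and return it."""
    line = f"[{stage_index}/{stage_count}] {stage_name}: {event}"
    if elapsed_seconds is not None:
        line += f" ({elapsed_seconds:.1f}s)"
    if message:
        line += f" - {message}"
    print(line)
    return line
```

It would be passed as `progress_callback=log_stage_event`. The trailing `**_ignored` is a defensive choice so the callback keeps working if further keyword arguments are added later.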

Returns: A PipelineRunResult with the run directory and manifest path.

Raises:

  • RuntimeError if Meridian schema support is unavailable (checked at preflight before the run directory is created).
  • RuntimeError if exports.export_plots is true but vl-convert-python is not installed (also checked at preflight).
  • ValidationExecutionContractError if the requested single-run validation execution path is incompatible with the authored config.
  • ConfigPreflightError if wrapper-owned config/data preflight fails before run-directory creation.
  • PipelineRunFailure if any exception occurs after the dated run directory already exists.

Example:

from pathlib import Path
from meridian_tools.config import PipelineRunConfig
from meridian_tools.runner import run_pipeline

result = run_pipeline(
    PipelineRunConfig(
        config_path=Path("project.yml"),
        output_dir=Path("runs"),
    )
)

print(result.run_dir)
print(result.manifest_path)

Classes

PipelineRunResult

@dataclass(frozen=True)
class PipelineRunResult

Disk locations for one completed meridian-tools run.

Attribute Type Description
run_dir Path Absolute path to the run directory.
manifest_path Path Absolute path to run_manifest.json.

ValidationExecutionContractError

class ValidationExecutionContractError(ValueError)

Raised when the requested single-run validation execution path is incompatible with the authored config. Current examples include direct rolling_origin execution through run_pipeline(...) and combining PipelineRunConfig.validation_spec with authored model_spec.kwargs.holdout_id.


ConfigPreflightError

class ConfigPreflightError(ValueError)

Raised when the wrapper-owned Phase 10 preflight fails before run-directory creation. This covers only the closed wrapper preflight boundary, not full Meridian model validation.


PipelineRunFailure

class PipelineRunFailure(RuntimeError)

Raised when a run fails after the dated run directory already exists. The original underlying exception is preserved via __cause__.

Attribute Type Description
run_dir Path Absolute failed run directory.
manifest_path Path Absolute path to the failed run manifest.
stage_name str | None Failing stage name when one is available.

Constants

Stage names

Constant Value
STAGE_RUN_METADATA "00_run_metadata"
STAGE_VALIDATION "10_validation"
STAGE_MODEL_FIT "20_model_fit"
STAGE_MODEL_ASSESSMENT "30_model_assessment"
STAGE_DECOMPOSITION "40_decomposition"
STAGE_RESPONSE_CURVES "60_response_curves"
STAGE_OPTIMISATION "70_optimisation"

PIPELINE_STAGE_ORDER

PIPELINE_STAGE_ORDER: tuple[str, ...] = (
    "00_run_metadata",
    "10_validation",
    "20_model_fit",
    "30_model_assessment",
    "40_decomposition",
    "60_response_curves",
    "70_optimisation",
)

The numbering gap at 50 is intentional, reserving space for future stages.

meridian_tools.cv

Cross-validation and holdout orchestration utilities.

Module: meridian_tools.cv

Functions

build_last_window_holdout_mask

def build_last_window_holdout_mask(
    time_index: Sequence[Any],
    holdout_size: int,
    geo_index: Sequence[Any] | None = None,
) -> np.ndarray

Build a blocked-tail holdout mask for Meridian’s holdout_id.

Returns a 1-D boolean mask for national data and a 2-D (n_geos, n_times) mask when geo_index is provided. The last holdout_size time periods are marked as True (held out).

Parameters:

  • time_index — Strictly increasing sequence of time period identifiers.
  • holdout_size — Number of tail periods to hold out. Must be positive and less than the length of time_index.
  • geo_index — Optional sequence of geo identifiers. If provided, the mask is broadcast across geos.

Returns: Boolean NumPy array.

Raises: ValueError for non-monotonic indices, undersized indices, or impossible holdout sizes.
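
The behaviour described above can be approximated in a few lines. This is a sketch of the idea rather than the library code: `sketch_last_window_mask` is a hypothetical name, and it returns nested lists instead of a NumPy array so that it runs without dependencies.

```python
def sketch_last_window_mask(time_index, holdout_size, geo_index=None):
    """Mark the last holdout_size periods as True (held out)."""
    n_times = len(time_index)
    # Each period must be strictly greater than the one before it.
    if any(a >= b for a, b in zip(time_index, time_index[1:])):
        raise ValueError("time_index must be strictly increasing")
    if not 0 < holdout_size < n_times:
        raise ValueError("holdout_size must be positive and less than len(time_index)")

    row = [t >= n_times - holdout_size for t in range(n_times)]
    if geo_index is None:
        return row  # 1-D mask for national data
    # 2-D (n_geos, n_times) mask: the same tail is held out in every geo.
    return [list(row) for _ in geo_index]
```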


build_rolling_origin_splits

def build_rolling_origin_splits(
    time_index: Sequence[Any],
    *,
    initial_train_size: int,
    test_size: int,
    step_size: int | None = None,
    max_splits: int | None = None,
) -> list[BlockedTimeSplit]

Create expanding-window blocked time splits for rolling-origin validation.

Parameters:

  • time_index — Strictly increasing sequence of time period identifiers.
  • initial_train_size — Size of the first training window.
  • test_size — Size of each test window.
  • step_size — Step between splits. Must equal test_size. Defaults to test_size.
  • max_splits — Maximum number of splits to generate. Must be >= 2 if set.

Returns: List of BlockedTimeSplit instances (at least 2).

Raises: ValueError for invalid parameters or if fewer than 2 splits can be generated.
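
The expanding-window geometry is easiest to see with concrete indices. The sketch below is an illustration under stated assumptions, not the library implementation: `sketch_rolling_origin` is a hypothetical function that works on a plain period count, returns tuples rather than BlockedTimeSplit instances, assumes labels of the form split_01, split_02, and assumes that max_splits keeps the earliest splits.

```python
def sketch_rolling_origin(
    n_times, initial_train_size, test_size, step_size=None, max_splits=None
):
    """Return (label, train_indices, test_indices) for expanding-window splits."""
    if initial_train_size <= 0 or test_size <= 0:
        raise ValueError("initial_train_size and test_size must be positive")
    step = test_size if step_size is None else step_size
    if step != test_size:
        raise ValueError("step_size must equal test_size")
    if max_splits is not None and max_splits < 2:
        raise ValueError("max_splits must be >= 2 if set")

    splits = []
    train_end = initial_train_size
    # Grow the training window by one step until the test window no longer fits.
    while train_end + test_size <= n_times:
        if max_splits is not None and len(splits) >= max_splits:
            break
        label = f"split_{len(splits) + 1:02d}"
        train = tuple(range(train_end))
        test = tuple(range(train_end, train_end + test_size))
        splits.append((label, train, test))
        train_end += step

    if len(splits) < 2:
        raise ValueError("fewer than 2 splits can be generated")
    return splits
```

With 60 periods, an initial training window of 52, and a test window of 4, this yields two splits: the first tests on indices 52 to 55 and the second on 56 to 59.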


build_validation_splits

def build_validation_splits(
    validation_config: ValidationConfig,
    time_index: Sequence[Any],
) -> list[BlockedTimeSplit]

Build deterministic split definitions from the typed validation config.

Dispatches to the appropriate split builder based on validation_config.strategy. Returns an empty list for strategy: none.

Parameters:

  • validation_config — A validated ValidationConfig instance.
  • time_index — Strictly increasing sequence of time period identifiers.

Returns: List of BlockedTimeSplit instances (empty for none).


build_validation_plan

def build_validation_plan(
    validation_config: ValidationConfig,
    time_index: Sequence[Any],
    geo_index: Sequence[Any] | None = None,
) -> ValidationPlan

Materialise concrete validation and final-fit run specs from one config.

For strategy: none, returns a plan with no validation runs and no final-fit run. For blocked_tail or rolling_origin, returns one ValidationRunSpec per split plus a final_fit_run spec that trains on the full time axis with no holdout.

Parameters:

  • validation_config — A validated ValidationConfig instance.
  • time_index — Strictly increasing sequence of time period identifiers.
  • geo_index — Optional sequence of geo identifiers for geo-panel models.

Returns: A ValidationPlan instance.

Example:

from meridian_tools.config import load_yaml_config
from meridian_tools.cv import build_validation_plan

config = load_yaml_config("project.yml")
plan = build_validation_plan(
    config.validation,
    time_index=["2024-01-01", "2024-01-08", "..."],
    geo_index=["US-CA", "US-NY"],
)

for run_spec in plan.validation_runs:
    print(run_spec.split_label, len(run_spec.train_indices), len(run_spec.test_indices))

if plan.final_fit_run:
    print("Final fit:", plan.final_fit_run.split_label)

Classes

BlockedTimeSplit

@dataclass(frozen=True)
class BlockedTimeSplit

One blocked time split for validation.

Attribute Type Description
label str Human-readable split label (e.g. "blocked_tail", "split_01").
train_indices tuple[int, ...] Integer indices into the time axis for training.
test_indices tuple[int, ...] Integer indices into the time axis for testing.
train_dates tuple[str, ...] Date values for training periods.
test_dates tuple[str, ...] Date values for test periods.

ValidationRunSpec

@dataclass(frozen=True)
class ValidationRunSpec

One concrete validation or final-fit run derived from a split plan. Passed to PipelineRunConfig.validation_spec to control a single pipeline execution.

Attribute Type Description
mode "validation" | "final_fit" Run mode.
strategy str Validation strategy.
split_label str Human-readable split identifier.
holdout_source str How the holdout mask was produced.
generated_holdout bool Whether the holdout was auto-generated.
holdout_id np.ndarray | None Concrete holdout mask (immutable).
train_indices tuple[int, ...] Training time indices.
test_indices tuple[int, ...] Test time indices.
train_dates tuple[str, ...] Training date values.
test_dates tuple[str, ...] Test date values.
run_name_suffix str Suffix for the run directory name.

Methods:

  • to_artifact_payload() — Returns the JSON-serialisable dictionary written to validation_spec.json.

ValidationPlan

@dataclass(frozen=True)
class ValidationPlan

Concrete validation runs and the separate final-fit run for one config.

Attribute Type Description
validation_runs tuple[ValidationRunSpec, ...] One spec per validation split.
final_fit_run ValidationRunSpec | None Full-sample final-fit spec. None for strategy: none.

meridian_tools.exports

Helpers for manifest-backed Meridian export families.

Module: meridian_tools.exports

Functions

export_model_fit_artifacts

def export_model_fit_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    fit_config: FitConfig,
    meridian_version: str | None,
) -> dict[str, Path]

Write the stable model-fit artefact set.

Produces:

  • meridian_model.binpb — Serialised Meridian model (Protocol Buffers).
  • fit_metadata.json — Records FitConfig values and Meridian version.

Parameters:

  • model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • fit_config — The FitConfig used for this run.
  • meridian_version — Meridian version string (or None).

Returns: Dictionary mapping artefact names to file paths.


export_model_assessment_artifacts

def export_model_assessment_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    exports_config: ExportsConfig,
    diagnostics_exporter: Callable,
    model_selection_exporter: Callable,
) -> dict[str, Path]

Write the stable assessment artefact set.

Produces the diagnostics bundle, the model results summary HTML, and, optionally, model selection outputs (LOO/WAIC) and diagnostic plots.

Parameters:

  • model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • exports_config — Export switches.
  • diagnostics_exporter — Callable for diagnostics bundle export (typically export_diagnostics_bundle).
  • model_selection_exporter — Callable for model selection export.

Returns: Dictionary mapping artefact names to file paths.


export_decomposition_artifacts

def export_decomposition_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable decomposition artefact set.

Produces:

  • summary_metrics.nc — NetCDF decomposition dataset.
  • summary_metrics.csv — Flattened tabular decomposition.
  • plots/ — Channel contribution, waterfall, spend vs. contribution, and ROI charts (when export_plots: true).

Parameters:

  • model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.


export_response_curve_artifacts

def export_response_curve_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    response_curves_config: ResponseCurvesConfig,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable response-curve artefact set.

Produces:

  • response_curves.nc — NetCDF response curve dataset.
  • response_curves.csv — Flattened tabular response curves.
  • plots/response_curves_plot.png — Response curve visualisation (when export_plots: true).

Parameters:

  • model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • response_curves_config — Response curves settings from YAML.
  • exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.


export_optimisation_artifacts

def export_optimisation_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    optimisation_config: OptimisationConfig,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable optimisation artefact set.

Produces:

  • optimisation_summary.html — Meridian optimisation summary report.
  • optimised_data.nc / .csv — Optimised budget allocation.
  • nonoptimised_data.nc / .csv — Baseline allocation.
  • optimisation_grid.csv — Full optimisation grid.
  • plots/ — Delta, allocation, spend, and response curve charts (when export_plots: true).

For budget.mode: relative_reference_window_total, the effective budget is computed as value × total_spend_in_reference_window, using the model’s media and RF spend data between start_date and end_date.
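
A worked example of the budget arithmetic, as a sketch: `effective_budget` is a hypothetical helper and the spend figures are invented for illustration. The fixed_total branch assumes that value is used directly as the total budget, which the mode name suggests but this page does not spell out.

```python
def effective_budget(mode, value, spend_in_window):
    """Compute the budget passed to optimisation for a given budget mode."""
    if mode == "fixed_total":
        # Assumption: value is already the total budget.
        return float(value)
    if mode == "relative_reference_window_total":
        # value scales the total media and RF spend observed in the window.
        return value * sum(spend_in_window.values())
    raise ValueError(f"unknown budget mode: {mode!r}")


# Illustrative spend totals within the reference window.
window_spend = {"tv": 100_000.0, "search": 50_000.0}
budget = effective_budget("relative_reference_window_total", 1.1, window_spend)
```

Here a value of 1.1 requests a budget 10% above the 150,000 spent in the reference window, so roughly 165,000.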

Parameters:

  • model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • optimisation_config — Optimisation settings from YAML.
  • exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.


ensure_meridian_schema_support

def ensure_meridian_schema_support() -> Callable

Return Meridian’s schema serialiser or raise a stable runtime error.

Checks for meridian.schema.serde.meridian_serde.save_meridian. If the import fails, raises RuntimeError with guidance to install google-meridian[schema].

Returns: The save_meridian callable.


ensure_altair_png_support

def ensure_altair_png_support() -> Any

Return the Altair PNG backend or raise a stable runtime error.

Checks for vl_convert. If the import fails, raises RuntimeError with guidance to install vl-convert-python.

Returns: The vl_convert module.

meridian_tools.diagnostics

Diagnostics extraction and export helpers for Meridian runs.

Module: meridian_tools.diagnostics

Functions

predictive_accuracy_frame

def predictive_accuracy_frame(
    meridian_model: Any,
    *,
    use_kpi: bool = False,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
    batch_size: int = 1000,
) -> pd.DataFrame

Return Meridian predictive accuracy as a flat DataFrame.

Uses Meridian’s Analyzer.predictive_accuracy internally and flattens the resulting xarray dataset into a pandas DataFrame.

Parameters:

  • meridian_model — Fitted Meridian model instance.
  • use_kpi — Use KPI-based metrics.
  • selected_geos — Optional subset of geos to evaluate.
  • selected_times — Optional subset of time periods to evaluate.
  • batch_size — Batch size for Meridian analysis.

Returns: A pandas DataFrame with one row per observation.


review_summary_dict

def review_summary_dict(
    meridian_model: Any,
    *,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
) -> dict[str, Any]

Run Meridian’s review battery and return a JSON-ready dictionary.

Uses Meridian’s ModelReviewer internally. All non-primitive values (dataclasses, enums, NumPy arrays) are recursively converted to JSON-serialisable types.

Parameters:

  • meridian_model — Fitted Meridian model instance.
  • selected_geos — Optional subset of geos.
  • selected_times — Optional subset of time periods.

Returns: A JSON-serialisable dictionary.


export_diagnostics_bundle

def export_diagnostics_bundle(
    meridian_model: Any,
    output_dir: str | Path,
    *,
    use_kpi: bool = False,
    export_predictive_accuracy: bool = True,
    export_review_summary: bool = True,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
    batch_size: int = 1000,
) -> dict[str, Path]

Write predictive accuracy, review summary, and bundle manifest to disk.

The bundle manifest (diagnostics_bundle.json) records the status of each sub-export ("exported" or "disabled") along with the file name and format. This provides a stable machine-readable contract for downstream consumers.

When an export is disabled, any pre-existing file from a previous run at the same path is removed to prevent stale data.

Parameters:

  • meridian_model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • use_kpi — Use KPI-based metrics.
  • export_predictive_accuracy — Write predictive_accuracy.csv.
  • export_review_summary — Write review_summary.json.
  • selected_geos — Not supported in the current scope; passing a value raises ValueError.
  • selected_times — Not supported in the current scope; passing a value raises ValueError.
  • batch_size — Batch size for Meridian analysis.

Returns: Dictionary mapping artefact names to file paths. Always includes "diagnostics_bundle". Conditionally includes "predictive_accuracy" and "review_summary".

Example:

from meridian_tools.diagnostics import export_diagnostics_bundle

artifacts = export_diagnostics_bundle(
    fitted_model,
    "output/30_model_assessment",
    export_predictive_accuracy=True,
    export_review_summary=True,
)

print(artifacts["diagnostics_bundle"])
# Path("output/30_model_assessment/diagnostics_bundle.json")

meridian_tools.model_selection

Model-selection helpers layered on top of ArviZ and Meridian.

Module: meridian_tools.model_selection

Functions

has_log_likelihood

def has_log_likelihood(candidate: Any) -> bool

Return whether the candidate exposes a non-empty log_likelihood group.

Accepts either an ArviZ InferenceData or any object with an .inference_data attribute (e.g. a fitted Meridian model).

Parameters:

  • candidate — ArviZ InferenceData or fitted Meridian model.

Returns: True if a non-empty log_likelihood group exists.


compute_loo

def compute_loo(
    candidate: Any,
    *,
    pointwise: bool = False,
    scale: str = "log",
) -> InformationCriterionResult

Compute PSIS-LOO for a Meridian model or InferenceData.

If the candidate is a fitted Meridian model without a log_likelihood group, the function automatically reconstructs it through attach_log_likelihood.

Parameters:

  • candidate — Fitted Meridian model or ArviZ InferenceData with log_likelihood.
  • pointwise — Include per-observation LOO values and Pareto k diagnostics.
  • scale — Scale for ELPD computation ("log", "negative_log", or "deviance").

Returns: An InformationCriterionResult with kind="loo".

Raises: ModelSelectionError if log-likelihood cannot be obtained.


compute_waic

def compute_waic(
    candidate: Any,
    *,
    pointwise: bool = False,
    scale: str = "log",
) -> InformationCriterionResult

Compute WAIC for a Meridian model or InferenceData.

Same automatic log-likelihood reconstruction as compute_loo.

Parameters:

  • candidate — Fitted Meridian model or ArviZ InferenceData with log_likelihood.
  • pointwise — Include per-observation WAIC values.
  • scale — Scale for ELPD computation.

Returns: An InformationCriterionResult with kind="waic".

Raises: ModelSelectionError if log-likelihood cannot be obtained.


compare_models

def compare_models(
    candidates: Mapping[str, Any],
    *,
    ic: str = "loo",
    scale: str = "log",
) -> pd.DataFrame

Compare multiple models with ArviZ compare.

Parameters:

  • candidates — Dictionary mapping model names to fitted Meridian models or InferenceData objects.
  • ic — Information criterion to use: "loo" or "waic".
  • scale — Scale for ELPD computation.

Returns: A pandas DataFrame with columns: model, rank, elpd_{ic}, p_{ic}, elpd_diff, weight, se, dse, warning, scale. Ranked by ELPD (rank 0 is best).

For a single candidate, returns a one-row DataFrame with rank=0, elpd_diff=0.0, and weight=1.0.

Raises:

  • ValueError if ic is not "loo" or "waic", or if candidates is empty.
  • ModelSelectionError if any candidate lacks log-likelihood data.

Classes

ModelSelectionError

class ModelSelectionError(RuntimeError)

Raised when information criteria cannot be computed.

Property Type Description
reason_code str | None Structured code identifying the failure reason.

Known reason codes:

Code Meaning
missing_log_likelihood_group InferenceData has no log_likelihood group and cannot be reconstructed.
holdout_fit_unsupported Model was fitted with a holdout mask.
requires_fitted_meridian_model Missing posterior samples or ArviZ InferenceData.
meridian_internal_seam_incompatible Meridian version lacks required reconstruction methods.
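
Callers that branch on reason_code may find a lookup table convenient. The sketch below mirrors the table above; `REASON_MEANINGS` and `explain_reason` are hypothetical names, and the fallback covers codes introduced after this page was written.

```python
from __future__ import annotations

# Meanings copied from the known reason codes table.
REASON_MEANINGS = {
    "missing_log_likelihood_group": (
        "InferenceData has no log_likelihood group and cannot be reconstructed."
    ),
    "holdout_fit_unsupported": "Model was fitted with a holdout mask.",
    "requires_fitted_meridian_model": (
        "Missing posterior samples or ArviZ InferenceData."
    ),
    "meridian_internal_seam_incompatible": (
        "Meridian version lacks required reconstruction methods."
    ),
}


def explain_reason(reason_code: str | None) -> str:
    """Map a reason code to a readable message, tolerating unknown codes."""
    if reason_code is None:
        return "No structured reason code was provided."
    return REASON_MEANINGS.get(reason_code, f"Unrecognised reason code: {reason_code}")
```

In an except block this would typically be called as `explain_reason(err.reason_code)`.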

InformationCriterionResult

@dataclass(frozen=True)
class InformationCriterionResult

Summary of one information-criterion computation.

Attribute Type Description
kind str "loo" or "waic".
summary dict[str, Any] Summary statistics (ELPD, p, SE, etc.).
pointwise pd.DataFrame | None Per-observation values (if pointwise=True).

meridian_tools.log_likelihood

Log-likelihood computation and attachment for Meridian models.

Module: meridian_tools.log_likelihood

Functions

compute_log_likelihood_dataset

def compute_log_likelihood_dataset(
    meridian_model: Any,
) -> xr.Dataset

Compute the pointwise log-likelihood dataset for a fitted Meridian model.

This function reconstructs the joint distribution from the posterior samples and computes observation-level log-likelihood values. It handles both geo-panel and national models.

The reconstruction recovers unsaved posterior parameters (e.g. geo deviations, tau_g_excl_baseline) that Meridian does not persist to InferenceData by default.

Parameters:

  • meridian_model — A fitted Meridian model with posterior samples and a compatible posterior_sampler_callable.

Returns: An xarray Dataset with a log_likelihood variable.

Raises: ModelSelectionError if the model does not expose the required internal reconstruction seams or lacks posterior samples.


attach_log_likelihood

def attach_log_likelihood(
    meridian_model: Any,
    *,
    in_place: bool = False,
) -> az.InferenceData

Attach a log_likelihood group to a Meridian model’s InferenceData.

If the model’s InferenceData already has a non-empty log_likelihood group, it is returned as-is (or the existing InferenceData is returned for in_place=True).

Parameters:

  • meridian_model — A fitted Meridian model.
  • in_place — If True, mutates meridian_model.inference_data directly. If False (default), returns a deep copy with the log_likelihood group attached and leaves the original model unmodified.

Returns: An ArviZ InferenceData with a log_likelihood group.

Raises:

  • ModelSelectionError with reason_code="meridian_internal_seam_incompatible" if the Meridian version lacks the required private reconstruction methods.
  • ModelSelectionError with reason_code="requires_fitted_meridian_model" if the model has no posterior samples.
  • ModelSelectionError with reason_code="holdout_fit_unsupported" if the model was fitted with a holdout mask.

Example:

from meridian_tools.log_likelihood import attach_log_likelihood

# Non-mutating (default)
idata = attach_log_likelihood(fitted_model, in_place=False)
assert hasattr(idata, "log_likelihood")

# Mutating
attach_log_likelihood(fitted_model, in_place=True)
assert hasattr(fitted_model.inference_data, "log_likelihood")

Implementation notes

The reconstruction accesses three private methods on Meridian’s posterior_sampler_callable:

  • _get_joint_dist_unpinned
  • _prepare_latents_for_reconstruction
  • _reconstruct_posteriors

These are Meridian-internal and may change without notice. If any method is missing, a ModelSelectionError with reason_code="meridian_internal_seam_incompatible" is raised instead of crashing. See the Meridian integration notes for details on this coupling boundary.

meridian_tools.lifecycle

Post-run record management: loading, listing, comparing, and refreshing runs.

Module: meridian_tools.lifecycle

Functions

resolve_run_directory

def resolve_run_directory(path: str | Path) -> Path

Return the absolute resolved run directory for a run path or manifest path.

If path points to a file, it must be named run_manifest.json; the function returns its parent directory. If path is a directory, it must contain run_manifest.json.

Parameters:

  • path — Path to a run directory or to run_manifest.json directly.

Returns: Absolute Path to the run directory.

Raises: LifecycleError if the path does not exist, is an unexpected file, or the directory does not contain run_manifest.json.


load_run_record

def load_run_record(path: str | Path) -> RunRecord

Load one run directory through the versioned lifecycle contract.

Resolves the run directory, parses the manifest, and resolves artefact paths. Required artefacts (config_source, config_resolved) must be present in the manifest and exist on disk. Manifest version 3 runs must also include input_data_provenance. Optional artefacts (validation_spec, diagnostics_bundle, model_selection_status) are resolved when present and set to None when absent.

Parameters:

  • path — Path to a run directory or to run_manifest.json directly.

Returns: A validated RunRecord instance.

Raises: LifecycleError for missing required artefacts, malformed manifests, artefact path traversal, or claimed-but-missing artefacts.


list_run_records

def list_run_records(root: str | Path) -> list[RunRecord]

Discover direct child run directories under one output root.

Scans direct child directories of root for run_manifest.json files. Returns records sorted by started_at (most recent first), with directory name as a secondary sort key.

Parameters:

  • root — Directory to scan. Must be a directory, not a file.

Returns: List of RunRecord instances.

Raises: LifecycleError if root is not a directory or if any discovered run has an invalid manifest.
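
The documented ordering can be reproduced with two stable sorts. The sketch below uses plain dictionaries with hypothetical keys (dir_name, started_at) in place of RunRecord, and assumes the secondary key sorts ascending and that all timestamps share one UTC ISO-8601 format, so string comparison matches chronological order.

```python
def sort_runs_sketch(records: list[dict[str, str]]) -> list[dict[str, str]]:
    # Python's sort is stable, so order by the secondary key first...
    by_name = sorted(records, key=lambda record: record["dir_name"])
    # ...then by the primary key, most recent first. Ties keep the name order.
    return sorted(by_name, key=lambda record: record["started_at"], reverse=True)
```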


build_refresh_run_config

def build_refresh_run_config(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunConfig

Build a runtime refresh config from one stored run directory.

The execution config path points to the source run’s config.resolved.yaml. The returned PipelineRunConfig.source_config_path preserves the source run’s archived config.source.yaml so the refresh can re-copy the original YAML into the new run metadata. The output directory defaults to the source run’s parent directory (creating a sibling run). For validation runs, the validation spec is reconstructed from the stored validation_spec.json.

Parameters:

  • path — Path to the run directory or manifest to refresh.
  • output_dir — Override the output directory (default: source parent).
  • run_name — Override the run name.

Returns: A PipelineRunConfig ready for run_pipeline.

Raises: LifecycleError if the source run cannot be loaded or if authored-holdout refresh requirements are not met.


refresh_run

def refresh_run(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunResult

Execute a non-destructive refresh run from one stored lifecycle record.

This is a convenience function that calls build_refresh_run_config followed by run_pipeline. The original run directory is never modified.

Parameters:

  • path — Path to the run directory or manifest to refresh.
  • output_dir — Override the output directory (default: source parent).
  • run_name — Override the run name.

Returns: A PipelineRunResult for the new run.


compare_run_records

def compare_run_records(
    left: str | Path,
    right: str | Path,
) -> pd.DataFrame

Compare two run records at the pinned metadata layer.

Loads both run records and compares run name, status, versions, validation spec presence, diagnostics statuses, model selection availability, and input-data provenance.

Parameters:

  • left — Path to the first run directory or manifest.
  • right — Path to the second run directory or manifest.

Returns: A pandas DataFrame with columns field, left, right, status, and changed. Rows follow a fixed order:

Row (field) Description
run_name Human-readable run name.
status Overall run status.
meridian_tools_version meridian-tools version.
meridian_version Google Meridian version.
has_validation_spec Whether a validation spec is present.
has_diagnostics_bundle Whether a diagnostics bundle is present.
predictive_accuracy_status Status from the diagnostics bundle.
review_summary_status Status from the diagnostics bundle.
has_model_selection_outputs Whether LOO/WAIC outputs are present.
model_selection_reason_code Reason code if model selection is unavailable.
input_authored_path YAML-owned data.path string.
input_resolved_path Absolute runtime input path.
input_mtime_utc Input file mtime.
input_sha256 Input file SHA-256 digest.
input_size_bytes Input file size in bytes.
input_row_count Input row count.
input_column_count Input column count.
input_ordered_columns Input CSV column order.

For provenance rows, status is "legacy_unknown" and changed is None when either run predates manifest version 3 and therefore has no stored provenance payload.

Raises: LifecycleError if either run cannot be loaded or if diagnostics or model selection artefacts are malformed.
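
The row logic, including the "legacy_unknown" rule for provenance rows, might look roughly like the sketch below. It works on plain dictionaries rather than RunRecord objects; compare_fields_sketch is a hypothetical helper, and the "changed" and "unchanged" status labels are invented for the example because the non-legacy status values are not listed here.

```python
from typing import Any

# Subset of the provenance rows listed above, kept short for the example.
PROVENANCE_FIELDS = ("input_sha256", "input_row_count")


def compare_fields_sketch(
    left: dict[str, Any],
    right: dict[str, Any],
    *,
    left_version: int,
    right_version: int,
) -> list[dict[str, Any]]:
    legacy = min(left_version, right_version) < 3
    rows = []
    for field in left:  # dict insertion order gives a fixed row order
        row = {"field": field, "left": left.get(field), "right": right.get(field)}
        if field in PROVENANCE_FIELDS and legacy:
            # No stored provenance payload on at least one side.
            row.update(status="legacy_unknown", changed=None)
        else:
            changed = row["left"] != row["right"]
            # Hypothetical status labels, not the package's values.
            row.update(status="changed" if changed else "unchanged", changed=changed)
        rows.append(row)
    return rows
```

Passing the returned list to pandas.DataFrame would give the five documented columns in order.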


Classes

RunRecord

@dataclass(frozen=True)
class RunRecord

Resolved lifecycle view over one on-disk run directory.

Attribute                    Type         Description
run_dir                      Path         Absolute path to the run directory.
manifest_path                Path         Absolute path to run_manifest.json.
manifest                     RunManifest  Parsed manifest with stages, timestamps, and versions.
config_source_path           Path         Absolute path to config.source.yaml. Always present.
config_resolved_path         Path         Absolute path to config.resolved.yaml. Always present.
input_data_provenance_path   Path | None  Path to input_data_provenance.json. Required for manifest version 3 runs, otherwise None.
validation_spec_path         Path | None  Path to validation_spec.json, or None if absent.
diagnostics_bundle_path      Path | None  Path to diagnostics_bundle.json, or None if absent.
model_selection_status_path  Path | None  Path to model_selection_status.json, or None if absent.

Required attributes (config_source_path, config_resolved_path) are always present. input_data_provenance_path is present for manifest version 3 runs. Other optional attributes are None when the corresponding artefact was not produced by the run or is absent from the manifest.

Example:

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

# Required — always available
print(record.config_source_path)
print(record.config_resolved_path)

# Optional — may be None
if record.diagnostics_bundle_path:
    print(f"Diagnostics: {record.diagnostics_bundle_path}")
if record.validation_spec_path:
    print(f"Validation spec: {record.validation_spec_path}")

LifecycleError

class LifecycleError(RuntimeError)

Raised when a run directory cannot be loaded through the lifecycle contract. All lifecycle functions raise this exception type instead of generic ValueError or RuntimeError.

meridian_tools.artifacts

Manifest and JSON helpers for run artefact management.

Module: meridian_tools.artifacts

Functions

write_json

def write_json(path: str | Path, payload: Any) -> None

Write a JSON-serialisable payload to disk with UTF-8 encoding and 2-space indentation. Creates parent directories if they do not exist.


write_manifest

def write_manifest(path: str | Path, manifest: RunManifest) -> None

Serialise and write a RunManifest to disk as JSON using write_json.


normalize_artifact_paths

def normalize_artifact_paths(
    run_dir: str | Path,
    artifacts: Mapping[str, str | Path],
) -> dict[str, str]

Convert artefact paths to relative paths against run_dir so the manifest stores portable references.

Parameters:

  • run_dir — The run directory root.
  • artifacts — Mapping of artefact names to file paths.

Returns: Dictionary mapping artefact names to relative path strings.


timestamp_utc

def timestamp_utc() -> str

Return the current time as a UTC ISO-8601 string with second precision.
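
One standard-library way to produce such a string is shown below. The exact output format of timestamp_utc (for example "+00:00" versus "Z" as the UTC suffix) is not stated above, so treat the suffix in this sketch as an assumption.

```python
from datetime import datetime, timezone


def timestamp_utc_sketch() -> str:
    # timespec="seconds" drops the microsecond component.
    return datetime.now(timezone.utc).isoformat(timespec="seconds")
```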


Classes

RunManifest

@dataclass
class RunManifest

Machine-readable summary of one meridian-tools run.

Attribute               Type               Default                   Description
run_name                str                required                  Human-readable run name.
config_path             Path               required                  Path to the authored YAML config file.
output_dir              Path               required                  Path to the run directory.
started_at              str                required                  UTC ISO-8601 start timestamp.
manifest_version        int                CURRENT_MANIFEST_VERSION  Schema version (0, 1, 2, or 3).
status                  str                "running"                 Overall run status: "running", "completed", or "failed".
finished_at             str | None         None                      UTC ISO-8601 finish timestamp. None while the run is in progress.
meridian_tools_version  str                __version__               Version of meridian-tools.
meridian_version        str | None         None                      Version of Google Meridian.
artifacts               dict[str, str]     {}                        Top-level artefact index. Key artefacts from stages are promoted here.
stages                  list[StageRecord]  []                        Ordered list of stage records (completed, skipped, and failed).

Class methods:

  • from_dict(payload: Mapping[str, Any]) -> RunManifest — Deserialise from a JSON-parsed dictionary. Supports manifest versions 0, 1, 2, and 3 with default values for missing fields in older versions. Raises ValueError for unsupported versions or missing required fields.

Instance methods:

  • to_dict() -> dict[str, Any] — Serialise to a JSON-compatible dictionary.

StageRecord

@dataclass
class StageRecord

One pipeline stage entry in the run manifest.

Attribute        Type            Default    Description
name             str             required   Stage identifier (for example, "00_run_metadata").
status           str             "pending"  Stage status: "pending", "running", "completed", "skipped", or "failed".
started_at       str | None      None       UTC ISO-8601 start timestamp.
finished_at      str | None      None       UTC ISO-8601 finish timestamp.
elapsed_seconds  float | None    None       Wall-clock seconds for stage execution.
message          str | None      None       Human-readable message (skip reason or error detail).
artifacts        dict[str, str]  {}         Map of artefact names to relative paths. Empty for skipped stages.

Class methods:

  • from_dict(payload: Mapping[str, Any]) -> StageRecord — Deserialise from a JSON-parsed dictionary. Raises ValueError if name is missing.

InputDataProvenance

@dataclass(frozen=True)
class InputDataProvenance

Pinned input-data provenance payload used by manifest version 3 runs.

Attribute           Type             Default                        Description
authored_path       str              required                       Exact data.path string from the source YAML.
resolved_path       str              required                       Absolute runtime path used for input loading.
sha256              str              required                       SHA-256 digest of the resolved input file.
size_bytes          int              required                       Input file size in bytes.
mtime_utc           str              required                       Input file modification time in UTC ISO-8601 format.
row_count           int              required                       Number of CSV data rows.
column_count        int              required                       Number of CSV columns.
ordered_columns     tuple[str, ...]  required                       CSV header order.
provenance_version  int              INPUT_DATA_PROVENANCE_VERSION  Payload schema version.

Class methods:

  • from_dict(payload: Mapping[str, Any]) -> InputDataProvenance — Validates the exact pinned Phase 09 key set and types.

Instance methods:

  • to_dict() -> dict[str, Any] — Serialise to the exact JSON payload written into input_data_provenance.json.

Constants

CURRENT_MANIFEST_VERSION

CURRENT_MANIFEST_VERSION: int = 3

SUPPORTED_MANIFEST_VERSIONS

SUPPORTED_MANIFEST_VERSIONS: tuple[int, ...] = (0, 1, 2, 3)

INPUT_DATA_PROVENANCE_VERSION

INPUT_DATA_PROVENANCE_VERSION: int = 1

REQUIRED_MANIFEST_ARTIFACTS

REQUIRED_MANIFEST_ARTIFACTS: tuple[str, ...] = (
    "config_resolved",
    "config_source",
    "input_data_provenance",
    "diagnostics_bundle",
)

These artefact entries are validated at run completion time by the runner. New runs must produce all four to complete successfully.

The lifecycle loader enforces config_source and config_resolved as required for all supported manifests. It also enforces input_data_provenance for manifest version 3 runs. diagnostics_bundle remains optional, so older or partial runs can still be loaded without it.
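
The difference between what the runner requires at completion and what the loader requires at load time can be summarised in a few lines. The helper names below are hypothetical; only the artefact names and the version 3 rule come from the text above.

```python
REQUIRED_AT_COMPLETION_SKETCH = (
    "config_resolved",
    "config_source",
    "input_data_provenance",
    "diagnostics_bundle",
)


def loader_required_sketch(manifest_version: int) -> tuple[str, ...]:
    required = ["config_source", "config_resolved"]
    if manifest_version == 3:
        required.append("input_data_provenance")
    # diagnostics_bundle is never required by the loader.
    return tuple(required)


def missing_for_loader_sketch(manifest_version: int, artifacts: dict[str, str]) -> list[str]:
    return [name for name in loader_required_sketch(manifest_version) if name not in artifacts]
```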

Concepts

Background material on architecture, design decisions, and Meridian integration boundaries.

Pages

  • Architecture — meridian-tools is a companion package designed for agency teams that use Google Meridian as their client-facing MMM (Marketing Mix Modelling) engine. It provides a stricter, more reproducible workflow around Meridian without forking the upstream library.
  • Design decisions — This document records the key design decisions in meridian-tools and the reasoning behind them. It is intended for maintainers and contributors who need to understand why things are built the way they are.
  • Meridian integration — This document describes how meridian-tools integrates with Google Meridian, the boundaries of that integration, and the risks associated with different coupling levels.

Subsections of Concepts

Architecture

meridian-tools is a companion package designed for agency teams that use Google Meridian as their client-facing MMM (Marketing Mix Modelling) engine. It provides a stricter, more reproducible workflow around Meridian without forking the upstream library.

Core philosophy

  1. No forking — meridian-tools strictly wraps Meridian. It does not modify Meridian’s internal code or model implementations.
  2. Reproducibility — All runs are driven by typed YAML configurations that are archived with each run, so a model can be re-run from the same specification.
  3. Structured workflow — The package enforces a staged execution pipeline (validation, model fit, assessment, decomposition, response curves, optimisation).
  4. Lifecycle management — Runs are treated as immutable artefacts with rich metadata, allowing for easy comparison, refreshing, and storage.

Module map

meridian_tools/
├── __init__.py          Lazy-loading package exports
├── artifacts.py         Manifest and JSON helpers
├── cli.py               CLI entry point (argparse)
├── config.py            Pydantic YAML models
├── cv.py                Validation split logic
├── demo.py              Bundled demo discovery
├── diagnostics.py       Diagnostics export
├── exports.py           Meridian analysis surface wrappers
├── launcher.py          Run execution wrapper
├── lifecycle.py         Post-run record management
├── log_likelihood.py    Log-likelihood reconstruction adapter
├── model_selection.py   ArviZ LOO/WAIC wrappers
├── terminal.py          CLI presentation and warning grouping
└── version.py           Static version

Layered import design

Meridian and TensorFlow are never imported at module level in the configuration, validation, or CLI layers. This means lightweight operations respond instantly:

Operation                   Imports loaded
meridian-tools --help       pydantic, yaml
load_yaml_config(path)      pydantic, yaml
build_validation_plan(...)  numpy
run_pipeline(...)           Everything (Meridian, TF, ArviZ, etc.)

The __init__.py uses __getattr__-based lazy loading so that import meridian_tools does not trigger heavy dependency imports.
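
The mechanism is the module-level __getattr__ hook from PEP 562. The sketch below shows the general pattern, not the package's actual export table: the names in _LAZY_EXPORTS are hypothetical, and the hook is attached to a types.ModuleType instance only to keep the example self-contained. In a real package the function would be defined as __getattr__ at the top level of __init__.py.

```python
import importlib
import types

# Hypothetical export table: public name -> (module to import, attribute).
_LAZY_EXPORTS = {"dumps": ("json", "dumps"), "sqrt": ("math", "sqrt")}

lazy_pkg = types.ModuleType("lazy_pkg")


def _lazy_getattr(name: str):
    try:
        module_name, attribute = _LAZY_EXPORTS[name]
    except KeyError:
        raise AttributeError(f"module 'lazy_pkg' has no attribute {name!r}") from None
    # The import happens here, on first access, not when lazy_pkg is created.
    value = getattr(importlib.import_module(module_name), attribute)
    setattr(lazy_pkg, name, value)  # cache so the hook is not called again
    return value


# Module attribute lookup falls back to __getattr__ found in the module dict.
lazy_pkg.__getattr__ = _lazy_getattr
```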

Pipeline execution model

The runner executes stages sequentially. Each stage:

  1. Creates a StageRecord and appends it to the in-memory manifest.
  2. Calls the stage function, which returns a dict[str, Path] of artefacts.
  3. Normalises artefact paths to be relative to the run directory.
  4. Writes the updated manifest to disk.

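This design means a crash mid-pipeline leaves a readable partial manifest on disk. Because the manifest is rewritten only after a stage finishes, the on-disk stages array ends at the last stage that finished before the crash. Check the status field of each entry, since recorded stages can be completed, skipped, or failed.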
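
The four steps can be sketched as a small loop. This is a simplified model under stated assumptions: run_stages_sketch is a hypothetical name, the manifest is a plain dictionary rather than a RunManifest, and failure handling (marking a stage as failed) is omitted so that an exception behaves like a hard crash.

```python
import json
from collections.abc import Callable
from pathlib import Path

StageFn = Callable[[Path], dict[str, Path]]


def run_stages_sketch(run_dir: Path, stages: list[tuple[str, StageFn]]) -> dict:
    manifest = {"status": "running", "stages": []}
    manifest_path = run_dir / "run_manifest.json"
    for name, stage_fn in stages:
        record = {"name": name, "status": "running", "artifacts": {}}
        manifest["stages"].append(record)  # 1. append the record in memory
        artifacts = stage_fn(run_dir)  # 2. run the stage
        record["artifacts"] = {  # 3. store paths relative to the run directory
            key: Path(value).relative_to(run_dir).as_posix()
            for key, value in artifacts.items()
        }
        record["status"] = "completed"
        # 4. persist the manifest after every stage
        manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    manifest["status"] = "completed"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```

If the second of two stages raises, the file on disk still lists the first stage as completed and the overall status stays "running".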

┌─────────────────────┐
│  00_run_metadata    │  Archive source + resolved configs
├─────────────────────┤
│  10_validation      │  Write validation spec (if applicable)
├─────────────────────┤
│  20_model_fit       │  Build data → build model → sample posterior
├─────────────────────┤
│  30_model_assessment│  Diagnostics + model selection + summary
├─────────────────────┤
│  40_decomposition   │  Summary metrics (NetCDF + CSV)
├─────────────────────┤
│  60_response_curves │  Response curves (if configured)
├─────────────────────┤
│  70_optimisation    │  Budget optimisation (if configured)
└─────────────────────┘

The numbering gap at 50 reserves space for future stages without renumbering.

Configuration architecture

The separation between authored YAML and runtime-only config is strict:

  • MeridianToolsConfig — Pydantic model for the YAML file. Owns project metadata, data paths, model spec, fit settings, validation strategy, and export switches.
  • PipelineRunConfig — Frozen dataclass for runtime options. Owns output directory, run name, and concrete validation spec.

The runner writes two config copies to each run directory:

  • config.source.yaml — Verbatim copy of the input YAML.
  • config.resolved.yaml — The config after relative path resolution. Never includes runtime-only fields.

Artefact path normalisation

All artefact paths in manifests are stored relative to the run directory through normalize_artifact_paths. This makes run directories portable across machines. The lifecycle layer resolves them back to absolute paths at load time.

Meridian coupling boundaries

Coupling level  Modules                        Surface used
Public API      runner.py, exports.py          Meridian, ModelSpec, CsvDataLoader, Analyzer, Summarizer, BudgetOptimizer
Semi-public     log_likelihood.py, exports.py  model_context, inference_data, input_data
Private         log_likelihood.py              _get_joint_dist_unpinned, _prepare_latents_for_reconstruction, _reconstruct_posteriors

The private-API coupling is confined to log_likelihood.py and wrapped in comprehensive error handling. See Meridian integration for details.

Data flow

  1. Input — A typed YAML file defines the entire run scope.
  2. Initialisation — The runner resolves the config and creates a timestamped run directory.
  3. Execution — The pipeline steps through stages, maintaining a central state dictionary with the fitted model and intermediate results.
  4. Export — Each stage writes specific artefacts to disk within the run directory.
  5. Finalisation — The manifest is completed with status: "completed" and finished_at, locking the run state.
  6. Lifecycle — Downstream processes or analysts consume artefacts or use lifecycle tools to compare, refresh, or audit runs.

Design decisions

This document records the key design decisions in meridian-tools and the reasoning behind them. It is intended for maintainers and contributors who need to understand why things are built the way they are.

No IID cross-validation

Decision: meridian-tools does not implement random-shuffle or naive k-fold cross-validation.

Reasoning: MMM data is time series. Random IID splits break temporal structure, leading to data leakage where future observations inform training and past observations appear in the test set. This produces optimistic accuracy estimates that do not reflect real-world forecasting performance.

The package provides two time-respecting alternatives:

  • Blocked tail — reserves the most recent observations as a single test block.
  • Rolling origin — expanding-window forward-chaining that respects temporal ordering at every split.
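
As an illustration of the blocked-tail idea, the sketch below splits a range of time indices so that every training index precedes every test index. blocked_tail_split_sketch and its parameters are hypothetical and do not reflect the signature of the package's own split builders.

```python
def blocked_tail_split_sketch(n_times: int, test_size: int) -> tuple[list[int], list[int]]:
    if not 0 < test_size < n_times:
        raise ValueError("test_size must be positive and smaller than n_times")
    cut = n_times - test_size
    # Training indices strictly precede every test index, so no future leaks in.
    return list(range(cut)), list(range(cut, n_times))
```

With n_times=8 and test_size=2, training covers indices 0 to 5 and the test block is indices 6 and 7.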

Non-overlapping rolling-origin test windows

Decision: step_size must equal test_size for rolling-origin splits.

Reasoning: Overlapping test windows would mean the same observation appears in multiple test sets. This violates the independence assumption needed for comparing validation scores across splits and complicates the interpretation of aggregate metrics. Non-overlapping windows ensure each observation is evaluated at most once across the split plan.

Minimum two splits for rolling origin

Decision: build_rolling_origin_splits requires at least two splits.

Reasoning: A single rolling-origin split is functionally identical to a blocked-tail holdout and provides no comparative signal. If your data only supports one split, use blocked_tail instead — it communicates the intent more clearly.

Holdout restriction for model selection

Decision: LOO and WAIC are only available for models where holdout_id is None.

Reasoning: LOO and WAIC estimate expected log predictive density (ELPD) using the full observed likelihood surface. A model fitted with a holdout mask has a modified likelihood that excludes held-out observations. Computing LOO on this truncated likelihood would produce ELPD estimates that are not comparable to those from full-sample fits.

The correct workflow is:

  1. Use validation splits for candidate evaluation.
  2. Select the best specification based on holdout performance.
  3. Refit the chosen specification on the full dataset.
  4. Compute LOO/WAIC on the full-sample fit for model quality reporting.

Separation of validation fits and final fits

Decision: Validation runs and final production fits are separate pipeline executions that produce separate run directories.

Reasoning: A validation fit is trained on a subset of the data. Its posterior reflects that subset and should not be used as the production artefact. Keeping them as separate runs prevents accidental use of a validation fit for downstream analysis or reporting.

Lazy imports for CLI responsiveness

Decision: Heavy dependencies (TensorFlow, NumPy, Meridian, ArviZ) are not imported at module level in the config, CLI, or validation layers.

Reasoning: TensorFlow alone takes several seconds to import. The CLI must respond instantly for --help and --list operations. The __init__.py uses __getattr__-based lazy loading, and the test suite verifies that build_parser() only loads pydantic and yaml.

Pydantic extra="forbid" everywhere

Decision: All configuration models reject unexpected keys.

Reasoning: Silent acceptance of unknown keys is a common source of misconfiguration in YAML-driven tools. A typo like export_pridictive_accuracy would be silently ignored without extra="forbid", leading to unexpected default behaviour. Strict rejection catches these errors at config load time with clear error messages.
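
The failure mode can be demonstrated without Pydantic. The sketch below is a standard-library analogue of strict key rejection, not the package's configuration models: ExportSwitchesSketch and its export_response_curves field are hypothetical, and the lenient loader exists only to show what silent acceptance looks like.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass(frozen=True)
class ExportSwitchesSketch:
    # Hypothetical switches; only the names matter for the example.
    export_predictive_accuracy: bool = False
    export_response_curves: bool = False


_ALLOWED = {f.name for f in fields(ExportSwitchesSketch)}


def load_lenient_sketch(payload: dict[str, Any]) -> ExportSwitchesSketch:
    # Unknown keys are dropped silently, so a typo falls back to the default.
    return ExportSwitchesSketch(**{k: v for k, v in payload.items() if k in _ALLOWED})


def load_strict_sketch(payload: dict[str, Any]) -> ExportSwitchesSketch:
    unknown = sorted(set(payload) - _ALLOWED)
    if unknown:
        raise ValueError(f"Unexpected config keys: {unknown}")
    return ExportSwitchesSketch(**payload)
```

Given the misspelled key from the paragraph above, the lenient loader returns the default (False) without complaint, while the strict loader raises at load time.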

Relative artefact paths in manifests

Decision: All artefact paths in run_manifest.json are stored relative to the run directory.

Reasoning: Absolute paths would tie run directories to a specific machine or filesystem layout. Relative paths make run directories portable — they can be copied, archived, or moved between machines without breaking the manifest contract.

Non-destructive lifecycle operations

Decision: refresh_run creates a new sibling directory rather than overwriting the source.

Reasoning: Overwriting a validated production run would destroy the audit trail. Creating a sibling preserves the original for comparison and rollback. The lifecycle layer explicitly validates that source directories are not mutated by refresh operations.

Manifest-per-stage persistence

Decision: The manifest is written to disk after each stage completes, not only at the end of the pipeline.

Reasoning: MCMC sampling can run for minutes to hours. If the process crashes or is killed during a later stage, the partial manifest on disk reflects what completed successfully. This aids debugging and allows partial runs to be inspected without special tooling.

Stage numbering with gaps

Decision: Pipeline stages use numbers 00, 10, 20, 30, 40, 60, 70 with a gap at 50.

Reasoning: The gaps allow future stages to be inserted at natural positions (e.g. a stage 50 for custom analysis) without renumbering existing stages. Renumbering would break backward compatibility with stored manifests and any downstream tooling that references stage names.

Config source vs. resolved archival

Decision: Both the verbatim source YAML and the resolved YAML are archived in every run directory.

Reasoning: The source YAML shows what the analyst authored (including relative paths). The resolved YAML shows the runtime interpretation (absolute paths, defaults applied). Both are needed for reproducibility:

  • The source is needed to understand intent.
  • The resolved config is needed to reproduce the exact execution.

Runtime-only fields (output_dir, run_name, validation_spec) are deliberately excluded from the resolved config because they are not part of the reproducible model specification.

Structured model selection errors

Decision: Model selection failures produce ModelSelectionError with a machine-readable reason_code rather than generic exceptions.

Reasoning: The pipeline needs to distinguish between “model selection is not possible for this run type” (expected) and “something is broken” (unexpected). Structured reason codes allow:

  • The runner to write model_selection_status.json without failing the run.
  • The lifecycle layer to compare model selection availability across runs.
  • Downstream consumers to programmatically handle different failure modes.

Meridian integration

This document describes how meridian-tools integrates with Google Meridian, the boundaries of that integration, and the risks associated with different coupling levels.

Integration philosophy

meridian-tools wraps Meridian without forking it. Meridian remains the modelling engine; meridian-tools adds workflow orchestration, validation, diagnostics bundling, model selection, and lifecycle management on top.

This approach means:

  • Meridian upgrades can be adopted without merging fork changes.
  • The upstream project’s API stability directly affects meridian-tools.
  • Any use of Meridian-internal APIs must be explicitly managed.

Coupling levels

Public API (low risk)

These are documented, versioned Meridian surfaces:

Surface Used by
Meridian (model class) runner.py
ModelSpec runner.py
CsvDataLoader, CoordToColumns runner.py
Analyzer exports.py, diagnostics.py
Summarizer exports.py
BudgetOptimizer exports.py
ModelReviewer diagnostics.py
MediaEffects, MediaSummary, ModelDiagnostics, ModelFit exports.py
save_meridian (schema serde) exports.py

These are unlikely to break without a Meridian major version bump. The exact google-meridian==1.5.3 pin keeps these assumptions aligned with the validated release baseline.

Semi-public API (medium risk)

These are accessible attributes on Meridian model objects that are used but not formally documented as stable:

Surface                           Used by                                Purpose
model.inference_data              log_likelihood.py, model_selection.py  Access ArviZ InferenceData
model.model_context               log_likelihood.py, exports.py          Access model structure
model.input_data                  exports.py                             Access input data for spend computation
model.posterior_sampler_callable  log_likelihood.py                      Access posterior sampler

These are stable in practice (they are used by Meridian’s own analysis surfaces) but are not guaranteed to be stable across versions.

Private API (high risk)

These are _-prefixed methods on Meridian’s posterior_sampler_callable, used exclusively in log_likelihood.py for log-likelihood reconstruction:

_get_joint_dist_unpinned
_prepare_latents_for_reconstruction
_reconstruct_posteriors

These methods are Meridian-internal and may change or be removed in any Meridian release, including patch versions. They are necessary because Meridian does not provide a public API for pointwise log-likelihood computation.

Risk mitigation

Compatibility guard

log_likelihood.py checks for the presence of all three private methods before attempting reconstruction:

required_sampler_methods = (
    "_get_joint_dist_unpinned",
    "_prepare_latents_for_reconstruction",
    "_reconstruct_posteriors",
)
if any(not hasattr(posterior_sampler, method) for method in required_sampler_methods):
    raise ModelSelectionError(
        "...",
        reason_code="meridian_internal_seam_incompatible",
    )

If any method is missing, the error is caught and recorded as a model_selection_status.json artefact with reason_code: meridian_internal_seam_incompatible. The rest of the pipeline continues normally.

Graceful degradation

Model selection incompatibility is non-fatal at every level:

  1. log_likelihood.py raises ModelSelectionError with a structured code.
  2. model_selection.py propagates the error.
  3. runner.py catches it, writes model_selection_status.json, and continues.
  4. The manifest records the assessment stage as completed.
  5. The lifecycle layer can inspect model_selection_status to understand why model selection was unavailable.
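
The catch-and-record step in the runner can be pictured as follows. This is a sketch under stated assumptions: ModelSelectionErrorSketch and run_assessment_sketch are hypothetical stand-ins, and the JSON keys written here are invented, since the schema of model_selection_status.json is not shown in this section.

```python
import json
from collections.abc import Callable
from pathlib import Path


class ModelSelectionErrorSketch(RuntimeError):
    """Hypothetical stand-in for ModelSelectionError with a machine-readable code."""

    def __init__(self, message: str, *, reason_code: str) -> None:
        super().__init__(message)
        self.reason_code = reason_code


def run_assessment_sketch(run_dir: Path, compute_model_selection: Callable[[], None]) -> str:
    try:
        compute_model_selection()
    except ModelSelectionErrorSketch as error:
        # Expected unavailability: record the reason and keep going.
        status = {"reason_code": error.reason_code, "message": str(error)}
        (run_dir / "model_selection_status.json").write_text(
            json.dumps(status, indent=2), encoding="utf-8"
        )
    # Any other exception type propagates and fails the stage as usual.
    return "completed"
```

Catching only the structured error type keeps the distinction between expected unavailability and genuine bugs.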

Version pinning

The pyproject.toml pins Meridian to google-meridian[schema]==1.5.3. Any Meridian upgrade must refresh the private log-likelihood reconstruction baseline before the version guard is relaxed.

Integration testing

The test suite includes a gated live Meridian verification command:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v

This command proves two different real seams:

  • one reduced real pipeline run over bundled demo data, including stored-run refresh after the original YAML is removed
  • the lower-level live log-likelihood reconstruction path

It is excluded from the default test suite because it requires real MCMC sampling, but it should be run after every Meridian version upgrade.

Constants dependency

log_likelihood.py uses Meridian constants for posterior parameter names:

from meridian import constants
# constants.BETA_GM, constants.TAU_G, constants.ETA_M, etc.

These are stable string constants but are not versioned. A Meridian release that renames these constants would cause import-time failures.

Unsaved posterior parameter recovery

Meridian does not persist all posterior parameters to InferenceData. The _recover_unsaved_state function in log_likelihood.py reconstructs:

  • tau_g_excl_baseline — Recovered from the posterior’s tau_g variable by slicing out the baseline geo index (concatenating the elements before and after baseline_geo_idx).
  • Geo deviations — Recovered from the posterior by solving deviation = (target - base) / scale for normal effects, or deviation = (log(target) - base) / scale for log-normal effects, with a scale == 0 guard that maps to zero.

This recovery is mathematically correct for the supported model families (log-normal and normal media effects). It is tested against both geo-panel and national models in test_log_likelihood.py.
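
For intuition, the two recovery rules can be written out for scalars and plain lists. The real code operates on posterior tensors, so the function names and scalar signatures below are illustrative assumptions only.

```python
import math


def recover_deviation_sketch(
    target: float, base: float, scale: float, *, log_normal: bool
) -> float:
    if scale == 0:
        return 0.0  # guard: a zero scale maps to a zero deviation
    value = math.log(target) if log_normal else target
    return (value - base) / scale


def drop_baseline_sketch(tau_g: list[float], baseline_geo_idx: int) -> list[float]:
    # Concatenate the elements before and after the baseline geo.
    return tau_g[:baseline_geo_idx] + tau_g[baseline_geo_idx + 1:]
```

For a normal effect with target 1.5, base 1.0, and scale 0.25, the recovered deviation is (1.5 - 1.0) / 0.25 = 2.0.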

What breaks on a Meridian upgrade

Change type                    Impact                               Detection
Public API signature change    runner.py, exports.py break          Default test suite
Semi-public attribute rename   log_likelihood.py, exports.py break  Default test suite
Private method removal/rename  Model selection disabled             Live smoke test or model_selection_status.json
Constant rename                Import-time failure                  Default test suite
New posterior parameter        Log-likelihood may be incorrect      Manual review + live smoke test
Changed likelihood formula     Log-likelihood may be incorrect      Live smoke test
  1. Pin the new Meridian version in a branch.
  2. Run the full default test suite: pytest tests/ -v.
  3. Run the live Meridian verification command: MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v.
  4. If model selection breaks, check model_selection_status.json for the reason code.
  5. If private methods changed, update log_likelihood.py to match the new Meridian internals or accept graceful degradation.
  6. Update docs/project/release-baseline.md with the new verified state.

Project

Contributor-facing project documentation, release baselines, and changelog material.

Pages

  • Contributing — This guide covers the development setup, conventions, and workflow for contributing to meridian-tools.
  • Acceptance checklist — Use this page as the canonical local acceptance checklist for the current repository state. Run the commands in this order. The acceptance gate is local and command-driven. It does not depend on CI, GitHub Actions, or unpublished helper scripts.
  • Release baseline — This page records the current milestone release baseline for the repository. Treat it as a validated project state, not as an automated release system. The baseline uses the same local command sequence as the acceptance checklist and records the observed warning profile, the direct runtime dependency bounds, and the accepted trade-offs that still shape the package.
  • Changelog — All notable changes to meridian-tools are documented in this file.

Subsections of Project

Contributing

This guide covers the development setup, conventions, and workflow for contributing to meridian-tools.

Development setup

Clone and install

git clone <repo-url> meridian-tools
cd meridian-tools
pip install -e ".[dev]"

The [dev] extra installs pytest, ruff, and mypy.

Verify the install

meridian-tools --help
python -m compileall src tests
ruff check src tests
mypy src
pytest tests/ -v

Acceptance gate

Before submitting any change, run the full acceptance sequence from the repository root:

python -m compileall src tests
ruff check src tests
ruff format --check src tests
mypy src
python -m pip install -e . --no-deps
meridian-tools --help
pytest tests/ -v

See acceptance.md for the expected results and how to interpret failures.

Code style

Formatting and linting

The project uses Ruff for both linting and formatting:

# Check
ruff check src tests
ruff format --check src tests

# Auto-fix
ruff check --fix src tests
ruff format src tests

Configuration is in pyproject.toml:

[tool.ruff]
line-length = 120
target-version = "py311"

[tool.ruff.lint]
select = ["E", "F", "I", "UP", "B", "C90", "SIM", "RUF"]

Type annotations

All public functions and classes use type annotations. The codebase uses from __future__ import annotations for forward-reference support.

Import conventions

  • Standard library imports first, then third-party, then local.
  • Heavy dependencies (Meridian, TensorFlow, ArviZ) are imported lazily inside functions, not at module level, in the config/CLI/validation layers.
  • Ruff rule I enforces import sorting.
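
The lazy-import convention can be sketched as follows. The standard-library statistics module stands in for a heavy dependency such as Meridian, TensorFlow, or ArviZ, and both function names are hypothetical.

```python
from __future__ import annotations

import sys


def summarise(values: list[float]) -> float:
    """Compute a mean, importing the stand-in heavy dependency on first use."""
    # Deferred import: importing this module stays cheap until the function runs.
    import statistics

    return statistics.mean(values)


def is_loaded(module_name: str) -> bool:
    """Report whether a module has already been imported in this process."""
    return module_name in sys.modules


print(summarise([1.0, 2.0, 3.0]))
print(is_loaded("statistics"))
```

Because the import sits inside the function body, a command such as meridian-tools --help never pays the cost of loading the heavy dependency.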

Configuration models

All Pydantic models use ConfigDict(extra="forbid"). New config fields must be added with appropriate types, defaults, and validators.

Testing

Running tests

# Full suite
pytest tests/ -v

# Specific file
pytest tests/test_runner.py -v

# Specific test
pytest tests/test_runner.py::test_run_pipeline_writes_manifest -v

Test conventions

  • Tests use pytest with tmp_path for temporary directories.
  • monkeypatch is used extensively to mock Meridian internals and isolate unit tests from real MCMC sampling.
  • Module-scoped fixtures (scope="module") are used for expensive model construction in test_log_likelihood.py and test_model_selection.py.
  • Shared test infrastructure is defined inline in individual test modules. There is no top-level conftest.py.

Live Meridian verification

One opt-in command exercises the bounded real Meridian seam:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v

This is not part of the default suite. It proves one reduced real pipeline run over bundled demo data, one stored-run refresh after the original YAML is removed, and the lower-level live log-likelihood seam. Run it after Meridian version upgrades and before release-candidate handoff when you want extra confidence beyond the fast suite.
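
An opt-in gate of this kind can be sketched as a small predicate over the environment. How the repository's tests actually read the variable is not shown on this page, so treat the helper name and the rule that only the value "1" enables the route as assumptions.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

REAL_FIT_ENV_VAR = "MERIDIAN_TOOLS_ENABLE_REAL_FIT"


def real_fit_enabled(env: Mapping[str, str] | None = None) -> bool:
    """Return True only when the opt-in variable is set to "1" (assumed convention)."""
    source = os.environ if env is None else env
    return source.get(REAL_FIT_ENV_VAR) == "1"


print(real_fit_enabled({REAL_FIT_ENV_VAR: "1"}))
print(real_fit_enabled({}))
```

Passing the environment as an argument keeps the predicate testable without mutating os.environ.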

Writing new tests

  • Place tests in the appropriate tests/test_<module>.py file.
  • Use monkeypatch to avoid real MCMC sampling in unit tests.
  • Test both success paths and error conditions.
  • Verify artefact file contents, not just their existence.
  • Use tmp_path for all filesystem operations.
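
A test that follows these conventions might look like the sketch below. The engine object and write_summary function are hypothetical stand-ins for a module that would otherwise trigger real MCMC sampling; only tmp_path and monkeypatch are real pytest fixtures.

```python
import json
from pathlib import Path
from types import SimpleNamespace

# Hypothetical stand-in for a module whose fit step would run real MCMC sampling.
engine = SimpleNamespace(fit=lambda: {"status": "sampled"})


def write_summary(out_dir: Path) -> Path:
    """Run the fit step and persist its result as a JSON artefact."""
    path = out_dir / "summary.json"
    path.write_text(json.dumps(engine.fit()), encoding="utf-8")
    return path


def test_write_summary_records_fit_result(tmp_path, monkeypatch):
    # Replace the expensive fit with a cheap stub for the duration of the test.
    monkeypatch.setattr(engine, "fit", lambda: {"status": "stubbed"})

    path = write_summary(tmp_path)

    # Check the artefact contents, not just that the file exists.
    assert json.loads(path.read_text(encoding="utf-8")) == {"status": "stubbed"}
```

Under pytest, monkeypatch restores the original engine.fit when the test ends, so the stub cannot leak into other tests.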

Project structure

meridian-tools/
├── src/meridian_tools/       # Package source
│   ├── __init__.py           # Lazy-loading exports
│   ├── artifacts.py          # Manifest helpers
│   ├── cli.py                # CLI entry point
│   ├── config.py             # Pydantic models
│   ├── cv.py                 # Validation splits
│   ├── demo.py               # Demo discovery
│   ├── diagnostics.py        # Diagnostics export
│   ├── exports.py            # Meridian export wrappers
│   ├── launcher.py           # Run execution wrapper
│   ├── lifecycle.py          # Post-run management
│   ├── log_likelihood.py     # Log-likelihood adapter
│   ├── model_selection.py    # LOO/WAIC wrappers
│   ├── terminal.py           # CLI presentation
│   └── version.py            # Static version
├── tests/                    # Test suite
│   └── _demo_data/           # Bundled demo data (packaged)
├── docs/                     # Documentation
├── runme.py                  # Source-tree demo launcher
└── pyproject.toml            # Build and dependency config

Versioning

The version is defined in src/meridian_tools/version.py:

__version__ = "0.3.0"

Version bumps are manual edits. Update this file when preparing a release.

Documentation

Documentation lives in docs/. When adding new features:

  1. Update relevant guide or reference pages.
  2. Add API documentation for new public functions or classes.
  3. Update the YAML schema reference if config fields changed.
  4. Update the output schema if new artefacts are produced.

Common pitfalls

  • Do not import Meridian at module level in config, CLI, or validation modules. This breaks CLI responsiveness.
  • Do not add extra="allow" to Pydantic models. The extra="forbid" policy prevents silent misconfiguration.
  • Do not modify source run directories in lifecycle operations. Always create new sibling directories.
  • Do not weaken or delete existing tests without explicit direction.
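
The sibling-directory rule can be sketched with pathlib. The helper name and the "<name>-<suffix>" naming scheme are assumptions made for illustration; lifecycle.py may name its output directories differently.

```python
from pathlib import Path


def create_sibling_run_dir(source_run_dir: Path, suffix: str) -> Path:
    """Create and return a new directory next to the source run directory."""
    sibling = source_run_dir.with_name(f"{source_run_dir.name}-{suffix}")
    # exist_ok=False makes accidental reuse of an existing directory an error.
    sibling.mkdir(parents=False, exist_ok=False)
    return sibling
```

For a source directory runs/demo and the suffix "refresh", the helper creates runs/demo-refresh and leaves runs/demo untouched.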

Acceptance checklist

Use this page as the canonical local acceptance checklist for the current repository state. Run the commands in this order. The acceptance gate is local and command-driven. It does not depend on CI, GitHub Actions, or unpublished helper scripts.

Acceptance gate

Run the following commands from the repository root:

python -m compileall src tests
ruff check src tests
ruff format --check src tests
mypy src
python -m pip install -e . --no-deps
meridian-tools --help
pytest tests/ -v

The canonical acceptance-gate result for the last command is:

244 passed, 2 skipped

That result is the pass or fail line for the default local acceptance gate. The recorded warning profile belongs to the release baseline, not to the acceptance-gate definition itself.
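
Comparing a pytest summary line against the recorded counts can be sketched with a small parser. The helper names are hypothetical, and the repository does not ship such a script; the expected counts are the ones recorded above.

```python
import re

EXPECTED = {"passed": 244, "skipped": 2}  # counts recorded on this page


def parse_summary(line: str) -> dict[str, int]:
    """Extract '<count> <label>' pairs from a pytest summary line."""
    return {label: int(count) for count, label in re.findall(r"(\d+) (\w+)", line)}


def gate_met(line: str, expected: dict[str, int] = EXPECTED) -> bool:
    """True when pass/skip counts match and nothing failed or errored."""
    counts = parse_summary(line)
    if counts.get("failed", 0) or counts.get("error", 0) or counts.get("errors", 0):
        return False
    return all(counts.get(label, 0) == n for label, n in expected.items())


print(gate_met("244 passed, 2 skipped, 60 warnings"))
```

Warnings are parsed but ignored by gate_met, which matches the statement that the warning profile belongs to the release baseline rather than to the gate.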

What each command proves

python -m compileall src tests proves that the checked-in Python files byte-compile cleanly. compileall compiles each file without importing it, so a failure here means a syntax error. Stop and fix it before running anything else.

ruff check src tests proves that the repository still satisfies the pinned lint rules. If this step fails, fix the reported lint violations before moving on.

ruff format --check src tests proves that the checked-in files still match the agreed formatting contract. If this step fails, run the formatter and then rerun the verification sequence.

mypy src proves that the configured static typing baseline still runs cleanly. If this step fails, either fix the reported type issue or update the documented ratchet intentionally.

python -m pip install -e . --no-deps proves that the package still builds and installs in editable mode from the local source tree. If this fails, treat it as a packaging or build-metadata break rather than a test-only problem.

meridian-tools --help proves that the published CLI entrypoint still resolves and that the lightweight command surface still imports cleanly. If this step fails, check the package entrypoint and import boundary before continuing.

pytest tests/ -v proves the behavioural contract of the repository. This is the broadest local validation step. If it fails, use the failing test names to identify which package contract regressed.

How to interpret failure

If the compile step fails, fix syntax or parse problems first. The later steps will not give you useful signal until that is resolved.

If lint, format, or type checks fail, treat that as a source-tree quality issue, not as an optional clean-up item. Bring the tree back to the pinned Ruff and mypy state before trusting the rest of the loop.

If editable install fails, treat the repository as not ready for contributor handoff. The package must install cleanly before the test result matters.

If CLI help fails, assume the published command surface is broken even if the Python modules still import manually.

If pytest tests/ -v fails, the acceptance gate is not met. A partial pass is not enough. Fix the failing behavioural contract and rerun the full command sequence.
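
The "run in order, stop at the first failure" rule can be sketched as a small driver. This is an illustration under stated assumptions, not a script the repository ships: first_failure and its injectable run callable are hypothetical names, and the command list is copied from the acceptance gate above.

```python
from __future__ import annotations

import subprocess
from collections.abc import Callable, Sequence

ACCEPTANCE_COMMANDS: tuple[tuple[str, ...], ...] = (
    ("python", "-m", "compileall", "src", "tests"),
    ("ruff", "check", "src", "tests"),
    ("ruff", "format", "--check", "src", "tests"),
    ("mypy", "src"),
    ("python", "-m", "pip", "install", "-e", ".", "--no-deps"),
    ("meridian-tools", "--help"),
    ("pytest", "tests/", "-v"),
)


def _run(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(command), check=False).returncode


def first_failure(
    commands: Sequence[Sequence[str]] = ACCEPTANCE_COMMANDS,
    run: Callable[[Sequence[str]], int] = _run,
) -> Sequence[str] | None:
    """Run commands in order; return the first that fails, or None if all pass."""
    for command in commands:
        if run(command) != 0:
            return command  # stop: fix this step, then rerun the full sequence
    return None
```

Calling first_failure() from the repository root runs the real commands. Passing a fake run callable makes the ordering logic testable without invoking any tools.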

Optional extra confidence

The repository also carries one opt-in live Meridian verification command for extra technical confidence:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v

This command is not part of the default blocking acceptance gate. It exists to provide one bounded live Meridian route that proves:

  • real pipeline execution over bundled demo data
  • manifest-backed stored-run refresh after the original YAML is removed
  • the lower-level live log-likelihood reconstruction seam

On the reference development environment, the recorded run finished in 185.42 seconds (0:03:05). Budget up to roughly six minutes for this extra-confidence command.

Release baseline

This page records the current milestone release baseline for the repository. Treat it as a validated project state, not as an automated release system. The baseline uses the same local command sequence as the acceptance checklist and records the observed warning profile, the direct runtime dependency bounds, and the accepted trade-offs that still shape the package.

Release-ready definition in this repository

The repository is release-ready only when all of the following hold:

  • the documented local acceptance command set passes
  • pytest tests/ -v returns the recorded pass/skip count below
  • the same validated run is recorded with the observed warning count
  • the warning categories match the accepted ones below
  • the accepted trade-offs remain explicit rather than hidden

Validated baseline record

The current verified local baseline is:

python -m compileall src tests
ruff check src tests
ruff format --check src tests
mypy src
python -m pip install -e . --no-deps
meridian-tools --help
pytest tests/ -v
-> 244 passed, 2 skipped, 60 warnings

The optional extra-confidence live path remains separate:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v
-> 2 passed, 47 warnings in 185.42s (0:03:05)

That command remains an opt-in local confidence check, not part of the default developer loop or a silent CI policy. On the reference development environment, the recorded run finished in 185.42 seconds (0:03:05). Budget up to roughly six minutes for ordinary local execution.

Runtime dependency boundary

The current runtime boundary recorded from pyproject.toml is:

  • requires-python >=3.11
  • google-meridian==1.5.3
  • arviz>=0.18.0,<0.20.0
  • pandas>=2.2.0,<3
  • pydantic>=2.8.0,<3
  • PyYAML>=6.0.0,<7

These are the direct runtime dependency bounds for the milestone baseline. This page does not imply broader environment reproducibility than the repository currently implements.
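
To make the bounds concrete, the sketch below checks a version string against a comma-separated spec such as those listed above. It is a simplified illustration that handles only plain numeric versions and the >=, ==, and < operators; the helper names are hypothetical. Real tooling would normally rely on packaging.specifiers.SpecifierSet instead.

```python
def _parse(version: str, width: int = 3) -> tuple[int, ...]:
    """Turn '2.2' into (2, 2, 0); only plain numeric versions are handled."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def within_bounds(installed: str, spec: str) -> bool:
    """Check an installed version against a spec such as '>=2.2.0,<3'."""
    current = _parse(installed)
    for clause in spec.split(","):
        clause = clause.strip()
        for op in (">=", "==", "<"):
            if clause.startswith(op):
                bound = _parse(clause[len(op):])
                ok = {">=": current >= bound, "==": current == bound, "<": current < bound}[op]
                if not ok:
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(within_bounds("0.19.1", ">=0.18.0,<0.20.0"))
print(within_bounds("3.0.0", ">=2.2.0,<3"))
```

Under these rules, arviz 0.19.1 satisfies >=0.18.0,<0.20.0, while pandas 3.0.0 falls outside >=2.2.0,<3.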

Accepted warning profile

The recorded 60 warnings are accepted in the current milestone baseline. They fall into two pinned categories:

  • Meridian model / prior warnings
  • ArviZ model-selection warnings

This baseline does not pretend the repository is warning-free. It records the current observed warning profile honestly and treats those warning categories as accepted for the present milestone.

Accepted trade-offs

The current release baseline also depends on several explicit trade-offs.

The package takes a no-fork Meridian approach. We keep Meridian as the modelling engine and add workflow and compatibility tooling around it rather than modifying Meridian source.

Bayesian model selection remains intentionally limited to fitted Meridian models where holdout_id is None. Validation-fit and authored-holdout runs are not treated as compatible LOO or WAIC candidates.

Lifecycle tooling remains Python-first. The repository does not currently ship a broader lifecycle CLI.

Version bumping remains a manual edit rather than a fully automated release pipeline.

Boundary of this record

This page records one validated milestone state. It does not introduce CI as the source of truth. It does not define publish automation. It does not promise zero warnings. It does not claim a broader release process than the repository actually supports today.

Changelog

All notable changes to meridian-tools are documented in this file.

The format is based on Keep a Changelog.

[Unreleased]

[0.3.0] — 2026-04-24

Changed

  • CLI single source of truth — runme.py now delegates directly to meridian_tools.cli, removing duplicate root-level argument parsing.
  • Typed runner state — Pipeline orchestration now uses PipelineContext for shared stage state.
  • Shared posterior sampling — Runner posterior sampling keyword mapping is centralised in one helper.
  • Lifecycle comparison schema — Run comparison rows are generated from declarative comparison field descriptors.
  • Meridian compatibility pin — The package pins google-meridian[schema]==1.5.3, and log-likelihood reconstruction refuses unvalidated Meridian versions.
  • Static analysis tooling — Development extras now include mypy, and Ruff enables additional complexity, simplification, and Ruff-specific rule families.

Fixed

  • Optimized Python safety — Validation helpers now use explicit exceptions instead of assert for runtime invariants.
  • Shared confidence validation — Response curve and optimisation configs share one confidence_level validator.
  • Export coercion documentation — NetCDF attribute coercion now documents its input-to-output type mapping.

[0.2.0] — 2026-04-07

Added

  • Docs site build — Hugo-based website documentation under docs-site/, generated from the repository Markdown set by docs-site/build_content.py.
  • Manifest v3 provenance — Explicit input_data_provenance capture for stored runs and lifecycle refresh or compare workflows.
  • Typed failure boundaries — ConfigPreflightError, ValidationExecutionContractError, and PipelineRunFailure distinguish wrapper-owned preflight, validation contract misuse, and post-directory runtime failures.
  • Bounded live verification — An opt-in Meridian real-fit smoke route gated behind MERIDIAN_TOOLS_ENABLE_REAL_FIT=1.
  • Module-path CLI contract — Explicit support and regression coverage for python -m meridian_tools.cli ....

Changed

  • Shared launch flow — meridian-tools and the repo-root runme.py launcher now share one launch flow for config loading, preflight checks, progress reporting, and terminal success or failure output.
  • Packaged demo assets — Bundled demo configs and datasets are resolved from packaged _demo_data, so demo runs work from installed wheels as well as source checkouts.
  • Default demo fit mode — Bundled demos now default to full-sample fits (validation.strategy: none), so loo_summary.json and waic_summary.json are generated by default and 10_validation is recorded as skipped.
  • Refresh contract — Stored-run refresh now reloads from the saved resolved config while preserving the original source config copy in run metadata.
  • Lifecycle compare semantics — Compare now distinguishes legacy runs without dataset provenance from real dataset changes.
  • Documentation layout — Public documentation is reorganised under docs/ into getting-started, guides, reference, concepts, and project sections.

Fixed

  • Structured public entrypoint failures — Missing or invalid config paths in public entrypoints now produce structured failure output instead of raw Python tracebacks unless --traceback is used.
  • Relative-path refresh — Refreshing a stored run with relative data.path input no longer depends on the original source config location remaining present on disk.
  • Partial-run failure reporting — Failed runs that already created an output directory now report the concrete run directory, manifest path, and failing stage through the CLI and runme.py.
  • Docs-site theme resolution — Hugo builds resolve the Relearn theme through a pinned module dependency instead of requiring a local theme checkout.

[0.1.0] — 2026-04-02

Added

  • Typed YAML configuration — Pydantic-validated config with extra="forbid" strictness for all sections: project, data, model_spec, fit, validation, exports, response_curves, optimisation.
  • Staged pipeline runner — Sequential execution through 00_run_metadata, 10_validation, 20_model_fit, 30_model_assessment, 40_decomposition, 60_response_curves, 70_optimisation with manifest persistence after each stage.
  • Validation orchestration — blocked_tail and rolling_origin time-series validation strategies with auto-generated holdout masks. Authored holdout passthrough through model_spec.kwargs.holdout_id.
  • Diagnostics bundling — diagnostics_bundle.json manifest with optional predictive_accuracy.csv and review_summary.json exports.
  • Bayesian model selection — Compatibility-aware LOO and WAIC computation through ArviZ, with automatic log-likelihood reconstruction for fitted Meridian models. Graceful degradation for incompatible runs through structured ModelSelectionError with reason codes.
  • Response curves export — Configurable spend multiplier grid with NetCDF and CSV outputs.
  • Optimisation export — Fixed-budget and relative-budget optimisation with full artefact set including allocation charts.
  • Plot exports — PNG plot artefacts through Altair/vl-convert for model fit, diagnostics, decomposition, response curves, and optimisation stages.
  • Lifecycle management — load_run_record, list_run_records, build_refresh_run_config, compare_run_records for post-run analysis and reproducible refresh workflows.
  • CLI — meridian-tools run and meridian-tools demo subcommands with lightweight imports for fast startup.
  • Bundled demos — timeseries and geo_panel reference workflows with packaged data and configs.
  • Manifest versioning — Support for manifest versions 0, 1, and 2 with backward-compatible deserialisation.
  • Comprehensive test suite — 218 tests across 15 test files covering configuration, validation, pipeline execution, exports, diagnostics, model selection, lifecycle, and demos.