Companion tooling for Google Meridian MMM workflows. This documentation covers
installation, configuration, validation strategies, model selection, lifecycle
management, and the full API reference.
Getting started
Installation — prerequisites, install,
and verification
From the source tree (recommended for development)
cd /path/to/meridian-tools
pip install -e ".[dev]"
The [dev] extra installs pytest, ruff, and mypy for running the test suite and
linter.
Editable install without dev extras
pip install -e .
Verify the install
meridian-tools --help
You should see the CLI help output listing the run and demo subcommands.
This command is deliberately lightweight — it does not import TensorFlow, NumPy,
or Meridian.
This guide takes you from a fresh install to your first completed run in under
five minutes using the bundled demo data.
1. Run a bundled demo
List the available demos:
meridian-tools demo --list
Output:
timeseries
geo_panel
Run the timeseries demo:
meridian-tools demo timeseries
When run from the source checkout, this creates a dated run directory under
runs/demos/. When run from an installed package, the default output root is
./runs/demos/ relative to your current working directory. Each demo produces
a full staged output layout.
2. Inspect the run directory
After the demo completes, find the created run directory:
ls runs/demos/
You will see a directory like demo-timeseries_20260402_073500/.
The name comes from the demo’s project.name (demo-timeseries) plus a
timestamp. The bundled demos now default to full-sample fits, so LOO and WAIC
outputs are available in the assessment stage by default. Inside:
Read the validation guide to choose the right
validation strategy.
Read the workflow guide for the full end-to-end
agency workflow.
Read the demo guide for more detail on the bundled
reference workflows.
Guides
Task-oriented workflow documentation for configuration, validation, demos, lifecycle, and troubleshooting.
Pages
Configuration guide — meridian-tools is driven by one YAML configuration file. This guide explains every section, its purpose, and its constraints. For a field-level schema reference, see yaml-schema.md.
Validation guide — This guide explains how to choose and configure validation strategies in meridian-tools. Validation is the process of evaluating a candidate model specification on held-out data before committing to a final production fit.
Model selection guide — This guide explains how meridian-tools supports Bayesian model selection using Leave-One-Out (LOO) cross-validation and the Watanabe-Akaike Information Criterion (WAIC). It covers when model selection is available, how to interpret the outputs, and how to compare multiple candidate models.
Lifecycle management guide — meridian-tools treats completed runs as immutable artefacts. The lifecycle module provides tools to load, compare, and refresh past runs without mutating them. This guide explains each lifecycle operation and when to use it.
Meridian Tools workflow guide — This guide shows the supported end-to-end agency workflow for meridian-tools. It starts with one YAML config, moves through candidate validation, separates the final full-sample fit from the validation runs, and ends with the artefacts you should hand over or inspect later. The examples in this guide stay inside the implemented package surface. They do not assume notebooks, dashboards, or unpublished helper scripts.
Meridian Tools demo guide — This is the canonical guide to the bundled meridian-tools demos. Use it when you want one safe, reproducible, end-to-end example without client data.
Troubleshooting — Common issues and solutions when working with meridian-tools.
Configuration guide
meridian-tools is driven by one YAML configuration file. This guide explains
every section, its purpose, and its constraints. For a field-level schema
reference, see yaml-schema.md.
Configuration philosophy
The YAML file owns the authored project definition: project metadata, data
paths, model specification, fit settings, validation strategy, and export
switches. Runtime-only values — output_dir, run_name, and concrete
validation_spec — belong in PipelineRunConfig or CLI flags, not in the YAML
file. This separation ensures that the same YAML file can drive multiple runs
with different runtime options while remaining reproducible.
This is the smallest config that will pass validation. It uses defaults for
everything else: no validation, all exports enabled, no response curves, no
optimisation.
name — Human-readable project name. Used as the base for run directory
names unless overridden by --run-name at runtime.
data
CSV data loader configuration. Maps directly to Meridian’s CsvDataLoader.
data:
  path: ./client_dataset.csv
  kpi_type: revenue  # "revenue" (default) or "non-revenue"
  coord_to_columns:
    time: week
    geo: market  # optional for national models
    kpi: revenue
    population: population
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]
    controls: [promo_flag, price_index]
  media_to_channel: null  # optional channel mapping overrides
  media_spend_to_channel: null
  reach_to_channel: null
  frequency_to_channel: null
  rf_spend_to_channel: null
  organic_reach_to_channel: null
  organic_frequency_to_channel: null
path — Path to the CSV data file. Relative paths are resolved against
the directory containing the YAML config file, not the current working
directory.
kpi_type — Either "revenue" or "non-revenue". Controls how Meridian
interprets the KPI column.
coord_to_columns — Maps Meridian coordinate names to CSV column names.
time is required. geo is optional (omit for national models).
model_spec
Raw keyword arguments forwarded to Meridian’s ModelSpec.
model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi
kwargs — Dictionary passed through to ModelSpec(**kwargs). Supports
any argument that Meridian’s ModelSpec accepts.
Special handling for holdout_id: if present in kwargs, the run is treated
as an “authored holdout” validation run. See the
validation guide for details.
fit
Sampling configuration for Meridian posterior fitting.
fit:
  sample_prior_draws: null  # Optional prior-only sampling
  n_chains: 4  # Number of MCMC chains
  n_adapt: 500  # Adaptation steps per chain
  n_burnin: 500  # Burn-in steps per chain
  n_keep: 1000  # Posterior samples to keep per chain
  seed: 20260331  # Reproducibility seed (int, list[int], or null)
  max_tree_depth: 10  # NUTS max tree depth
  max_energy_diff: 500.0  # NUTS max energy difference
  unrolled_leapfrog_steps: 1  # NUTS leapfrog steps
  parallel_iterations: 10  # TF parallel iterations
All fields have sensible defaults. Override only what you need.
seed — Accepts a single integer, a list of integers (one per chain), or
null for non-deterministic sampling.
sample_prior_draws — If set, prior predictive samples are drawn before
posterior sampling. This is optional and primarily for model diagnostics.
validation
Validation and holdout orchestration settings. See the
validation guide for strategy selection advice.
# Option 1: No validation (default)
validation:
  strategy: none

# Option 2: Blocked tail
validation:
  strategy: blocked_tail
  holdout_size: 8

# Option 3: Rolling origin
validation:
  strategy: rolling_origin
  initial_train_size: 52
  test_size: 4
  step_size: 4  # Must equal test_size
  max_splits: 3  # At least 2
strategy — One of "none", "blocked_tail", or "rolling_origin".
holdout_size — Required for blocked_tail. Number of time periods to
hold out from the end of the series.
initial_train_size, test_size — Required for rolling_origin.
step_size — Optional for rolling_origin. Must equal test_size if
set. Defaults to test_size.
max_splits — Optional for rolling_origin. Must be at least 2.
Validation rules:
blocked_tail rejects rolling-origin parameters.
rolling_origin rejects holdout_size.
none rejects all holdout and rolling-origin parameters.
Legacy holdout_size without explicit strategy is rejected.
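The rules above can be sketched as a plain Python check. This is only an illustration of the documented constraints, not the package's Pydantic models; the function name `check_validation_section` and the error messages are hypothetical.

```python
from typing import Any, Mapping

ROLLING_KEYS = {"initial_train_size", "test_size", "step_size", "max_splits"}


def check_validation_section(section: Mapping[str, Any]) -> None:
    """Raise ValueError when a validation section breaks a documented rule."""
    # Keys set to null are treated as not authored (an assumption of this sketch).
    authored = {key for key, value in section.items() if value is not None}
    rolling = authored & ROLLING_KEYS
    strategy = section.get("strategy")

    if strategy is None:
        if "holdout_size" in authored:
            raise ValueError("legacy holdout_size without an explicit strategy")
        strategy = "none"  # documented default

    if strategy == "none":
        if "holdout_size" in authored or rolling:
            raise ValueError("none rejects holdout and rolling-origin parameters")
    elif strategy == "blocked_tail":
        if "holdout_size" not in authored:
            raise ValueError("blocked_tail requires holdout_size")
        if rolling:
            raise ValueError("blocked_tail rejects rolling-origin parameters")
    elif strategy == "rolling_origin":
        if "holdout_size" in authored:
            raise ValueError("rolling_origin rejects holdout_size")
        if not {"initial_train_size", "test_size"} <= authored:
            raise ValueError("rolling_origin requires initial_train_size and test_size")
        step_size = section.get("step_size")
        if step_size is not None and step_size != section["test_size"]:
            raise ValueError("step_size must equal test_size")
        max_splits = section.get("max_splits")
        if max_splits is not None and max_splits < 2:
            raise ValueError("max_splits must be at least 2")
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
```

In the real package these checks run during config loading, so an invalid section fails before any run directory is created.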
exports
Output switches for diagnostics and model-selection artefacts.
exports:
  use_kpi: false  # Use KPI-based metrics
  batch_size: 1000  # Batch size for Meridian analysis
  export_predictive_accuracy: true  # Write predictive_accuracy.csv
  export_review_summary: true  # Write review_summary.json
  export_model_selection: true  # Write LOO/WAIC outputs
  export_plots: true  # Write PNG plot artefacts
All fields have defaults. If the entire exports section is omitted,
all exports are enabled with default settings.
response_curves
Optional. If omitted, the response curves stage is skipped.
spend_multipliers — Required. Non-empty list of non-negative floats.
confidence_level — Must be strictly between 0 and 1.
optimisation
Optional. If omitted, the optimisation stage is skipped.
optimisation:
  start_date: "2025-01-01"
  end_date: "2025-12-31"
  budget:
    mode: fixed_total  # or "relative_reference_window_total"
    value: 1000000.0
  use_posterior: true
  use_optimal_frequency: true
  confidence_level: 0.9
start_date, end_date — ISO format YYYY-MM-DD. end_date must
be on or after start_date.
budget.mode — Either "fixed_total" (absolute budget) or
"relative_reference_window_total" (multiplier against the reference window’s
total spend).
budget.value — Positive float. For fixed_total, this is the absolute
budget. For relative_reference_window_total, this is a multiplier (e.g.
1.1 means 110% of the reference window total).
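The arithmetic behind the two budget modes can be sketched as follows. The helper `resolve_total_budget` and its `reference_total` argument are hypothetical names used for illustration; how the package obtains the reference window's total spend is not shown here.

```python
def resolve_total_budget(mode: str, value: float, reference_total: float) -> float:
    """Return the absolute budget implied by budget.mode and budget.value."""
    if value <= 0:
        raise ValueError("budget.value must be a positive float")
    if mode == "fixed_total":
        # The authored value is already the absolute budget.
        return float(value)
    if mode == "relative_reference_window_total":
        # The authored value is a multiplier on the reference window's total spend.
        return float(value) * reference_total
    raise ValueError(f"unknown budget.mode: {mode!r}")
```

For example, a multiplier of 1.1 against a reference window total of 800000 gives a budget of about 880000, while `fixed_total` ignores the reference total entirely.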
Validation strictness
All configuration models use Pydantic’s extra="forbid" mode. Any unexpected
key in the YAML file will produce a clear validation error. This prevents
silent misconfiguration from typos or outdated keys.
$ meridian-tools run --config bad.yml
# pydantic.ValidationError: 1 validation error for MeridianToolsConfig
# exports -> export_pridictive_accuracy
#   Extra inputs are not permitted
Path resolution
Relative paths in data.path are resolved against the directory containing the
YAML config file, not the current working directory. This means:
# If config is at /workspace/configs/project.yml
data:
  path: ../inputs/weekly.csv  # Resolves to /workspace/inputs/weekly.csv
The resolved path is written to config.resolved.yaml in the run directory.
The original authored path is preserved in config.source.yaml.
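A minimal sketch of this resolution rule, using pure POSIX paths so the result does not depend on the local filesystem. The function `resolve_data_path` is a hypothetical stand-in, not the package's loader, and it normalises `..` lexically rather than following symlinks.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_data_path(config_path: str, data_path: str) -> PurePosixPath:
    """Resolve data.path against the directory that contains the YAML config."""
    candidate = PurePosixPath(data_path)
    if not candidate.is_absolute():
        # Relative paths are anchored at the config file's parent directory,
        # never at the current working directory.
        candidate = PurePosixPath(config_path).parent / candidate
    return PurePosixPath(posixpath.normpath(str(candidate)))


resolved = resolve_data_path("/workspace/configs/project.yml", "../inputs/weekly.csv")
print(resolved)  # /workspace/inputs/weekly.csv
```

An absolute `data.path` passes through unchanged, which is why the same YAML file behaves identically no matter where the command is launched from.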
Wrapper-owned preflight
Before meridian-tools creates a dated run directory, it performs one narrow
wrapper-owned preflight check on the authored config and the resolved input
CSV. Phase 10 keeps this boundary intentionally small so the wrapper does not
become a second Meridian schema layer.
The wrapper checks exactly:
the resolved data.path exists and is a regular file
the CSV header row can be read
the parsed header is non-empty
no parsed header cell is blank after trimming whitespace
every authored scalar entry in data.coord_to_columns exists in the header
every authored list member in data.coord_to_columns exists in the header
every authored key in data.media_to_channel exists in the header
every authored key in data.media_spend_to_channel exists in the header
every authored key in data.reach_to_channel exists in the header
every authored key in data.frequency_to_channel exists in the header
every authored key in data.rf_spend_to_channel exists in the header
every authored key in data.organic_reach_to_channel exists in the header
every authored key in data.organic_frequency_to_channel exists in the
header
authored list-valued coord families are non-empty
authored mapping fields above are non-empty
coord_to_columns.media and media_to_channel must be authored together
coord_to_columns.media_spend and media_spend_to_channel must be authored
together
coord_to_columns.reach, coord_to_columns.frequency,
reach_to_channel, and frequency_to_channel must be authored together
coord_to_columns.rf_spend and rf_spend_to_channel must be authored
together
coord_to_columns.organic_reach and organic_reach_to_channel must be
authored together
coord_to_columns.organic_frequency and
organic_frequency_to_channel must be authored together
Matching is exact and case-sensitive. The wrapper does not normalise headers,
apply aliases, or use fuzzy matching.
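The header checks can be sketched with the standard library alone. This is an illustration of the exact, case-sensitive matching described above, not the wrapper's implementation; `read_header` and `missing_columns` are hypothetical helper names.

```python
import csv
from typing import IO, List, Mapping, Sequence, Union

ColumnSpec = Union[str, Sequence[str]]


def read_header(handle: IO[str]) -> List[str]:
    """Read the CSV header row and reject empty headers or blank cells."""
    header = next(csv.reader(handle), [])
    if not header:
        raise ValueError("CSV header is empty")
    if any(cell.strip() == "" for cell in header):
        raise ValueError("CSV header contains a blank cell")
    return header


def missing_columns(header: Sequence[str], coord_to_columns: Mapping[str, ColumnSpec]) -> List[str]:
    """Return authored column names that are absent from the header."""
    known = set(header)  # exact match: no case folding, aliases, or fuzzy matching
    missing: List[str] = []
    for value in coord_to_columns.values():
        names = [value] if isinstance(value, str) else list(value)
        missing.extend(name for name in names if name not in known)
    return missing
```

With a header of `week,market,revenue,Spend_TV`, an authored `media_spend: [spend_tv]` is reported as missing because the case differs.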
What remains Meridian-owned:
deep ModelSpec semantics
fit-dependent tensor or shape constraints
statistical validity checks that depend on model construction or sampling
So Phase 10 moves obvious wrapper-detectable mistakes earlier, but it does not
promise to catch everything Meridian may reject later.
This guide explains how to choose and configure validation strategies in
meridian-tools. Validation is the process of evaluating a candidate model
specification on held-out data before committing to a final production fit.
Why validation matters for MMM
Marketing Mix Models are fitted to time series data. Unlike standard supervised
learning, the temporal structure of the data means that naive IID
cross-validation (random train/test splits) is statistically inappropriate.
meridian-tools does not implement random shuffling or naive k-fold splits.
Instead, it provides two time-respecting validation strategies and a clear
separation between validation runs and the final production fit.
Validation strategies
none — No validation
validation:
  strategy: none
The model is fitted on the full dataset with no holdout. Use this when you
do not need candidate evaluation — for example, when rerunning a previously
validated specification.
blocked_tail — Single contiguous tail holdout
validation:
  strategy: blocked_tail
  holdout_size: 8
Reserves the last holdout_size time periods as a test block. The model is
fitted on all preceding periods. This is the recommended default for short
MMM time series where you want one simple candidate evaluation.
When to use: Most standard MMM projects with fewer than 150 weekly
observations.
The holdout mask is generated automatically and injected into Meridian’s
holdout_id parameter. For geo-panel models, the mask is broadcast across
all geos.
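A sketch of how such a mask could be built, using nested lists instead of tensors. The `(n_geos, n_times)` layout for geo-panel models is an assumption of this example, and `blocked_tail_mask` is a hypothetical helper rather than the package's generator.

```python
from typing import List, Optional, Union

Mask = Union[List[bool], List[List[bool]]]


def blocked_tail_mask(n_times: int, holdout_size: int, n_geos: Optional[int] = None) -> Mask:
    """Mark the last holdout_size time periods as held out (True)."""
    if not 0 < holdout_size < n_times:
        raise ValueError("holdout_size must be positive and smaller than the series length")
    row = [t >= n_times - holdout_size for t in range(n_times)]
    if n_geos is None:
        return row  # national model: one mask over the time axis
    # Geo-panel model: repeat the same time mask for every geo.
    return [list(row) for _ in range(n_geos)]
```

Every geo shares the same held-out weeks, so the test block stays a single contiguous tail of the time axis.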
rolling_origin creates multiple expanding-window splits, where each successive split adds
more training data. This provides a more robust evaluation signal than a
single blocked tail, but requires enough history to support multiple splits.
When to use: Projects with longer time series (typically 100+ weekly
observations) where you want multiple evaluation windows.
How it works:
Time axis: [t1, t2, ..., t52, t53, ..., t56, t57, ..., t60]
Split 1: Train [t1..t52], Test [t53..t56]
Split 2: Train [t1..t56], Test [t57..t60]
Constraints:
step_size must equal test_size (non-overlapping test windows).
max_splits must be at least 2.
initial_train_size + test_size must not exceed the number of observations.
The plan must yield at least two splits.
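The split geometry and constraints above can be sketched as an index generator. This is not `build_validation_plan`; the function name is hypothetical, and keeping the earliest splits when `max_splits` truncates the plan is an assumption of the sketch.

```python
from typing import List, Optional, Tuple

Split = Tuple[range, range]


def rolling_origin_splits(
    n_obs: int,
    initial_train_size: int,
    test_size: int,
    max_splits: Optional[int] = None,
) -> List[Split]:
    """Return (train, test) index ranges for expanding-window splits."""
    if max_splits is not None and max_splits < 2:
        raise ValueError("max_splits must be at least 2")
    splits: List[Split] = []
    train_end = initial_train_size
    while train_end + test_size <= n_obs:
        if max_splits is not None and len(splits) >= max_splits:
            break
        splits.append((range(0, train_end), range(train_end, train_end + test_size)))
        train_end += test_size  # step_size equals test_size, so test windows never overlap
    if len(splits) < 2:
        raise ValueError("the plan must yield at least two splits")
    return splits
```

With 60 observations, `initial_train_size=52`, and `test_size=4`, this yields the two splits drawn above: zero-based train indices 0..51 with test 52..55, then train 0..55 with test 56..59.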
authored_holdout — User-provided holdout mask
This is not a YAML strategy setting. Instead, you provide holdout_id
directly in model_spec.kwargs:
When the runner detects an authored holdout_id in the YAML, it treats the
run as an authored_holdout validation run. The mask is passed through to
Meridian verbatim and recorded in the validation spec artefact.
When to use: When you need a specific holdout pattern that does not
follow blocked-tail or rolling-origin conventions.
CLI vs Python API
Blocked tail from the CLI
blocked_tail runs directly from the CLI because it produces one run:
meridian-tools run --config project.yml --output-dir runs
Rolling origin requires the Python API
rolling_origin is a Python-first planning surface because it produces
multiple runs — one per split plus a final fit. The CLI will reject direct
rolling_origin execution:
# This will fail:
meridian-tools run --config project.yml  # with strategy: rolling_origin
# ValueError: cannot execute `rolling_origin` directly
Instead, use the Python API:
from pathlib import Path

import pandas as pd

from meridian_tools.config import PipelineRunConfig, load_yaml_config
from meridian_tools.cv import build_validation_plan
from meridian_tools.runner import run_pipeline

config_path = Path("project.yml")
config = load_yaml_config(config_path)

# Read the time index from your data
data_path = config.data.path
if not data_path.is_absolute():
    data_path = (config_path.parent / data_path).resolve()
frame = pd.read_csv(data_path)
time_column = config.data.coord_to_columns["time"]
geo_column = config.data.coord_to_columns.get("geo")
time_index = frame[time_column].drop_duplicates().tolist()
geo_index = None
if geo_column is not None:
    geo_index = frame[geo_column].drop_duplicates().tolist()

# Build the validation plan
validation_plan = build_validation_plan(
    config.validation,
    time_index=time_index,
    geo_index=geo_index,
)

# Execute each validation split
for run_spec in validation_plan.validation_runs:
    run_pipeline(
        PipelineRunConfig(
            config_path=config_path,
            output_dir=Path("runs"),
            validation_spec=run_spec,
        )
    )
Separating validation from the final fit
Validation runs and the final production fit are different jobs. First you
evaluate candidate specifications on held-out splits. Then, once you have
chosen the specification, you run a separate full-sample fit with no holdout.
Do not reuse a validation fit as the production artefact. The validation fit
was trained on a subset of the data and its posterior reflects that subset.
Final fit after blocked tail
For blocked_tail, build_validation_plan provides a final_fit_run spec:
validation_plan = build_validation_plan(
    config.validation,
    time_index=time_index,
    geo_index=geo_index,
)

# Run the final fit on all data
final_result = run_pipeline(
    PipelineRunConfig(
        config_path=config_path,
        output_dir=Path("runs"),
        validation_spec=validation_plan.final_fit_run,
    )
)
Final fit after rolling origin
The same pattern works for rolling origin:
# After running all validation splits...
final_result = run_pipeline(
    PipelineRunConfig(
        config_path=config_path,
        output_dir=Path("runs"),
        validation_spec=validation_plan.final_fit_run,
    )
)
The final_fit_run spec has mode="final_fit", strategy="none", and
holdout_id=None. It trains on the full time axis with no holdout.
Run directory naming
The runner automatically appends a validation-aware suffix to the run name:
| Scenario | Run name pattern |
| --- | --- |
| No validation | <project_name>_<timestamp> |
| Blocked tail | <project_name>_blocked_tail_<timestamp> |
| Rolling origin split 1 | <project_name>_split_01_<timestamp> |
| Final fit | <project_name>_final_fit_<timestamp> |
| Authored holdout | <project_name>_authored_holdout_<timestamp> |
Override the name with --run-name or PipelineRunConfig(run_name=...).
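The naming patterns in the table can be sketched as a small formatter. The timestamp format `%Y%m%d_%H%M%S` is inferred from example names such as demo-timeseries_20260402_073500, and `build_run_name` is a hypothetical helper, not the runner's own function.

```python
from datetime import datetime
from typing import Optional


def build_run_name(project_name: str, suffix: Optional[str], started_at: datetime) -> str:
    """Join the project name, an optional validation suffix, and a timestamp."""
    parts = [project_name]
    if suffix:
        parts.append(suffix)  # e.g. "blocked_tail", "split_01", "final_fit"
    parts.append(started_at.strftime("%Y%m%d_%H%M%S"))
    return "_".join(parts)


print(build_run_name("demo-timeseries", None, datetime(2026, 4, 2, 7, 35, 0)))
# demo-timeseries_20260402_073500
```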
Validation spec artefact
Every validation-aware run writes a validation_spec.json artefact in the
10_validation/ stage directory. This JSON records:
mode — "validation" or "final_fit"
strategy — the validation strategy used
split_label — human-readable split identifier
holdout_source — "generated_validation", "authored_model_spec", or "none"
generated_holdout — whether the holdout mask was auto-generated
holdout_shape — shape of the holdout mask (without the actual data)
train_indices / test_indices — integer indices into the time axis
train_dates / test_dates — corresponding date values
The actual holdout mask is not stored in the JSON artefact (it can be large).
It is injected into the model at runtime.
Interaction with model selection
Bayesian model selection (LOO/WAIC) is only available for runs where
holdout_id is None — meaning full-sample fitted models and final-fit runs.
Validation fits and authored-holdout runs write a
model_selection_status.json artefact instead of LOO/WAIC outputs. See the
model selection guide for details.
Model selection guide
This guide explains how meridian-tools supports Bayesian model selection
using Leave-One-Out (LOO) cross-validation and the Watanabe-Akaike Information
Criterion (WAIC). It covers when model selection is available, how to interpret
the outputs, and how to compare multiple candidate models.
What model selection provides
Bayesian model selection uses information criteria computed from pointwise
log-likelihood values to compare model specifications. Unlike predictive
accuracy on a held-out set, LOO and WAIC evaluate the model’s expected
predictive performance using the full posterior without requiring a separate
validation split.
meridian-tools wraps ArviZ’s az.loo and az.waic with:
Automatic log-likelihood reconstruction for fitted Meridian models
Structured error handling when model selection is not possible
A compare_models surface for ranking multiple candidates
Artefact-level compatibility status in every run directory
Compatibility boundary
Model selection is only available for models where holdout_id is None.
This means:
| Run type | Model selection available |
| --- | --- |
| Full-sample fit (no validation) | Yes |
| Final-fit run (mode: final_fit) | Yes |
| Blocked-tail validation run | No |
| Rolling-origin validation split | No |
| Authored-holdout run | No |
| Bare InferenceData without log_likelihood | No |
This restriction exists because LOO and WAIC require the full observed
likelihood surface. A holdout fit has a modified likelihood that does not
represent the full data generating process. Comparing a holdout fit’s ELPD
against a full fit’s ELPD would be statistically meaningless.
How it works in the pipeline
When exports.export_model_selection: true in the YAML config, the runner’s
30_model_assessment stage attempts model selection after writing diagnostics.
Compatible runs
For compatible models, the stage writes:
loo_summary.json — LOO summary statistics (ELPD, p_loo, SE, etc.)
waic_summary.json — WAIC summary statistics
loo_pointwise.csv — Per-observation LOO values and Pareto k diagnostics
waic_pointwise.csv — Per-observation WAIC values
model_comparison.csv — Ranked comparison table (single-model for individual runs)
Incompatible runs
For incompatible models, the stage writes a single status artefact:
model_selection_status.json
{
  "status": "unavailable",
  "reason_code": "holdout_fit_unsupported",
  "reason": "Model selection requires holdout_id is None ..."
}
Known reason codes:
| Code | Meaning |
| --- | --- |
| holdout_fit_unsupported | The model was fitted with a holdout mask |
| requires_fitted_meridian_model | Missing posterior samples or ArviZ InferenceData |
| missing_log_likelihood_group | Bare InferenceData without reconstructable likelihood |
| meridian_internal_seam_incompatible | Meridian version lacks required internal reconstruction methods |
Incompatibility is non-fatal. The pipeline completes successfully and
records the reason in the artefact.
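Downstream tooling can branch on the reason code instead of parsing the free-text reason. The sketch below reads the status payload with the standard library; `describe_status` is a hypothetical helper, and the meanings are copied from the list of known reason codes.

```python
import json

KNOWN_REASON_CODES = {
    "holdout_fit_unsupported": "The model was fitted with a holdout mask",
    "requires_fitted_meridian_model": "Missing posterior samples or ArviZ InferenceData",
    "missing_log_likelihood_group": "Bare InferenceData without reconstructable likelihood",
    "meridian_internal_seam_incompatible": (
        "Meridian version lacks required internal reconstruction methods"
    ),
}


def describe_status(raw_json: str) -> str:
    """Summarise a model_selection_status.json payload in one line."""
    payload = json.loads(raw_json)
    code = payload.get("reason_code")
    meaning = KNOWN_REASON_CODES.get(code, "unrecognised reason code")
    return f"{payload.get('status')}: {code} ({meaning})"
```

Because incompatibility is non-fatal, a batch job can log this line and continue with the remaining runs.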
Using the Python API directly
Compute LOO for a single model
from meridian_tools.model_selection import compute_loo

result = compute_loo(fitted_model, pointwise=True)
print(result.kind)  # "loo"
print(result.summary)  # {"kind": "loo", "elpd_loo": -123.4, ...}
print(result.pointwise)  # DataFrame with loo_i, pareto_k per observation
Meridian does not store pointwise log-likelihood in its InferenceData by
default. meridian-tools reconstructs it automatically when you pass a
fitted Meridian model to compute_loo, compute_waic, or compare_models.
The reconstruction performs three steps:
Rebuilds the joint distribution from the posterior samples
Computes observation-level log-likelihood
Returns a new InferenceData with the log_likelihood group attached
The original model is never mutated. The reconstruction produces a
temporary copy used only for the ArviZ computation.
You can also control this explicitly:
from meridian_tools.log_likelihood import attach_log_likelihood

# Returns new InferenceData with log_likelihood group (original unchanged)
idata_with_ll = attach_log_likelihood(fitted_model, in_place=False)

# Mutates the model's inference_data in place
attach_log_likelihood(fitted_model, in_place=True)
Interpreting the outputs
LOO summary
| Field | Meaning |
| --- | --- |
| elpd_loo | Expected log pointwise predictive density (higher is better) |
| p_loo | Effective number of parameters |
| se | Standard error of elpd_loo |
| warning | Whether Pareto k diagnostics indicate unreliable estimates |
WAIC summary
| Field | Meaning |
| --- | --- |
| elpd_waic | Expected log pointwise predictive density (WAIC estimate) |
The pointwise LOO output includes a pareto_k column. Values above 0.7
indicate that the LOO approximation is unreliable for those observations.
ArviZ will emit a warning if any Pareto k values exceed the threshold.
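A quick way to act on this threshold is to list the flagged observations from loo_pointwise.csv. The sketch below assumes the file has a `pareto_k` column as described; the sample rows in the usage are made up for illustration, and `unreliable_observations` is a hypothetical helper.

```python
import csv
from typing import IO, List

PARETO_K_THRESHOLD = 0.7


def unreliable_observations(handle: IO[str], threshold: float = PARETO_K_THRESHOLD) -> List[int]:
    """Return zero-based row indices whose pareto_k exceeds the threshold."""
    reader = csv.DictReader(handle)
    return [
        index
        for index, row in enumerate(reader)
        if float(row["pareto_k"]) > threshold
    ]
```

Usage against a file on disk would look like `unreliable_observations(open("loo_pointwise.csv", newline=""))`. A value of exactly 0.7 is not flagged, because the comparison is strict.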
Model comparison
When comparing two or more models:
elpd_diff — Difference in ELPD from the best model (0 for the best)
dse — Standard error of the ELPD difference
weight — Stacking weight (how much to trust each model)
Models are ranked by ELPD (rank 0 is best)
A single-model comparison returns a one-row table with rank=0,
elpd_diff=0, and weight=1.0.
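The rank and elpd_diff columns follow directly from the per-model ELPD values, as this sketch shows. It deliberately omits dse and stacking weights, which need pointwise values. The sign convention (best minus candidate, so differences are non-negative) is an assumption, and `rank_by_elpd` is a hypothetical helper rather than the package's compare_models.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int, float]]


def rank_by_elpd(elpd_by_model: Dict[str, float]) -> List[Row]:
    """Rank models by ELPD (higher is better) and compute the gap to the best."""
    if not elpd_by_model:
        raise ValueError("at least one model is required")
    ordered = sorted(elpd_by_model.items(), key=lambda item: item[1], reverse=True)
    best_elpd = ordered[0][1]
    return [
        {"model": name, "rank": rank, "elpd": elpd, "elpd_diff": best_elpd - elpd}
        for rank, (name, elpd) in enumerate(ordered)
    ]
```

A gap is only meaningful relative to its standard error, so read elpd_diff together with dse in the real comparison table.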
Error handling
All model-selection errors are raised as ModelSelectionError with a
structured reason_code:
from meridian_tools.model_selection import ModelSelectionError, compute_loo

try:
    result = compute_loo(candidate)
except ModelSelectionError as exc:
    print(exc.reason_code)  # e.g. "holdout_fit_unsupported"
    print(str(exc))  # Human-readable explanation
In the pipeline, these errors are caught and written to
model_selection_status.json rather than failing the run.
Lifecycle management guide
meridian-tools treats completed runs as immutable artefacts. The lifecycle
module provides tools to load, compare, and refresh past runs without mutating
them. This guide explains each lifecycle operation and when to use it.
Core concepts
Run records
A RunRecord encapsulates a run’s metadata and artefact paths. It is loaded
from a run directory by reading run_manifest.json and resolving all artefact
paths against the directory.
from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

print(record.run_dir)  # Path to the run directory
print(record.manifest)  # RunManifest with stages, timestamps, versions
print(record.config_source_path)  # Path to config.source.yaml
print(record.config_resolved_path)  # Path to config.resolved.yaml
print(record.input_data_provenance_path)  # Path to input_data_provenance.json (or None for older runs)
print(record.diagnostics_bundle_path)  # Path to diagnostics_bundle.json (or None)
print(record.validation_spec_path)  # Path to validation_spec.json (or None)
print(record.model_selection_status_path)  # Path to model_selection_status.json (or None)
All paths in the record are absolute. Required artefacts (config_source,
config_resolved) are validated at load time and always present.
input_data_provenance is also required for manifest version 3 runs.
Optional artefacts (diagnostics_bundle, validation_spec,
model_selection_status) are None if not present in the manifest.
Immutability
Lifecycle operations never modify a source run directory. When you refresh a
run, the output goes to a new sibling directory. When you compare runs, both
source directories remain untouched.
All lifecycle functions raise LifecycleError (a RuntimeError subclass)
when they encounter invalid state.
list_run_records discovers all direct child directories that contain a
run_manifest.json and returns them sorted by started_at timestamp
(most recent first), with run directory name as a secondary sort key.
The function requires a directory path (not a file). It will raise an error
if any discovered run directory contains an invalid manifest — it does not
silently skip broken runs.
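The discovery and ordering behaviour can be approximated with the standard library. This sketch is not `list_run_records`: it assumes `started_at` is stored as an ISO 8601 string that sorts lexically, and it applies the same descending order to the directory-name tiebreak, which the text does not specify. `discover_runs` is a hypothetical name.

```python
import json
from pathlib import Path
from typing import List, Tuple


def discover_runs(root: Path) -> List[Path]:
    """Return direct child run directories, most recent started_at first."""
    if not root.is_dir():
        raise NotADirectoryError(str(root))
    found: List[Tuple[str, str, Path]] = []
    for child in root.iterdir():
        manifest_path = child / "run_manifest.json"
        if child.is_dir() and manifest_path.is_file():
            # Invalid JSON or a missing started_at raises here, so broken
            # runs are surfaced instead of being silently skipped.
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
            found.append((manifest["started_at"], child.name, child))
    found.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [path for _, _, path in found]
```

Directories without a run_manifest.json are ignored, so unrelated folders can sit alongside run directories.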
Refreshing a run
Refreshing re-executes a run using its stored configuration but writes the
output to a new directory. The source run is never modified.
When to refresh
After a Meridian upgrade — to check whether the new version produces
comparable results with the same specification.
After a code change — to verify that refactoring did not change model
outputs.
After extending the dataset — to refit the model with additional
observations using the same validated specification.
If the source run was a validation run (blocked tail or rolling origin),
build_refresh_run_config reconstructs the validation spec from the stored
artefact, including the holdout mask geometry. For authored-holdout runs, it
reuses the YAML-owned holdout from the copied config.
For final-fit runs, the refresh produces another final-fit run with the same
full-sample training specification.
compare_run_records accepts run directory paths (not RunRecord objects)
and returns a pandas DataFrame with columns field, left, right,
status, and changed. The compared fields include:
run_name and status — basic identity.
meridian_tools_version and meridian_version — version drift.
has_validation_spec and has_diagnostics_bundle — artefact presence.
predictive_accuracy_status and review_summary_status — diagnostics.
has_model_selection_outputs and model_selection_reason_code — model selection.
input_authored_path, input_resolved_path, input_sha256,
input_size_bytes, input_mtime_utc, input_row_count,
input_column_count, and input_ordered_columns — dataset identity
and shape.
This is useful for auditing whether a refresh or a specification change
produced materially different results.
If either run predates manifest version 3, provenance rows are reported with
status == "legacy_unknown" and changed == None. That distinguishes
“no stored provenance exists” from “the dataset definitely changed”.
Lifecycle workflow example
A typical lifecycle workflow for a quarterly model refresh:
from pathlib import Path

from meridian_tools.lifecycle import (
    build_refresh_run_config,
    compare_run_records,
    list_run_records,
    load_run_record,
)
from meridian_tools.runner import run_pipeline

# 1. Find the most recent production run
records = list_run_records("runs/")
production_run = records[0]  # Most recent by started_at

# 2. Refresh with the updated dataset
refresh_config = build_refresh_run_config(
    production_run.run_dir,
    output_dir=Path("runs/quarterly-refresh"),
)
refresh_result = run_pipeline(refresh_config)

# 3. Compare the results
comparison = compare_run_records(production_run.run_dir, refresh_result.run_dir)
print(comparison)
Manifest versioning
The lifecycle layer supports manifest versions 0, 1, 2, and 3. Older
manifests are handled gracefully with default values for fields that were
added in later versions. The current version is 3.
This means you can load run directories created by earlier versions of
meridian-tools without issues. The loaded RunRecord keeps the same shape,
but input_data_provenance_path is None for pre-v3 runs because those
manifests predate provenance capture.
Meridian Tools workflow guide
This guide shows the supported end-to-end agency workflow for
meridian-tools. It starts with one YAML config, moves through candidate
validation, separates the final full-sample fit from the validation runs, and
ends with the artefacts you should hand over or inspect later. The examples in
this guide stay inside the implemented package surface. They do not assume
notebooks, dashboards, or unpublished helper scripts.
Before you start
Install Meridian first, then install meridian-tools in the same environment:
Use the CLI for ordinary run execution. Use the Python API when you need
rolling-origin planning, an explicit final-fit run, or lifecycle compare and
refresh operations. Phase 07 does not provide a lifecycle CLI.
If you want packaged reference examples before authoring your own YAML, use the
bundled demo guide in demos.md. The packaged demo launcher is
meridian-tools demo .... The repo-root python runme.py ... wrapper remains
available when you are working from a source checkout.
Author one YAML config
Keep the authored project definition in YAML. Keep runtime-only choices out of
the YAML file. In practice, that means your source file owns the project
metadata, data path, model specification, fit settings, validation settings,
and export switches. Runtime-only values such as output_dir, run_name, and
one concrete validation_spec belong in PipelineRunConfig or the CLI call,
not in config.resolved.yaml.
Use blocked_tail when you want one contiguous future block for candidate
evaluation. This is often the right default for short MMM time series. Use
rolling_origin when you have enough history to evaluate more than one
expanding-window split. Do not treat rolling_origin as ordinary k-fold
cross-validation. The package does not implement naive IID folds or random
shuffling because that is not the right statistical workflow for MMM time
series.
Validation runs and the final production fit are different jobs. First, you
evaluate candidate specifications on blocked time splits. Then, once you have
chosen the specification, you run a separate full-sample fit with no holdout.
Run one blocked-tail candidate from the CLI
Once the YAML file is authored, you can execute a blocked-tail candidate run
directly through the CLI:
meridian-tools run --config project.yml --output-dir runs
The same packaged runner surface is available through the thin repo-root
wrapper:
python runme.py run --config project.yml --output-dir runs
This command creates a dated run directory under runs/. If you need to
change the output location or the visible run name, pass --output-dir or
--run-name at execution time. Those are runtime-only overrides. They affect
the run directory and manifest, but they do not become part of the authored
YAML contract.
Plan and run rolling-origin validation through the Python API
rolling_origin is a Python-first planning surface because you need one
concrete split at a time. Start with an explicit YAML definition:
For rolling_origin and blocked_tail workflows, validation_plan.final_fit_run
is the explicit no-holdout runtime spec. It keeps the boundary clear. Candidate
validation and final production fitting are separate steps.
Know which artefacts matter for handoff
Each successful run directory is the handoff unit. The important files are:
run_manifest.json for stage status, versions, timestamps, and top-level
artefact links
00_run_metadata/config.source.yaml for the authored source config
00_run_metadata/config.resolved.yaml for the YAML-owned config after path
resolution
00_run_metadata/input_data_provenance.json for the exact dataset identity
used by the run
10_validation/validation_spec.json when the run is validation-aware
30_model_assessment/diagnostics_bundle.json for stable diagnostics
metadata
30_model_assessment/model_results_summary.html for the wrapped Meridian
assessment summary
30_model_assessment/plots/ for assessment PNG plots such as model fit and
rhat review
40_decomposition/summary_metrics.csv and summary_metrics.nc for
decomposition exports
40_decomposition/plots/ for decomposition PNG plots
60_response_curves/plots/response_curves_plot.png when response-curve
export is enabled
70_optimisation/plots/ when optimisation export is enabled
30_model_assessment model-selection outputs when the run is compatible, or
30_model_assessment/model_selection_status.json when it is not
Read those artefacts together.
30_model_assessment/diagnostics_bundle.json tells you whether predictive
accuracy and review summary were exported or disabled. The assessment stage
either contains the real Bayesian model-selection outputs or one explicit
compatibility status artefact.
The supported Bayesian model-selection boundary is narrow and deliberate. The
package supports fitted Meridian models where holdout_id is None. That means
full-sample fitted models and explicit final-fit runs are compatible. Validation
fits and authored holdout fits are not.
Use lifecycle helpers after a run exists
Once you have stored run directories, the lifecycle API lets you reload,
compare, and refresh them without going back to notebook state.
compare_run_records(...) gives you a metadata-level comparison. It does not
attempt a raw-file diff across every output. refresh_run(...) rebuilds a new
sibling run from the stored run-local artefacts. It does not overwrite the
source run. Phase 07 does not provide lifecycle CLI commands, so use the
Python API for these operations.
For the bundled reference examples and the exact stage-level file set, see
demos.md.
A practical analyst sequence
If you want one concrete operating pattern, use this one. Author a YAML file.
Run a blocked-tail candidate through the CLI when you need one held-out tail
block. Use rolling_origin through build_validation_plan(...) when you need
multiple expanding-window validation splits. Choose the modelling
specification. Run the final full-sample fit as its own job. Review the run
directory artefacts. Then use compare_run_records(...) and refresh_run(...)
when you need to inspect or rerun stored work later.
Meridian Tools demo guide
This is the canonical guide to the bundled meridian-tools demos. Use it when
you want one safe, reproducible, end-to-end example without client data.
The public story is simple:
Meridian is the modelling engine.
meridian-tools is the workflow wrapper.
The bundled demos are launched through meridian-tools surfaces, not by
calling Meridian directly.
What the bundled demos are for
Phase 08 adds two bundled reference workflows:
timeseries
a national timeseries demo shipped as packaged demo data
geo_panel
a geo-panel demo shipped as packaged demo data
Both datasets are bundled non-client reference data. They exist so analysts and
stakeholders can inspect the workflow, run structure, and review artefacts
without using client material.
What the package adds on top of Meridian
Meridian remains responsible for the modelling and analysis primitives.
meridian-tools adds the operational surface that agencies usually need around
it:
typed YAML configuration
blocked-tail and rolling-origin validation workflow
a thin demo launcher for bundled reference workflows
This is why the demos are useful. They show the wrapper workflow directly,
rather than asking users to reconstruct it from notebooks or internal scripts.
Demo entrypoints
List the supported demos:
meridian-tools demo --list
Run the bundled timeseries demo:
meridian-tools demo timeseries
Run the bundled geo-panel demo:
meridian-tools demo geo_panel
By default, demo runs are written under runs/demos/. If you want a different
root, pass --output-dir. If you want a custom visible run name, pass
--run-name.
The same package can also run an explicit authored config:
meridian-tools run --config /path/to/project.yml --output-dir runs
The repo-root wrapper can run an explicit authored config too:
python runme.py run --config /path/to/project.yml --output-dir runs
Bundled YAML surface
The bundled demo YAML files are real meridian-tools configs. They are not
legacy Abacus-style placeholders.
The authored sections used in Phase 08 are:
project
data
model_spec
fit
validation
exports
response_curves
optimisation
The Phase 08 additions are:
response_curves
required if you want the response-curve export stage to run
optimisation
required if you want the optimisation export stage to run
The bundled demos include both sections so that the full staged schema is
exercised.
The default demo configs use validation.strategy: none. That keeps the
reference runs model-selection compatible, so LOO and WAIC outputs are
written by default.
Output schema
Each successful demo run writes one manifest-backed staged directory layout:
run_manifest.json
run identity, versions, timestamps, stage status, and top-level artefact
links
00_run_metadata/config.source.yaml
the authored YAML
00_run_metadata/config.resolved.yaml
the same YAML after runtime path resolution
10_validation/validation_spec.json
validation provenance for validation-aware runs only
not present in the default bundled demos because they run as full-sample fits
30_model_assessment/diagnostics_bundle.json
the stable machine-readable record of diagnostics export state
30_model_assessment/model_results_summary.html
the wrapped Meridian assessment summary
40_decomposition/summary_metrics.csv
the easiest tabular decomposition output to inspect first
For model selection, keep the boundary honest:
LOO and WAIC are only available for compatible fitted Meridian models
validation fits and other incompatible cases will record
model_selection_status.json instead
the package does not pretend unsupported runs have valid Bayesian comparison
outputs
the bundled demos are configured as full-sample fits, so they should write
loo_summary.json and waic_summary.json by default
For response curves and optimisation:
these outputs are useful for scenario and allocation review
they are not a substitute for checking diagnostics, validation provenance, or
model-selection compatibility first
For visual review, each stage now keeps its PNG exports inside a local plots/
subdirectory rather than mixing image files into the stage root. That keeps the
machine-readable exports and the human-review plots separate, each in a
predictable place.
inspect 60_response_curves/ and 70_optimisation/ if those stages ran
If you are working from a source checkout, python runme.py demo --list and
python runme.py demo ... remain equivalent convenience wrappers.
That sequence shows the wrapper value quickly: one YAML config in, one
structured run directory out, with the Meridian and meridian-tools artefacts
kept in one predictable place.
Troubleshooting
Common issues and solutions when working with meridian-tools.
Installation issues
meridian-tools --help fails with ImportError
Cause: The package is not installed in the active environment, or Meridian
is missing.
Fix:
pip install -e ".[dev]"
If Meridian is not installed:
pip install "google-meridian[schema]==1.5.3"
RuntimeError: Saving meridian_model.binpb requires Meridian schema support
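Cause: Meridian was installed without the [schema] extra.
Fix: Reinstall Meridian with the extra, using the pinned version given above:
pip install "google-meridian[schema]==1.5.3"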
Cause: A required wrapper dependency check failed before config/data
preflight or run-directory creation.
Common triggers:
google-meridian[schema] support is unavailable
exports.export_plots: true is set but vl-convert-python PNG support is
unavailable
Fix: Install or repair the missing runtime dependency first, then rerun.
ConfigPreflightError
Cause: meridian-tools found a wrapper-owned config or input-data issue
before run-directory creation.
Common triggers:
data.path resolves to a missing file or a directory
the CSV header row cannot be read
the header is empty or contains blank cells
an authored column name does not appear in the header exactly
a supported media/RF family is only half-authored
Fix: Correct the authored YAML or the input CSV first, then rerun. Header
matching is exact and case-sensitive in Phase 10.
ValidationExecutionContractError
Cause: The requested single-run execution path is incompatible with the
authored validation setup.
Common triggers:
you tried to run a rolling_origin config directly from the CLI or
run_pipeline(...)
you passed PipelineRunConfig.validation_spec while the YAML already
authors model_spec.kwargs.holdout_id
Fix: For rolling_origin, build a validation plan and execute one concrete
split at a time through the Python API. For authored holdouts, either keep the
YAML-authored holdout_id path or remove it before supplying a runtime
validation_spec. See the validation guide for the full
workflow.
ModelSelectionError with reason_code: holdout_fit_unsupported
Cause: LOO/WAIC was requested for a model fitted with a holdout mask.
Not a bug. Model selection is only available for full-sample fits. The
pipeline records the incompatibility in model_selection_status.json and
continues. See the model selection guide.
ModelSelectionError with reason_code: meridian_internal_seam_incompatible
Cause: The installed Meridian version does not expose the internal
reconstruction methods needed for log-likelihood computation.
Fix: Check the Meridian version. This package requires google-meridian[schema]==1.5.3.
If you recently upgraded Meridian, the private reconstruction seams may have
changed. Check the Meridian integration notes.
Run fails mid-pipeline
If a run fails after the dated run directory already exists, meridian-tools
raises PipelineRunFailure. The CLI and runme.py print the concrete failed
run directory, manifest path, and stage name when available.
The original exception is preserved as __cause__, so --traceback still
shows the underlying failure.
The manifest is written to disk after each stage. If a run fails, the
run_manifest.json is left on disk and marked failed. You can inspect it to
determine which stage failed:
Look at the stages array. A failed stage is recorded with status: "failed" and an error message.
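A minimal sketch of that inspection, using only the manifest fields documented in the manifest schema reference (stages, name, status, message). The helper name failed_stages is hypothetical and not part of the package API.

```python
import json
from pathlib import Path


def failed_stages(run_dir):
    """Return (name, message) for every stage recorded as failed."""
    manifest_path = Path(run_dir) / "run_manifest.json"
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    return [
        (stage["name"], stage.get("message"))
        for stage in manifest.get("stages", [])
        if stage.get("status") == "failed"
    ]
```

Call it with the failed run directory printed by the CLI, for example failed_stages("runs/my-run_20260402_073500").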
Validation errors
time_index must be strictly increasing with no duplicate values
Cause: The time column in your data contains duplicates or is not sorted.
Fix: Ensure your CSV data has unique, monotonically increasing time values.
For geo-panel data, the time column should be unique per time period (not per
geo × time combination — the function expects the deduplicated time axis).
rolling_origin must yield at least two splits
Cause: The combination of initial_train_size, test_size, and data length
does not produce enough splits.
Fix: Either reduce initial_train_size, reduce test_size, or use
blocked_tail instead for shorter series.
holdout_size must be smaller than the time axis
Cause: The holdout size is greater than or equal to the number of time
periods.
Fix: Reduce holdout_size to leave at least one training period.
Lifecycle errors
LifecycleError when loading a run record
Cause: The run manifest is missing required entries, references a file that
does not exist, or has a malformed JSON structure.
Fix: Check that the run directory was not manually modified. Required
artefacts are config.source.yaml, config.resolved.yaml, and, for manifest
version 3 runs, input_data_provenance.json. diagnostics_bundle.json is
optional for loading, but every newly completed run must produce it.
Path traversal rejection
Cause: An artefact path in the manifest resolves outside the run directory.
Not fixable by editing the manifest. This is a security check. The manifest
was likely corrupted or manually edited with an invalid path.
Performance issues
Pipeline takes very long
MCMC sampling (the 20_model_fit stage) dominates wall-clock time. The
meridian-tools orchestration layer adds negligible overhead.
For production runs, use the defaults or increase these values for better
posterior quality.
Out-of-memory during model selection
Log-likelihood reconstruction loads the full posterior into memory and creates
a temporary copy of the InferenceData. For large models, this can double
memory usage temporarily.
Mitigation: Reduce n_keep or n_chains if memory is constrained.
Warnings
ArviZ Pareto k warnings
Estimated shape parameter of Pareto distribution is greater than 0.7 ...
This means the LOO approximation is unreliable for some observations. Check
the pointwise pareto_k values in loo_pointwise.csv. Values above 0.7
indicate influential observations.
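One way to list the flagged observations is sketched below. It assumes loo_pointwise.csv has a header row with a pareto_k column; any other column names in the file are not assumed. The helper name is hypothetical.

```python
import csv


def high_pareto_k_rows(path, threshold=0.7, column="pareto_k"):
    """Return the CSV rows whose Pareto k value exceeds the threshold."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [
            row
            for row in csv.DictReader(handle)
            if float(row[column]) > threshold
        ]
```

A call such as high_pareto_k_rows("30_model_assessment/loo_pointwise.csv") returns each flagged row as a dictionary, so whatever identifying columns the file carries come along with it.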
Meridian national model auto-zeroing warnings
Hierarchical distribution parameters must be deterministically zero for
national models. eta_orf has been automatically set to Deterministic(0).
This is expected for national (non-geo) models. Meridian automatically zeros
out geo-level hierarchical parameters. The warning is informational.
TensorFlow deprecation warnings
These come from TensorFlow and Meridian internals. meridian-tools groups and
deduplicates them in the terminal output to reduce noise. They do not indicate
a problem with your run.
Reference
Lookup documentation for the CLI, YAML schema, manifest schema, output layout, and related contracts.
Pages
CLI reference — meridian-tools provides a command-line interface with two subcommands: run and demo.
Manifest schema reference — The run_manifest.json file is the source of truth for every meridian-tools run. It lives at the root of the run directory and records identity, timing, versions, overall status, top-level artefact index, and per-stage records.
Output schema reference — This page documents the complete run directory layout produced by meridian-tools. Every successful pipeline run creates a timestamped directory containing the artefacts described below.
Validation spec schema reference — The validation_spec.json artefact is written to 10_validation/ for every validation-aware pipeline run. It records the concrete validation provenance for that specific run, including the holdout strategy, split geometry, and date windows.
Subsections of Reference
CLI reference
meridian-tools provides a command-line interface with two subcommands: run
and demo.
Global usage
meridian-tools <subcommand> [options]
meridian-tools run
Execute a meridian-tools pipeline run from an authored YAML config.
meridian-tools run --config <path> [--output-dir <dir>] [--run-name <name>] [--traceback]
Arguments
| Argument | Required | Default | Description |
| --- | --- | --- | --- |
| --config | Yes | — | Path to the meridian-tools YAML configuration file. |
| --output-dir | No | runs | Directory where dated run folders will be created. |
| --run-name | No | project.name from YAML | Optional run name override. |
| --traceback | No | false | Show the full Python traceback on failure. |
Examples
# Basic run
meridian-tools run --config project.yml
# Custom output directory
meridian-tools run --config project.yml --output-dir output/model_runs
# Named run with traceback on failure
meridian-tools run --config project.yml --run-name client-q1-review --traceback
Exit codes
| Code | Meaning |
| --- | --- |
| 0 | Pipeline completed successfully. |
| 1 | Pipeline failed. Error details are printed to stderr. Use --traceback for the full stack trace. |
Failure reporting
The CLI distinguishes five broad failure classes:
config loading or Pydantic validation failures before wrapper preflight
dependency preflight failures before run-directory creation
validation-execution contract failures before run-directory creation
wrapper-owned ConfigPreflightError failures before run-directory creation
PipelineRunFailure after the dated run directory already exists
Dependency preflight covers google-meridian[schema] support and optional
plot-export support. Validation-execution contract failures cover incompatible
single-run validation requests such as direct rolling_origin execution.
Wrapper preflight covers only the closed config/data matrix documented in the
configuration guide.
For PipelineRunFailure, the CLI prints the concrete failed run directory,
manifest path, and stage name when available so the partial run can be
inspected immediately. --traceback still shows the original underlying
exception because it is preserved through __cause__.
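The chaining behaviour described above relies on standard Python exception chaining. The sketch below shows the general pattern with a stand-in class; PipelineRunFailureSketch and run_stage are hypothetical names and do not reproduce the package's own implementation.

```python
class PipelineRunFailureSketch(Exception):
    """Stand-in for a wrapper error that carries run context."""

    def __init__(self, message, run_dir, stage_name=None):
        super().__init__(message)
        self.run_dir = run_dir
        self.stage_name = stage_name


def run_stage(stage_name, run_dir, stage_callable):
    """Run one stage and wrap any failure without losing the original error."""
    try:
        return stage_callable()
    except Exception as exc:
        # "raise ... from exc" stores the original exception on __cause__,
        # so a full traceback can still show the underlying failure.
        raise PipelineRunFailureSketch(
            f"stage {stage_name} failed in {run_dir}", run_dir, stage_name
        ) from exc
```

A caller that catches the wrapper can report run_dir and stage_name first, then walk __cause__ when a full traceback is requested.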
Validation strategy restrictions
The CLI executes a single pipeline run. Configs with
validation.strategy: rolling_origin cannot be run directly from the CLI
because they require multiple sequential runs. Use the
Python API for rolling-origin workflows.
Configs with strategy: none or strategy: blocked_tail work directly from
the CLI.
meridian-tools demo
Run one of the bundled reference demos or list available demos.
Bundled demo name to execute. One of: timeseries, geo_panel.
| Argument | Required | Default | Description |
| --- | --- | --- | --- |
| --list | No | false | List supported demos and exit. Cannot combine with a demo name. |
| --output-dir | No | runs/demos/ (source checkout) or ./runs/demos/ (installed) | Override the output root directory. |
| --run-name | No | None (uses project.name from the demo config) | Optional run name override. |
| --traceback | No | false | Show the full Python traceback on failure. |
Examples
# List available demos
meridian-tools demo --list
# Run the timeseries demo
meridian-tools demo timeseries
# Run with a custom output directory
meridian-tools demo geo_panel --output-dir sandbox/demo-output
# Run with a custom name
meridian-tools demo timeseries --run-name demo-review-q2
Available demos
| Name | Description |
| --- | --- |
| timeseries | National timeseries demo using bundled reference data. |
| geo_panel | Geo-panel demo using bundled reference data. |
Both demos exercise the full staged pipeline including response curves and
optimisation.
Lightweight import
The CLI is designed for fast startup. Running meridian-tools --help or
meridian-tools demo --list does not import TensorFlow, NumPy, Meridian,
or ArviZ. Heavy imports are deferred until pipeline execution begins.
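The deferred-import pattern can be sketched as follows. This is a generic illustration, not the package's CLI code: the parser layout is simplified, and the standard-library statistics module stands in for a heavy dependency such as TensorFlow.

```python
import argparse
import importlib

DEMOS = ("timeseries", "geo_panel")


def main(argv=None):
    """Parse arguments first; import heavy modules only when a run starts."""
    parser = argparse.ArgumentParser(prog="lazy-cli-sketch")
    parser.add_argument("demo", nargs="?")
    parser.add_argument("--list", action="store_true")
    args = parser.parse_args(argv)

    if args.list:
        # Listing needs no heavy dependency, so nothing extra is imported.
        print("\n".join(DEMOS))
        return 0
    if args.demo not in DEMOS:
        parser.print_usage()
        return 1

    # Deferred import: the cost is paid only on the execution path.
    heavy = importlib.import_module("statistics")  # stand-in for a heavy module
    print(f"running {args.demo} using {heavy.__name__}")
    return 0
```

Keeping the import inside the execution branch means help and listing commands return before any large library is loaded.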
Entrypoints
The primary CLI entrypoint is the console script registered in
pyproject.toml:
Keyword arguments forwarded directly to Meridian ModelSpec(**kwargs).
Supported kwargs keys include any argument accepted by Meridian’s ModelSpec
constructor: max_lag, media_prior_type, holdout_id, etc. If holdout_id
is present, the run is treated as an authored-holdout validation run.
Array-valued keys (holdout_id, control_population_scaling_id,
non_media_population_scaling_id, rf_roi_calibration_period,
roi_calibration_period) are converted to NumPy arrays at runtime.
fit
| Field | Type | Default | Constraint | Description |
| --- | --- | --- | --- | --- |
| sample_prior_draws | PositiveInt \| null | null | >0 if set | Number of prior predictive draws. null skips prior sampling. |
| n_chains | PositiveInt \| list[PositiveInt] | 4 | >0 | Number of MCMC chains. |
| n_adapt | PositiveInt | 500 | >0 | Adaptation steps per chain. |
| n_burnin | PositiveInt | 500 | >0 | Burn-in steps per chain. |
| n_keep | PositiveInt | 1000 | >0 | Posterior samples to retain per chain. |
| seed | int \| list[int] \| null | null | — | RNG seed for reproducibility. |
| max_tree_depth | PositiveInt | 10 | >0 | NUTS maximum tree depth. |
| max_energy_diff | float | 500.0 | — | NUTS maximum energy difference. |
| unrolled_leapfrog_steps | PositiveInt | 1 | >0 | NUTS unrolled leapfrog steps. |
| parallel_iterations | PositiveInt | 10 | >0 | TensorFlow parallel iterations. |
validation
| Field | Type | Default | Constraint | Description |
| --- | --- | --- | --- | --- |
| strategy | "none" \| "blocked_tail" \| "rolling_origin" | "none" | — | Validation strategy. |
| holdout_size | PositiveInt \| null | null | Required for blocked_tail | Number of tail time periods to hold out. |
| initial_train_size | PositiveInt \| null | null | Required for rolling_origin | Initial training window size. |
| test_size | PositiveInt \| null | null | Required for rolling_origin | Test window size per split. |
| step_size | PositiveInt \| null | null | Must equal test_size | Step between rolling splits. Defaults to test_size. |
| max_splits | PositiveInt \| null | null | >=2 if set | Maximum number of rolling splits. |
Cross-field validation rules
strategy: none rejects all holdout and rolling-origin parameters.
strategy: rolling_origin requires initial_train_size and test_size, rejects holdout_size.
holdout_size without an explicit strategy is rejected (legacy shorthand removed).
Rolling-origin parameters without strategy: rolling_origin are rejected.
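The rules above, together with the constraints in the validation table, can be expressed as one small check. This is a sketch over a plain dictionary, not the package's Pydantic model; the function name and error wording are hypothetical, and a missing strategy key is treated as "no explicit strategy".

```python
ROLLING_FIELDS = ("initial_train_size", "test_size", "step_size", "max_splits")


def validation_errors(section):
    """Return human-readable rule violations for one validation mapping."""
    strategy = section.get("strategy")  # None means no strategy was authored
    holdout = section.get("holdout_size")
    rolling = {k: section[k] for k in ROLLING_FIELDS if section.get(k) is not None}
    errors = []

    if strategy is None and holdout is not None:
        errors.append("holdout_size requires an explicit strategy")
    if strategy == "none" and holdout is not None:
        errors.append("strategy: none rejects holdout_size")
    if strategy != "rolling_origin" and rolling:
        errors.append(f"{sorted(rolling)} require strategy: rolling_origin")
    if strategy == "blocked_tail" and holdout is None:
        errors.append("strategy: blocked_tail requires holdout_size")
    if strategy == "rolling_origin":
        if holdout is not None:
            errors.append("strategy: rolling_origin rejects holdout_size")
        for name in ("initial_train_size", "test_size"):
            if name not in rolling:
                errors.append(f"strategy: rolling_origin requires {name}")
        step, test = rolling.get("step_size"), rolling.get("test_size")
        if step is not None and test is not None and step != test:
            errors.append("step_size must equal test_size")
        if rolling.get("max_splits") is not None and rolling["max_splits"] < 2:
            errors.append("max_splits must be >= 2 if set")
    return errors
```

An empty list means the mapping passes these cross-field rules; type and positivity checks are left to the schema itself.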
exports
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| use_kpi | bool | false | Use KPI-based metrics in Meridian analysis surfaces. |
| batch_size | PositiveInt | 1000 | Batch size for Meridian Analyzer computations. |
| export_predictive_accuracy | bool | true | Write predictive_accuracy.csv. |
| export_review_summary | bool | true | Write review_summary.json. |
| export_model_selection | bool | true | Write LOO/WAIC outputs (when compatible). |
| export_plots | bool | true | Write PNG plot artefacts in each stage. |
response_curves
Optional section. If omitted or null, the response curves stage is skipped.
| Field | Type | Default | Constraint | Description |
| --- | --- | --- | --- | --- |
| spend_multipliers | list[float] | required | Non-empty, all >=0 | Spend multiplier grid for response curve computation. |
| use_posterior | bool | true | — | Use posterior (vs prior) for response curves. |
| by_reach | bool | true | — | Compute reach-based response curves. |
| use_optimal_frequency | bool | false | — | Use optimal frequency in computation. |
| confidence_level | float | 0.9 | 0 < x < 1 | Confidence level for credible intervals. |
optimisation
Optional section. If omitted or null, the optimisation stage is skipped.
| Field | Type | Default | Constraint | Description |
| --- | --- | --- | --- | --- |
| start_date | str | required | ISO YYYY-MM-DD | Start of the optimisation window. |
| end_date | str | required | ISO YYYY-MM-DD, >= start_date | End of the optimisation window. |
| budget | OptimisationBudgetConfig | required | — | Budget specification (see below). |
| use_posterior | bool | true | — | Use posterior (vs prior) for optimisation. |
| use_optimal_frequency | bool | true | — | Use optimal frequency in optimisation. |
| confidence_level | float | 0.9 | 0 < x < 1 | Confidence level for credible intervals. |
optimisation.budget
| Field | Type | Default | Constraint | Description |
| --- | --- | --- | --- | --- |
| mode | "fixed_total" \| "relative_reference_window_total" | required | — | Budget mode. |
| value | PositiveFloat | required | >0 | Budget value. Absolute for fixed_total, multiplier for relative_reference_window_total. |
When mode: relative_reference_window_total, the effective budget is
value × total_spend_in_reference_window. The reference window is defined by
start_date and end_date.
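The budget arithmetic can be illustrated with a short sketch. The function name and the spend_by_date mapping (ISO date to total spend across channels) are hypothetical, and the sketch assumes the reference window includes both start_date and end_date.

```python
from datetime import date


def effective_budget(mode, value, spend_by_date, start_date, end_date):
    """Resolve the budget for either documented mode."""
    if value <= 0:
        raise ValueError("value must be > 0")
    if mode == "fixed_total":
        return float(value)  # value is already an absolute budget
    if mode == "relative_reference_window_total":
        start = date.fromisoformat(start_date)
        end = date.fromisoformat(end_date)
        # Assumption: both window endpoints are inclusive.
        window_total = sum(
            spend
            for day, spend in spend_by_date.items()
            if start <= date.fromisoformat(day) <= end
        )
        return value * window_total
    raise ValueError(f"unknown budget mode: {mode}")


weekly_spend = {"2026-01-05": 100.0, "2026-01-12": 150.0, "2026-01-19": 50.0}
budget = effective_budget(
    "relative_reference_window_total", 1.5, weekly_spend, "2026-01-05", "2026-01-12"
)
```

With these illustrative numbers the window total is 250.0, so a multiplier of 1.5 gives an effective budget of 375.0.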
Manifest schema reference
The run_manifest.json file is the source of truth for every
meridian-tools run. It lives at the root of the run directory and records
identity, timing, versions, overall status, top-level artefact index, and
per-stage records.
Current version
The current manifest version is 3. Versions 0, 1, and 2 are supported for
backward compatibility when loading older run directories.
Top-level fields
| Field | Type | Description |
| --- | --- | --- |
| manifest_version | int | Schema version (0, 1, 2, or 3). |
| run_name | str | Human-readable run name. |
| config_path | str | Path to the source YAML used for this run. For refresh runs this points to the source run’s archived config.source.yaml. |
| output_dir | str | Path to the run directory. |
| started_at | str | UTC ISO-8601 timestamp when the run began. |
| status | str | Overall run status: "running", "completed", or "failed". |
| finished_at | str \| null | UTC ISO-8601 timestamp when the run finished. null while running. |
| meridian_tools_version | str | Version of meridian-tools that produced the run. |
| meridian_version | str \| null | Version of Google Meridian used. null if not yet recorded. |
| artifacts | dict[str, str] | Top-level artefact index. Key artefacts from stages are promoted here for quick lookup. |
| stages | list[StageRecord] | Ordered list of pipeline stage records (including skipped and failed stages). |
Top-level artifacts index
The runner promotes key artefacts into the top-level artifacts dictionary
after each stage completes. Promoted artefact names include:
This index provides flat access to important artefacts without walking the
stages array.
StageRecord fields
Each entry in the stages array represents one pipeline stage. Stages can
have any of four statuses: "running", "completed", "skipped", or
"failed".
| Field | Type | Description |
| --- | --- | --- |
| name | str | Stage identifier (for example, "00_run_metadata", "20_model_fit"). |
| status | str | Stage status: "running", "completed", "skipped", or "failed". |
| started_at | str \| null | UTC ISO-8601 timestamp when the stage began. |
| finished_at | str \| null | UTC ISO-8601 timestamp when the stage finished. |
| elapsed_seconds | float \| null | Wall-clock seconds for stage execution. |
| message | str \| null | Human-readable message. Present for skipped stages (reason) and failed stages (error). |
| artifacts | dict[str, str] | Map of artefact names to relative file paths. Empty for skipped stages. |
Artefact path convention
All artefact paths in the manifest are relative to the run directory. This
makes run directories portable across machines and file systems. When you load
a run record through load_run_record, the lifecycle layer resolves relative
paths to absolute paths against the run directory.
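Resolving a relative artefact path safely can be sketched as below. This is not the lifecycle layer's code: resolve_artifact is a hypothetical helper, and ValueError stands in for whatever error the package raises when a path escapes the run directory.

```python
from pathlib import Path


def resolve_artifact(run_dir, relative_path):
    """Resolve a manifest path and reject anything outside the run directory."""
    root = Path(run_dir).resolve()
    candidate = (root / relative_path).resolve()
    # After resolution, ".." segments and absolute paths are normalised,
    # so a simple ancestry check detects traversal.
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"artefact path escapes the run directory: {relative_path}")
    return candidate
```

Because the check runs on the resolved path, an entry such as ../other_run/config.yaml is rejected even though it is syntactically relative.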
All seven stages are always recorded in execution order. Stages that do not
apply to a given run are recorded with status: "skipped".
| Stage name | Number | Skippable | Description |
| --- | --- | --- | --- |
| 00_run_metadata | 00 | No | Config archival and input-data provenance capture. |
| 10_validation | 10 | Yes | Validation spec (skipped when no validation applies). |
| 20_model_fit | 20 | No | Meridian model fitting. |
| 30_model_assessment | 30 | No | Diagnostics, model selection. |
| 40_decomposition | 40 | No | Media decomposition metrics. |
| 60_response_curves | 60 | Yes | Response curves (skipped when the config section is absent). |
| 70_optimisation | 70 | Yes | Budget optimisation (skipped when the config section is absent). |
The numbering gap at 50 is intentional, reserving space for future stages.
Required artefacts
The lifecycle layer requires the following top-level artefacts to be present
in the manifest for a run to be loadable:
config_source (promoted from 00_run_metadata)
config_resolved (promoted from 00_run_metadata)
input_data_provenance (promoted from 00_run_metadata) for manifest
version 3 runs
These are enforced by _require_manifest_artifact in load_run_record. If a
required entry is missing, a LifecycleError is raised.
The diagnostics_bundle artefact is treated as optional by the lifecycle
loader. If it is absent from the manifest, RunRecord.diagnostics_bundle_path
is None. However, diagnostics_bundle is listed in
REQUIRED_MANIFEST_ARTIFACTS and validated at run completion time — so new
runs always produce it, but older or partial runs can still be loaded without
it.
Input-data provenance payload
Manifest version 3 introduces 00_run_metadata/input_data_provenance.json.
This file records the pinned Phase 09 input-data contract:
provenance_version
authored_path
resolved_path
sha256
size_bytes
mtime_utc
row_count
column_count
ordered_columns
The lifecycle compare surface uses these fields to distinguish real dataset
changes from older runs whose manifests predate provenance capture.
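A payload with these fields could be assembled from a CSV file roughly as follows. This is a sketch under stated assumptions, not the package's implementation: the function name, the provenance_version value, the ISO format for mtime_utc, and the choice to count data rows without the header are all assumptions.

```python
import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def input_data_provenance(authored_path, base_dir="."):
    """Build a provenance dictionary for one CSV input file."""
    resolved = (Path(base_dir) / authored_path).resolve()
    raw = resolved.read_bytes()
    with resolved.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    header = rows[0] if rows else []
    stat = resolved.stat()
    return {
        "provenance_version": 1,  # assumed value
        "authored_path": str(authored_path),
        "resolved_path": str(resolved),
        "sha256": hashlib.sha256(raw).hexdigest(),
        "size_bytes": stat.st_size,
        "mtime_utc": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
        "row_count": max(len(rows) - 1, 0),  # assumed: header row excluded
        "column_count": len(header),
        "ordered_columns": header,
    }
```

Comparing sha256 and ordered_columns between two runs is enough to tell a changed dataset from an unchanged one, which is the kind of distinction the compare surface draws.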
Example manifest
{
  "manifest_version": 3,
  "run_name": "my-project_blocked_tail",
  "config_path": "/workspace/configs/project.yml",
  "output_dir": "/workspace/runs/my-project_blocked_tail_20260402_073500",
  "started_at": "2026-04-02T07:35:00+00:00",
  "status": "completed",
  "finished_at": "2026-04-02T07:42:15+00:00",
  "meridian_tools_version": "0.3.0",
  "meridian_version": "1.5.3",
  "artifacts": {
    "config_source": "00_run_metadata/config.source.yaml",
    "config_resolved": "00_run_metadata/config.resolved.yaml",
    "input_data_provenance": "00_run_metadata/input_data_provenance.json",
    "validation_spec": "10_validation/validation_spec.json",
    "meridian_model": "20_model_fit/meridian_model.binpb",
    "diagnostics_bundle": "30_model_assessment/diagnostics_bundle.json",
    "model_results_summary": "30_model_assessment/model_results_summary.html",
    "summary_metrics_csv": "40_decomposition/summary_metrics.csv",
    "summary_metrics_nc": "40_decomposition/summary_metrics.nc"
  },
  "stages": [
    {
      "name": "00_run_metadata",
      "status": "completed",
      "started_at": "2026-04-02T07:35:00+00:00",
      "finished_at": "2026-04-02T07:35:01+00:00",
      "elapsed_seconds": 0.5,
      "message": null,
      "artifacts": {
        "config_source": "00_run_metadata/config.source.yaml",
        "config_resolved": "00_run_metadata/config.resolved.yaml",
        "input_data_provenance": "00_run_metadata/input_data_provenance.json"
      }
    },
    {
      "name": "10_validation",
      "status": "completed",
      "started_at": "2026-04-02T07:35:01+00:00",
      "finished_at": "2026-04-02T07:35:01+00:00",
      "elapsed_seconds": 0.1,
      "message": null,
      "artifacts": {
        "validation_spec": "10_validation/validation_spec.json"
      }
    },
    {
      "name": "20_model_fit",
      "status": "completed",
      "started_at": "2026-04-02T07:35:01+00:00",
      "finished_at": "2026-04-02T07:40:30+00:00",
      "elapsed_seconds": 329.0,
      "message": null,
      "artifacts": {
        "meridian_model": "20_model_fit/meridian_model.binpb",
        "fit_metadata": "20_model_fit/fit_metadata.json"
      }
    },
    {
      "name": "30_model_assessment",
      "status": "completed",
      "started_at": "2026-04-02T07:40:30+00:00",
      "finished_at": "2026-04-02T07:41:00+00:00",
      "elapsed_seconds": 30.1,
      "message": null,
      "artifacts": {
        "diagnostics_bundle": "30_model_assessment/diagnostics_bundle.json",
        "review_summary": "30_model_assessment/review_summary.json",
        "model_results_summary": "30_model_assessment/model_results_summary.html",
        "model_selection_status": "30_model_assessment/model_selection_status.json"
      }
    },
    {
      "name": "40_decomposition",
      "status": "completed",
      "started_at": "2026-04-02T07:41:00+00:00",
      "finished_at": "2026-04-02T07:42:00+00:00",
      "elapsed_seconds": 60.0,
      "message": null,
      "artifacts": {
        "summary_metrics_nc": "40_decomposition/summary_metrics.nc",
        "summary_metrics_csv": "40_decomposition/summary_metrics.csv"
      }
    },
    {
      "name": "60_response_curves",
      "status": "skipped",
      "started_at": "2026-04-02T07:42:00+00:00",
      "finished_at": "2026-04-02T07:42:00+00:00",
      "elapsed_seconds": 0.0,
      "message": "No `response_curves` section was authored in the YAML config.",
      "artifacts": {}
    },
    {
      "name": "70_optimisation",
      "status": "skipped",
      "started_at": "2026-04-02T07:42:00+00:00",
      "finished_at": "2026-04-02T07:42:00+00:00",
      "elapsed_seconds": 0.0,
      "message": "No `optimisation` section was authored in the YAML config.",
      "artifacts": {}
    }
  ]
}
Version history
Version 3 (current)
Added input_data_provenance.json and made provenance available to lifecycle
loading and compare surfaces.
Version 2
Added export_plots support, top-level artifacts index, status field,
config_path, output_dir, and per-stage status, elapsed_seconds, and
message fields.
Version 1
Added meridian_version field and response_curves / optimisation stages.
Version 0
Initial manifest schema with core stages and artefact tracking.
All four versions are supported by RunManifest.from_dict. Missing fields in
older versions are filled with defaults.
Output schema reference
This page documents the complete run directory layout produced by
meridian-tools. Every successful pipeline run creates a timestamped
directory containing the artefacts described below.
Run directory structure
<run_name>_<YYYYMMDD_HHMMSS>/
│
├── run_manifest.json  # Source of truth for the run
│
├── 00_run_metadata/
│   ├── config.source.yaml  # Verbatim copy of the authored YAML
│   ├── config.resolved.yaml  # YAML after path resolution
│   └── input_data_provenance.json  # Pinned source/resolution/hash metadata
│
├── 10_validation/  # Only for validation-aware runs
│   └── validation_spec.json  # Validation provenance record
│
├── 20_model_fit/
│   ├── meridian_model.binpb  # Serialised Meridian model
│   └── fit_metadata.json  # Fit settings and Meridian version
│
├── 30_model_assessment/
│   ├── diagnostics_bundle.json  # Diagnostics export manifest
│   ├── predictive_accuracy.csv  # Per-observation accuracy metrics
│   ├── review_summary.json  # Meridian review battery results
│   ├── model_results_summary.html  # Meridian HTML summary report
│   ├── plots/  # When export_plots: true
│   │   ├── model_fit.png
│   │   └── rhat_boxplot.png
│   │
│   │   # Model selection outputs (compatible runs):
│   ├── loo_summary.json  # LOO summary statistics
│   ├── waic_summary.json  # WAIC summary statistics
│   ├── loo_pointwise.csv  # Per-observation LOO + Pareto k
│   ├── waic_pointwise.csv  # Per-observation WAIC
│   └── model_comparison.csv  # Ranked comparison table
│   │
│   │   # Model selection status (incompatible runs):
│   └── model_selection_status.json  # Reason code for unavailability
│
├── 40_decomposition/
│   ├── summary_metrics.nc  # NetCDF decomposition dataset
│   ├── summary_metrics.csv  # Tabular decomposition
│   └── plots/  # When export_plots: true
│       ├── channel_contribution_area_chart.png
│       ├── contribution_waterfall_chart.png
│       ├── spend_vs_contribution_chart.png
│       └── roi_bar_chart.png
│
├── 60_response_curves/  # Only when response_curves configured
│   ├── response_curves.nc  # NetCDF response curve dataset
│   ├── response_curves.csv  # Tabular response curves
│   └── plots/  # When export_plots: true
│       └── response_curves_plot.png
│
└── 70_optimisation/  # Only when optimisation configured
    ├── optimisation_summary.html  # Meridian optimisation HTML report
    ├── optimised_data.nc  # Optimised allocation (NetCDF)
    ├── optimised_data.csv  # Optimised allocation (CSV)
    ├── nonoptimised_data.nc  # Baseline allocation (NetCDF)
    ├── nonoptimised_data.csv  # Baseline allocation (CSV)
    ├── optimisation_grid.csv  # Full optimisation grid
    └── plots/  # When export_plots: true
        ├── incremental_outcome_delta_plot.png
        ├── budget_allocation_optimised_plot.png
        ├── budget_allocation_nonoptimised_plot.png
        ├── spend_delta_plot.png
        └── optimisation_response_curves_plot.png
Stage details
00_run_metadata
Always present. Created first.
| Artefact | Format | Description |
| --- | --- | --- |
| config.source.yaml | YAML | Verbatim copy of the source config for this run. On refresh, this is copied from the source run’s archived config.source.yaml. |
| config.resolved.yaml | YAML | Config after relative path resolution. Does not include runtime-only fields (output_dir, run_name). |
Serialised Meridian model (requires google-meridian[schema]).
fit_metadata.json
JSON
Records FitConfig values and Meridian version.
30_model_assessment
Always present. Content varies by compatibility.
| Artefact | Format | Condition | Description |
| --- | --- | --- | --- |
| diagnostics_bundle.json | JSON | Always | Diagnostics export manifest with status of each sub-export. |
| predictive_accuracy.csv | CSV | export_predictive_accuracy: true | Predictive accuracy per observation. |
| review_summary.json | JSON | export_review_summary: true | Meridian review battery results. |
| model_results_summary.html | HTML | Always | Meridian HTML model summary. |
| plots/model_fit.png | PNG | export_plots: true | Model fit visualisation. |
| plots/rhat_boxplot.png | PNG | export_plots: true | R-hat convergence diagnostic boxplot. |
| loo_summary.json | JSON | Compatible + export_model_selection: true | LOO summary. |
| waic_summary.json | JSON | Compatible + export_model_selection: true | WAIC summary. |
| loo_pointwise.csv | CSV | Compatible + export_model_selection: true | Per-observation LOO values. |
| waic_pointwise.csv | CSV | Compatible + export_model_selection: true | Per-observation WAIC values. |
| model_comparison.csv | CSV | Compatible + export_model_selection: true | Ranked model comparison. |
| model_selection_status.json | JSON | Incompatible + export_model_selection: true | Reason for unavailability. |
40_decomposition
Always present.
| Artefact | Format | Description |
| --- | --- | --- |
| summary_metrics.nc | NetCDF | Full decomposition dataset with coordinates. |
| summary_metrics.csv | CSV | Flattened tabular decomposition. |
| plots/channel_contribution_area_chart.png | PNG | Channel contribution over time. |
| plots/contribution_waterfall_chart.png | PNG | Contribution waterfall breakdown. |
| plots/spend_vs_contribution_chart.png | PNG | Spend vs. contribution scatter. |
| plots/roi_bar_chart.png | PNG | ROI by channel bar chart. |
60_response_curves
Present only when the response_curves YAML section is configured.
| Artefact | Format | Description |
| --- | --- | --- |
| response_curves.nc | NetCDF | Response curve dataset across spend multipliers. |
| response_curves.csv | CSV | Flattened tabular response curves. |
| plots/response_curves_plot.png | PNG | Response curve visualisation. |
70_optimisation
Present only when the optimisation YAML section is configured.
| Artefact | Format | Description |
| --- | --- | --- |
| optimisation_summary.html | HTML | Meridian optimisation summary report. |
| optimised_data.nc | NetCDF | Optimised budget allocation. |
| optimised_data.csv | CSV | Tabular optimised allocation. |
| nonoptimised_data.nc | NetCDF | Baseline (non-optimised) allocation. |
| nonoptimised_data.csv | CSV | Tabular baseline allocation. |
| optimisation_grid.csv | CSV | Full optimisation grid dataset. |
| plots/incremental_outcome_delta_plot.png | PNG | Incremental outcome delta. |
| plots/budget_allocation_optimised_plot.png | PNG | Optimised allocation chart. |
| plots/budget_allocation_nonoptimised_plot.png | PNG | Baseline allocation chart. |
| plots/spend_delta_plot.png | PNG | Spend delta between optimised and baseline. |
| plots/optimisation_response_curves_plot.png | PNG | Optimisation response curves. |
Reading order for analysts
For a quick assessment of a completed run:
run_manifest.json — run identity, timing, stage completion
00_run_metadata/config.source.yaml — what was authored
00_run_metadata/input_data_provenance.json — dataset identity and shape
30_model_assessment/diagnostics_bundle.json — diagnostics export state
30_model_assessment/model_results_summary.html — visual model summary
40_decomposition/summary_metrics.csv — easiest tabular output to inspect
For model selection:
30_model_assessment/loo_summary.json or model_selection_status.json
For scenario analysis:
60_response_curves/response_curves.csv
70_optimisation/optimisation_summary.html
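The first step of the reading order can also be scripted. The sketch below uses only the standard library and the manifest fields shown in this reference (stages, name, status, elapsed_seconds); the helper name summarise_stages is a hypothetical illustration and is not part of meridian-tools.

```python
import json
from pathlib import Path


def summarise_stages(run_dir):
    """Return (name, status, elapsed_seconds) for each stage in run_manifest.json."""
    manifest_path = Path(run_dir) / "run_manifest.json"
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    return [
        (stage["name"], stage["status"], stage.get("elapsed_seconds"))
        for stage in manifest.get("stages", [])
    ]
```

Printing the returned tuples gives a quick view of which stages completed, which were skipped, and how long each one took.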
Validation spec schema reference
The validation_spec.json artefact is written to 10_validation/ for every
validation-aware pipeline run. It records the concrete validation provenance
for that specific run, including the holdout strategy, split geometry, and
date windows.
Fields
| Field | Type | Description |
| --- | --- | --- |
| mode | "validation" \| "final_fit" | Whether this is a validation split or the final production fit. |
The actual holdout mask array (boolean NumPy array) is not stored in
validation_spec.json because it can be large for geo-panel models
(n_geos × n_times). Only its holdout_shape is recorded. The mask is
injected into the Meridian model at runtime and can be reconstructed from
train_indices, test_indices, and the data geometry.
validation-execution contract checks for incompatible single-run validation combinations
a narrow wrapper-owned config/data preflight over the resolved input file and authored column mapping
The wrapper-owned preflight checks exactly:
resolved data.path exists and is a regular file
the CSV header row can be read
the parsed header is non-empty
no parsed header cell is blank after trimming whitespace
every authored scalar entry in data.coord_to_columns exists in the header
every authored list member in data.coord_to_columns exists in the header
every authored key in media_to_channel, media_spend_to_channel, reach_to_channel, frequency_to_channel, rf_spend_to_channel, organic_reach_to_channel, and organic_frequency_to_channel exists in the header
authored list-valued coord families are non-empty
authored mapping fields above are non-empty
supported media/RF family groups are complete when authored
Header matching is exact and case-sensitive. Anything outside this closed
matrix remains Meridian-owned validation.
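The header-level part of these checks can be sketched with the standard csv module. This is an illustration under stated assumptions, not the wrapper's implementation: HeaderPreflightError is a hypothetical stand-in for ConfigPreflightError, check_header is a hypothetical name, and required_columns stands for the column names collected from the authored mappings.

```python
import csv
from pathlib import Path


class HeaderPreflightError(ValueError):
    """Hypothetical stand-in for ConfigPreflightError."""


def check_header(path, required_columns):
    """Validate the CSV header and return it as a list of column names."""
    file_path = Path(path)
    if not file_path.is_file():
        raise HeaderPreflightError(f"{file_path} is missing or not a regular file")
    with file_path.open(newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), None)
    if not header:
        raise HeaderPreflightError("CSV header row is empty or unreadable")
    if any(not cell.strip() for cell in header):
        raise HeaderPreflightError("CSV header contains a blank cell")
    # Exact, case-sensitive membership test against the parsed header.
    missing = [name for name in required_columns if name not in header]
    if missing:
        raise HeaderPreflightError(f"Columns missing from header: {missing}")
    return header
```

Because matching is exact, a header cell named Spend does not satisfy a mapping that refers to spend.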
Parameters:
run_config — A PipelineRunConfig specifying the execution config path, output directory, run name, optional validation spec, and optional source_config_path for metadata archival.
progress_callback — Optional callable invoked on stage lifecycle events. The callback receives keyword arguments:
    stage_name (str) — stage identifier.
    event (str) — one of "started", "completed", "skipped", or "failed".
    stage_index (int) — 1-based position in the pipeline.
    stage_count (int) — total number of stages.
    elapsed_seconds (float) — wall-clock time (present for "completed" and "failed" events).
    message (str) — human-readable detail (present for "skipped" and "failed" events).
Returns: A PipelineRunResult with the run directory and manifest path.
Raises:
RuntimeError if Meridian schema support is unavailable (checked at
preflight before the run directory is created).
RuntimeError if exports.export_plots is true but vl-convert-python
is not installed (also checked at preflight).
ValidationExecutionContractError if the requested single-run validation
execution path is incompatible with the authored config.
ConfigPreflightError if wrapper-owned config/data preflight fails before
run-directory creation.
PipelineRunFailure if any exception occurs after the dated run directory
already exists.
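A progress callback compatible with the keyword arguments listed under Parameters might look like the sketch below. The function name log_progress and the output format are hypothetical; the optional arguments default to None because they are documented as present only for some events, and the catch-all keyword parameter is a defensive assumption.

```python
def log_progress(*, stage_name, event, stage_index, stage_count,
                 elapsed_seconds=None, message=None, **_extra):
    """Format one stage lifecycle event as a single log line and print it."""
    line = f"[{stage_index}/{stage_count}] {stage_name}: {event}"
    if elapsed_seconds is not None:
        # Present for "completed" and "failed" events.
        line += f" ({elapsed_seconds:.1f}s)"
    if message:
        # Present for "skipped" and "failed" events.
        line += f" - {message}"
    print(line)
    return line
```

The function would be supplied as the progress_callback argument of run_pipeline.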
Disk locations for one completed meridian-tools run.
| Attribute | Type | Description |
| --- | --- | --- |
| run_dir | Path | Absolute path to the run directory. |
| manifest_path | Path | Absolute path to run_manifest.json. |
ValidationExecutionContractError
class ValidationExecutionContractError(ValueError)
Raised when the requested single-run validation execution path is incompatible
with the authored config. Current examples include direct rolling_origin
execution through run_pipeline(...) and combining
PipelineRunConfig.validation_spec with authored
model_spec.kwargs.holdout_id.
ConfigPreflightError
class ConfigPreflightError(ValueError)
Raised when the wrapper-owned Phase 10 preflight fails before run-directory
creation. This covers only the closed wrapper preflight boundary, not full
Meridian model validation.
PipelineRunFailure
class PipelineRunFailure(RuntimeError)
Raised when a run fails after the dated run directory already exists.
The original underlying exception is preserved via __cause__.
Build a blocked-tail holdout mask for Meridian’s holdout_id.
Returns a 1-D boolean mask for national data and a 2-D (n_geos, n_times)
mask when geo_index is provided. The last holdout_size time periods are
marked as True (held out).
Parameters:
time_index — Strictly increasing sequence of time period identifiers.
holdout_size — Number of tail periods to hold out. Must be positive
and less than the length of time_index.
geo_index — Optional sequence of geo identifiers. If provided, the
mask is broadcast across geos.
Returns: Boolean NumPy array.
Raises: ValueError for non-monotonic indices, undersized indices, or
impossible holdout sizes.
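The mask geometry described above can be illustrated without any dependencies. This sketch uses nested Python lists where the documented function returns a NumPy array, and the name blocked_tail_mask is a hypothetical stand-in rather than the package's own function.

```python
def blocked_tail_mask(time_index, holdout_size, geo_index=None):
    """Return a list mask with True for the last holdout_size time periods."""
    times = list(time_index)
    if any(later <= earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("time_index must be strictly increasing")
    if not 0 < holdout_size < len(times):
        raise ValueError("holdout_size must be positive and less than len(time_index)")
    first_held_out = len(times) - holdout_size
    row = [position >= first_held_out for position in range(len(times))]
    if geo_index is None:
        return row  # national data: 1-D mask
    # Geo-panel data: repeat the same time mask for every geo (n_geos x n_times).
    return [list(row) for _ in geo_index]
```

For four weekly periods and holdout_size=1, only the final period is marked as held out, and every geo shares that same row.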
Materialise concrete validation and final-fit run specs from one config.
For strategy: none, returns a plan with no validation runs and no
final-fit run. For blocked_tail or rolling_origin, returns one
ValidationRunSpec per split plus a final_fit_run spec that trains on
the full time axis with no holdout.
Parameters:
validation_config — A validated ValidationConfig instance.
time_index — Strictly increasing sequence of time period identifiers.
geo_index — Optional sequence of geo identifiers for geo-panel models.
One concrete validation or final-fit run derived from a split plan. Passed
to PipelineRunConfig.validation_spec to control a single pipeline
execution.
| Attribute | Type | Description |
| --- | --- | --- |
| mode | "validation" \| "final_fit" | Run mode. |
| strategy | str | Validation strategy. |
| split_label | str | Human-readable split identifier. |
| holdout_source | str | How the holdout mask was produced. |
| generated_holdout | bool | Whether the holdout was auto-generated. |
| holdout_id | np.ndarray \| None | Concrete holdout mask (immutable). |
| train_indices | tuple[int, ...] | Training time indices. |
| test_indices | tuple[int, ...] | Test time indices. |
| train_dates | tuple[str, ...] | Training date values. |
| test_dates | tuple[str, ...] | Test date values. |
| run_name_suffix | str | Suffix for the run directory name. |
Methods:
to_artifact_payload() — Returns the JSON-serialisable dictionary
written to validation_spec.json.
ValidationPlan
@dataclass(frozen=True)
class ValidationPlan
Concrete validation runs and the separate final-fit run for one config.
| Attribute | Type | Description |
| --- | --- | --- |
| validation_runs | tuple[ValidationRunSpec, ...] | One spec per validation split. |
| final_fit_run | ValidationRunSpec \| None | Full-sample final-fit spec. None for strategy: none. |
meridian_tools.exports
Helpers for manifest-backed Meridian export families.
For budget.mode: relative_reference_window_total, the effective budget is
computed as value × total_spend_in_reference_window using the model’s media
and RF spend data within the start_date–end_date window.
Parameters:
model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
optimisation_config — Optimisation settings from YAML.
exports_config — Export switches.
Returns: Dictionary mapping artefact names to file paths.
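As a worked illustration of the relative_reference_window_total arithmetic described above, the sketch below multiplies value by the spend summed over the reference window. The function name, the dictionary of per-period spend, the use of ISO date strings, and the inclusive window bounds are all assumptions made for the example; in the package the totals come from the model's media and RF spend data.

```python
def effective_budget(value, spend_by_period, start_date, end_date):
    """Return value multiplied by total spend inside [start_date, end_date]."""
    # ISO-8601 date strings compare correctly as plain strings.
    total_spend = sum(
        spend
        for period, spend in spend_by_period.items()
        if start_date <= period <= end_date
    )
    return value * total_spend


spend = {"2024-01-01": 100.0, "2024-01-08": 150.0, "2024-01-15": 250.0}
# Reference window covers the first two periods: total spend is 250.0.
budget = effective_budget(1.1, spend, "2024-01-01", "2024-01-08")
```

With value 1.1 the effective budget is about 275, a 10% increase over the 250 spent in the reference window.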
ensure_meridian_schema_support
def ensure_meridian_schema_support() -> Callable
Return Meridian’s schema serialiser or raise a stable runtime error.
Checks for meridian.schema.serde.meridian_serde.save_meridian. If the
import fails, raises RuntimeError with guidance to install
google-meridian[schema].
Returns: The save_meridian callable.
ensure_altair_png_support
def ensure_altair_png_support() -> Any
Return the Altair PNG backend or raise a stable runtime error.
Checks for vl_convert. If the import fails, raises RuntimeError with
guidance to install vl-convert-python.
Returns: The vl_convert module.
meridian_tools.diagnostics
Diagnostics extraction and export helpers for Meridian runs.
Write predictive accuracy, review summary, and bundle manifest to disk.
The bundle manifest (diagnostics_bundle.json) records the status of each
sub-export ("exported" or "disabled") along with the file name and
format. This provides a stable machine-readable contract for downstream
consumers.
When an export is disabled, any pre-existing file from a previous run at
the same path is removed to prevent stale data.
selected_geos — Not supported in current scope (raises ValueError).
selected_times — Not supported in current scope (raises ValueError).
batch_size — Batch size for Meridian analysis.
Returns: Dictionary mapping artefact names to file paths. Always
includes "diagnostics_bundle". Conditionally includes
"predictive_accuracy" and "review_summary".
Compute the pointwise log-likelihood dataset for a fitted Meridian model.
This function reconstructs the joint distribution from the posterior samples
and computes observation-level log-likelihood values. It handles both
geo-panel and national models.
The reconstruction recovers unsaved posterior parameters (e.g. geo
deviations, tau_g_excl_baseline) that Meridian does not persist to
InferenceData by default.
Parameters:
meridian_model — A fitted Meridian model with posterior samples and
a compatible posterior_sampler_callable.
Returns: An xarray Dataset with a log_likelihood variable.
Raises: ModelSelectionError if the model does not expose the required
internal reconstruction seams or lacks posterior samples.
Attach a log_likelihood group to a Meridian model’s InferenceData.
If the model’s InferenceData already has a non-empty log_likelihood
group, it is returned as-is (or the existing InferenceData is returned
for in_place=True).
Parameters:
meridian_model — A fitted Meridian model.
in_place — If True, mutates meridian_model.inference_data
directly. If False (default), returns a deep copy with the
log_likelihood group attached and leaves the original model unmodified.
Returns: An ArviZ InferenceData with a log_likelihood group.
Raises:
ModelSelectionError with reason_code="meridian_internal_seam_incompatible"
if the Meridian version lacks the required private reconstruction methods.
ModelSelectionError with reason_code="requires_fitted_meridian_model" if
the model has no posterior samples.
ModelSelectionError with reason_code="holdout_fit_unsupported" if the
model was fitted with a holdout mask.
The reconstruction accesses three private methods on Meridian’s
posterior_sampler_callable:
_get_joint_dist_unpinned
_prepare_latents_for_reconstruction
_reconstruct_posteriors
These are Meridian-internal and may change without notice. If any method is
missing, a ModelSelectionError with
reason_code="meridian_internal_seam_incompatible" is raised instead of
crashing. See the
Meridian integration notes for
details on this coupling boundary.
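A guard of this kind can be sketched as an attribute check that runs before any private method is called. The three method names and the reason code come from the text above; SeamError is a hypothetical stand-in for ModelSelectionError, and require_seams is an illustrative name rather than the package's own helper.

```python
REQUIRED_SEAMS = (
    "_get_joint_dist_unpinned",
    "_prepare_latents_for_reconstruction",
    "_reconstruct_posteriors",
)


class SeamError(RuntimeError):
    """Hypothetical stand-in for ModelSelectionError, carrying a reason code."""

    def __init__(self, message, reason_code):
        super().__init__(message)
        self.reason_code = reason_code


def require_seams(sampler):
    """Raise SeamError if the sampler lacks any required private method."""
    missing = [
        name for name in REQUIRED_SEAMS
        if not callable(getattr(sampler, name, None))
    ]
    if missing:
        raise SeamError(
            f"Sampler is missing private methods: {missing}",
            reason_code="meridian_internal_seam_incompatible",
        )
```

Checking up front turns an AttributeError deep inside the reconstruction into a single error with a stable reason code.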
meridian_tools.lifecycle
Post-run record management: loading, listing, comparing, and refreshing runs.
Module: meridian_tools.lifecycle
Functions
resolve_run_directory
def resolve_run_directory(path: str | Path) -> Path
Return the absolute resolved run directory for a run path or manifest path.
If path points to a file, it must be named run_manifest.json; the
function returns its parent directory. If path is a directory, it must
contain run_manifest.json.
Parameters:
path — Path to a run directory or to run_manifest.json directly.
Returns: Absolute Path to the run directory.
Raises: LifecycleError if the path does not exist, is an unexpected
file, or the directory does not contain run_manifest.json.
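The resolution rules above can be sketched with pathlib. This is an illustration of the documented behaviour, not the package source: RunPathError stands in for LifecycleError and resolve_run_dir is a hypothetical name.

```python
from pathlib import Path


class RunPathError(RuntimeError):
    """Hypothetical stand-in for LifecycleError."""


def resolve_run_dir(path):
    """Return the absolute run directory for a run path or manifest path."""
    candidate = Path(path).resolve()
    if candidate.is_file():
        # A file is accepted only if it is the manifest itself.
        if candidate.name != "run_manifest.json":
            raise RunPathError(f"Unexpected file: {candidate}")
        return candidate.parent
    if candidate.is_dir():
        if not (candidate / "run_manifest.json").is_file():
            raise RunPathError(f"No run_manifest.json in {candidate}")
        return candidate
    raise RunPathError(f"Path does not exist: {candidate}")
```

Accepting either form lets callers pass whichever path they have at hand.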
load_run_record
def load_run_record(path: str | Path) -> RunRecord
Load one run directory through the versioned lifecycle contract.
Resolves the run directory, parses the manifest, and resolves artefact
paths. Required artefacts (config_source, config_resolved) must be
present in the manifest and exist on disk. Manifest version 3 runs must also
include input_data_provenance. Optional artefacts (validation_spec,
diagnostics_bundle, model_selection_status) are resolved when present and
set to None when absent.
Parameters:
path — Path to a run directory or to run_manifest.json directly.
Returns: A validated RunRecord instance.
Raises: LifecycleError for missing required artefacts, malformed
manifests, artefact path traversal, or claimed-but-missing artefacts.
Discover direct child run directories under one output root.
Scans direct child directories of root for run_manifest.json files.
Returns records sorted by started_at (most recent first), with directory
name as a secondary sort key.
Parameters:
root — Directory to scan. Must be a directory, not a file.
Returns: List of RunRecord instances.
Raises: LifecycleError if root is not a directory or if any
discovered run has an invalid manifest.
Build a runtime refresh config from one stored run directory.
The execution config path points to the source run’s config.resolved.yaml.
The returned PipelineRunConfig.source_config_path preserves the source run’s
archived config.source.yaml so the refresh can re-copy the original YAML
into the new run metadata. The output directory defaults to the source run’s
parent directory (creating a sibling run). For validation runs, the
validation spec is reconstructed from the stored validation_spec.json.
Parameters:
path — Path to the run directory or manifest to refresh.
output_dir — Override the output directory (default: source parent).
run_name — Override the run name.
Returns: A PipelineRunConfig ready for run_pipeline.
Raises: LifecycleError if the source run cannot be loaded or if
authored-holdout refresh requirements are not met.
Compare two run records at the pinned metadata layer.
Loads both run records and compares run name, status, versions, validation
spec presence, diagnostics statuses, model selection availability, and
input-data provenance.
Parameters:
left — Path to the first run directory or manifest.
right — Path to the second run directory or manifest.
Returns: A pandas DataFrame with columns field, left, right,
status, and changed. Rows follow a fixed order:
| Row (field) | Description |
| --- | --- |
| run_name | Human-readable run name. |
| status | Overall run status. |
| meridian_tools_version | meridian-tools version. |
| meridian_version | Google Meridian version. |
| has_validation_spec | Whether a validation spec is present. |
| has_diagnostics_bundle | Whether a diagnostics bundle is present. |
| predictive_accuracy_status | Status from the diagnostics bundle. |
| review_summary_status | Status from the diagnostics bundle. |
| has_model_selection_outputs | Whether LOO/WAIC outputs are present. |
| model_selection_reason_code | Reason code if model selection is unavailable. |
| input_authored_path | YAML-owned data.path string. |
| input_resolved_path | Absolute runtime input path. |
| input_mtime_utc | Input file mtime. |
| input_sha256 | Input file SHA-256 digest. |
| input_size_bytes | Input file size in bytes. |
| input_row_count | Input row count. |
| input_column_count | Input column count. |
| input_ordered_columns | Input CSV column order. |
For provenance rows, status is "legacy_unknown" and changed is None
when either run predates manifest version 3 and therefore has no stored
provenance payload.
Raises: LifecycleError if either run cannot be loaded or if
diagnostics or model selection artefacts are malformed.
Classes
RunRecord
@dataclass(frozen=True)
class RunRecord
Resolved lifecycle view over one on-disk run directory.
| Attribute | Type | Description |
| --- | --- | --- |
| run_dir | Path | Absolute path to the run directory. |
| manifest_path | Path | Absolute path to run_manifest.json. |
| manifest | RunManifest | Parsed manifest with stages, timestamps, and versions. |
| config_source_path | Path | Absolute path to config.source.yaml. Always present. |
| config_resolved_path | Path | Absolute path to config.resolved.yaml. Always present. |
| input_data_provenance_path | Path \| None | Path to input_data_provenance.json. Required for manifest version 3 runs, otherwise None. |
| validation_spec_path | Path \| None | Path to validation_spec.json, or None if absent. |
| diagnostics_bundle_path | Path \| None | Path to diagnostics_bundle.json, or None if absent. |
| model_selection_status_path | Path \| None | Path to model_selection_status.json, or None if absent. |
Required attributes (config_source_path, config_resolved_path) are
always present. input_data_provenance_path is present for manifest version
3 runs. Other optional attributes are None when the corresponding artefact
was not produced by the run or is absent from the manifest.
Example:
from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

# Required: always available
print(record.config_source_path)
print(record.config_resolved_path)

# Optional: may be None
if record.diagnostics_bundle_path:
    print(f"Diagnostics: {record.diagnostics_bundle_path}")
if record.validation_spec_path:
    print(f"Validation spec: {record.validation_spec_path}")
LifecycleError
class LifecycleError(RuntimeError)
Raised when a run directory cannot be loaded through the lifecycle
contract. All lifecycle functions raise this exception type instead of
generic ValueError or RuntimeError.
meridian_tools.artifacts
Manifest and JSON helpers for run artefact management.
Module: meridian_tools.artifacts
Functions
write_json
def write_json(path: str | Path, payload: Any) -> None
Write a JSON-serialisable payload to disk with UTF-8 encoding and
2-space indentation. Creates parent directories if they do not exist.
Convert artefact paths to relative paths against run_dir so the manifest
stores portable references.
Parameters:
run_dir — The run directory root.
artifacts — Mapping of artefact names to file paths.
Returns: Dictionary mapping artefact names to relative path strings.
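The conversion can be sketched with pathlib's relative_to. The function name to_relative_paths is a hypothetical stand-in for the helper described above, and the choice of POSIX-style separators in the output is an assumption based on the paths shown in the manifest example.

```python
from pathlib import Path


def to_relative_paths(run_dir, artifacts):
    """Map artefact names to paths expressed relative to run_dir."""
    root = Path(run_dir)
    # relative_to raises ValueError if a path lies outside run_dir.
    return {
        name: Path(path).relative_to(root).as_posix()
        for name, path in artifacts.items()
    }
```

Storing "40_decomposition/summary_metrics.csv" rather than an absolute path means the manifest stays valid when the run directory is copied elsewhere.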
timestamp_utc
def timestamp_utc() -> str
Return the current time as a UTC ISO-8601 string with second precision.
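One way to produce a timestamp of this shape with the standard library is sketched below. The function name utc_timestamp_seconds is hypothetical, and the "+00:00" offset form is an assumption based on the timestamps shown in the manifest example.

```python
from datetime import datetime, timezone


def utc_timestamp_seconds():
    """Return the current UTC time as ISO-8601 text truncated to whole seconds."""
    now = datetime.now(timezone.utc).replace(microsecond=0)
    return now.isoformat()
```

Dropping microseconds keeps the strings short and consistent in length, for example 2026-04-02T07:41:00+00:00.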
Classes
RunManifest
@dataclass
class RunManifest
Machine-readable summary of one meridian-tools run.
| Attribute | Type | Default | Description |
| --- | --- | --- | --- |
| run_name | str | required | Human-readable run name. |
| config_path | Path | required | Path to the authored YAML config file. |
| output_dir | Path | required | Path to the run directory. |
| started_at | str | required | UTC ISO-8601 start timestamp. |
| manifest_version | int | CURRENT_MANIFEST_VERSION | Schema version (0, 1, 2, or 3). |
| status | str | "running" | Overall run status: "running", "completed", or "failed". |
| finished_at | str \| None | None | UTC ISO-8601 finish timestamp. None while the run is in progress. |
| meridian_tools_version | str | __version__ | Version of meridian-tools. |
| meridian_version | str \| None | None | Version of Google Meridian. |
| artifacts | dict[str, str] | {} | Top-level artefact index. Key artefacts from stages are promoted here. |
| stages | list[StageRecord] | [] | Ordered list of stage records (completed, skipped, and failed). |
Class methods:
from_dict(payload: Mapping[str, Any]) -> RunManifest — Deserialise
from a JSON-parsed dictionary. Supports manifest versions 0, 1, 2, and 3 with
default values for missing fields in older versions. Raises ValueError
for unsupported versions or missing required fields.
Instance methods:
to_dict() -> dict[str, Any] — Serialise to a JSON-compatible
dictionary.
StageRecord
@dataclass
class StageRecord
One pipeline stage entry in the run manifest.
| Attribute | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | required | Stage identifier (for example, "00_run_metadata"). |
| status | str | "pending" | Stage status: "pending", "running", "completed", "skipped", or "failed". |
| started_at | str \| None | None | UTC ISO-8601 start timestamp. |
| finished_at | str \| None | None | UTC ISO-8601 finish timestamp. |
| elapsed_seconds | float \| None | None | Wall-clock seconds for stage execution. |
| message | str \| None | None | Human-readable message (skip reason or error detail). |
| artifacts | dict[str, str] | {} | Map of artefact names to relative paths. Empty for skipped stages. |
Class methods:
from_dict(payload: Mapping[str, Any]) -> StageRecord — Deserialise
from a JSON-parsed dictionary. Raises ValueError if name is missing.
InputDataProvenance
@dataclass(frozen=True)
class InputDataProvenance
Pinned input-data provenance payload used by manifest version 3 runs.
| Attribute | Type | Default | Description |
| --- | --- | --- | --- |
| authored_path | str | required | Exact data.path string from the source YAML. |
| resolved_path | str | required | Absolute runtime path used for input loading. |
| sha256 | str | required | SHA-256 digest of the resolved input file. |
| size_bytes | int | required | Input file size in bytes. |
| mtime_utc | str | required | Input file modification time in UTC ISO-8601 format. |
| row_count | int | required | Number of CSV data rows. |
| column_count | int | required | Number of CSV columns. |
| ordered_columns | tuple[str, ...] | required | CSV header order. |
| provenance_version | int | INPUT_DATA_PROVENANCE_VERSION | Payload schema version. |
Class methods:
from_dict(payload: Mapping[str, Any]) -> InputDataProvenance —
Validates the exact pinned Phase 09 key set and types.
Instance methods:
to_dict() -> dict[str, Any] — Serialise to the exact JSON payload
written into input_data_provenance.json.
These artefact entries are validated at run completion time by the runner.
New runs must produce all four to complete successfully.
The lifecycle loader enforces config_source and config_resolved as
required for all supported manifests. It also enforces
input_data_provenance for manifest version 3 runs. diagnostics_bundle
remains optional, so older or partial runs can still be loaded without it.
Concepts
Background material on architecture, design decisions, and Meridian integration boundaries.
Pages
Architecture — meridian-tools is a companion package designed for agency teams that use Google Meridian as their client-facing MMM (Marketing Mix Modelling) engine. It provides a stricter, more reproducible workflow around Meridian without forking the upstream library.
Design decisions — This document records the key design decisions in meridian-tools and the reasoning behind them. It is intended for maintainers and contributors who need to understand why things are built the way they are.
Meridian integration — This document describes how meridian-tools integrates with Google Meridian, the boundaries of that integration, and the risks associated with different coupling levels.
Subsections of Concepts
Architecture
meridian-tools is a companion package designed for agency teams that use
Google Meridian as their client-facing MMM (Marketing Mix Modelling) engine. It
provides a stricter, more reproducible workflow around Meridian without forking
the upstream library.
Core philosophy
No forking — meridian-tools strictly wraps Meridian. It does not modify
Meridian’s internal code or model implementations.
Reproducibility — All runs are driven by typed YAML configurations,
so the full specification of a run is captured in one file and can be re-run.
Structured workflow — The package enforces a staged execution pipeline
(validation, model fit, assessment, decomposition, response curves,
optimisation).
Lifecycle management — Runs are treated as immutable artefacts with rich
metadata, allowing for easy comparison, refreshing, and storage.
Meridian and TensorFlow are never imported at module level in the configuration,
validation, or CLI layers, so lightweight operations do not pay the cost of
loading them:
| Operation | Imports loaded |
| --- | --- |
| meridian-tools --help | pydantic, yaml |
| load_yaml_config(path) | pydantic, yaml |
| build_validation_plan(...) | numpy |
| run_pipeline(...) | Everything (Meridian, TF, ArviZ, etc.) |
The __init__.py uses __getattr__-based lazy loading so that
import meridian_tools does not trigger heavy dependency imports.
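Module-level __getattr__ lazy loading (PEP 562) can be demonstrated in a self-contained way. In a real package the __getattr__ function would be defined directly in __init__.py; here a module object is built by hand so the sketch runs on its own. The helper name make_lazy_module and the math.sqrt stand-in for a heavy dependency are hypothetical.

```python
import importlib
import types


def make_lazy_module(name, lazy_attrs):
    """Build a module whose attributes are imported on first access.

    lazy_attrs maps attribute name -> name of the module that provides it.
    """
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails.
        try:
            provider = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(provider), attr)
        setattr(module, attr, value)  # cache so later lookups skip this hook
        return value

    module.__getattr__ = __getattr__
    return module


lazy = make_lazy_module("lazy_demo", {"sqrt": "math"})
loaded_before_access = "sqrt" in vars(lazy)
root_of_nine = lazy.sqrt(9.0)
loaded_after_access = "sqrt" in vars(lazy)
```

The provider module is imported only when the attribute is first touched, which is what keeps a bare package import cheap.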
Pipeline execution model
The runner executes stages sequentially. Each stage:
Creates a StageRecord and appends it to the in-memory manifest.
Calls the stage function, which returns a dict[str, Path] of artefacts.
Normalises artefact paths to be relative to the run directory.
Writes the updated manifest to disk.
This design means a crash mid-pipeline leaves a readable partial manifest on
disk. The last entry in the stages array is the last stage whose record was
written, and its status field shows whether that stage completed, was skipped,
or failed.
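The four steps above can be sketched as a loop that rewrites the manifest after every stage. This is a simplified illustration with plain dictionaries in place of RunManifest and StageRecord; run_stages is a hypothetical name, and skip and failure handling are deliberately omitted.

```python
import json
import time
from pathlib import Path


def run_stages(run_dir, stages):
    """Run (name, stage_fn) pairs in order, persisting the manifest after each."""
    run_dir = Path(run_dir)
    manifest_path = run_dir / "run_manifest.json"
    manifest = {"status": "running", "stages": []}
    for name, stage_fn in stages:
        # 1. Create the stage record and append it to the in-memory manifest.
        record = {"name": name, "status": "running", "artifacts": {}}
        manifest["stages"].append(record)
        started = time.monotonic()
        # 2. Call the stage function; it returns {artefact name: path}.
        artifacts = stage_fn(run_dir)
        # 3. Store artefact paths relative to the run directory.
        record["artifacts"] = {
            key: Path(path).relative_to(run_dir).as_posix()
            for key, path in artifacts.items()
        }
        record["elapsed_seconds"] = round(time.monotonic() - started, 3)
        record["status"] = "completed"
        # 4. Write the updated manifest to disk before the next stage starts.
        manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    manifest["status"] = "completed"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```

In this sketch, if a stage function raises, the manifest on disk still reflects the state after the previous stage, which is what makes a partially completed run inspectable.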
The numbering gap at 50 reserves space for future stages without renumbering.
Configuration architecture
The separation between authored YAML and runtime-only config is strict:
MeridianToolsConfig — Pydantic model for the YAML file. Owns project
metadata, data paths, model spec, fit settings, validation strategy, and
export switches.
PipelineRunConfig — Frozen dataclass for runtime options. Owns output
directory, run name, and concrete validation spec.
The runner writes two config copies to each run directory:
config.source.yaml — Verbatim copy of the input YAML.
config.resolved.yaml — After relative path resolution. Never includes
runtime-only fields.
Artefact path normalisation
All artefact paths in manifests are stored relative to the run directory
through normalize_artifact_paths. This makes run directories portable across
machines.
The lifecycle layer resolves them back to absolute paths at load time.
The private-API coupling is confined to log_likelihood.py and wrapped in
comprehensive error handling. See
Meridian integration for details.
Data flow
Input — A typed YAML file defines the entire run scope.
Initialisation — The runner resolves the config and creates a timestamped
run directory.
Execution — The pipeline steps through stages, maintaining a central
state dictionary with the fitted model and intermediate results.
Export — Each stage writes specific artefacts to disk within the run
directory.
Finalisation — The manifest is completed with status: "completed" and
finished_at, locking the run state.
Lifecycle — Downstream processes or analysts consume artefacts or use
lifecycle tools to compare, refresh, or audit runs.
Design decisions
This document records the key design decisions in meridian-tools and the
reasoning behind them. It is intended for maintainers and contributors who
need to understand why things are built the way they are.
No IID cross-validation
Decision: meridian-tools does not implement random-shuffle or naive k-fold
cross-validation.
Reasoning: MMM data is time series. Random IID splits break temporal
structure, leading to data leakage where future observations inform training
and past observations appear in the test set. This produces optimistic accuracy
estimates that do not reflect real-world forecasting performance.
The package provides two time-respecting alternatives:
Blocked tail — reserves the most recent observations as a single test
block.
Rolling origin — expanding-window forward-chaining that respects temporal
ordering at every split.
Non-overlapping rolling-origin test windows
Decision: step_size must equal test_size for rolling-origin splits.
Reasoning: Overlapping test windows would mean the same observation appears
in multiple test sets. This violates the independence assumption needed for
comparing validation scores across splits and complicates the interpretation of
aggregate metrics. Non-overlapping windows ensure each observation is evaluated
exactly once across the split plan.
Minimum two splits for rolling origin
Decision: build_rolling_origin_splits requires at least two splits.
Reasoning: A single rolling-origin split is functionally identical to a
blocked-tail holdout and provides no comparative signal. If your data only
supports one split, use blocked_tail instead — it communicates the intent
more clearly.
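The two rolling-origin rules above (step equal to test size, at least two splits) can be sketched as follows. This is a minimal illustration under assumed names: `rolling_origin_splits` and `initial_train` are hypothetical and do not reflect the real signature of build_rolling_origin_splits.

```python
def rolling_origin_splits(
    n_times: int, initial_train: int, test_size: int
) -> list[tuple[range, range]]:
    """Expanding-window splits whose test windows never overlap.

    Hypothetical sketch: the origin advances by exactly `test_size`,
    mirroring the step_size == test_size rule.
    """
    splits = []
    train_end = initial_train
    while train_end + test_size <= n_times:
        train = range(0, train_end)                    # expanding window
        test = range(train_end, train_end + test_size)
        splits.append((train, test))
        train_end += test_size                         # step_size == test_size
    if len(splits) < 2:
        raise ValueError("rolling origin needs at least two splits; use a blocked tail")
    return splits


splits = rolling_origin_splits(n_times=12, initial_train=6, test_size=2)
test_indices = [i for _, test in splits for i in test]
```

Because the origin moves forward by one full test window each time, no time index lands in more than one test set.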
Holdout restriction for model selection
Decision: LOO and WAIC are only available for models where
holdout_id is None.
Reasoning: LOO and WAIC estimate expected log predictive density (ELPD)
using the full observed likelihood surface. A model fitted with a holdout mask
has a modified likelihood that excludes held-out observations. Computing LOO on
this truncated likelihood would produce ELPD estimates that are not comparable
to those from full-sample fits.
The correct workflow is:
Use validation splits for candidate evaluation.
Select the best specification based on holdout performance.
Refit the chosen specification on the full dataset.
Compute LOO/WAIC on the full-sample fit for model quality reporting.
Separation of validation fits and final fits
Decision: Validation runs and final production fits are separate pipeline
executions that produce separate run directories.
Reasoning: A validation fit is trained on a subset of the data. Its
posterior reflects that subset and should not be used as the production
artefact. Keeping them as separate runs prevents accidental use of a validation
fit for downstream analysis or reporting.
Lazy imports for CLI responsiveness
Decision: Heavy dependencies (TensorFlow, NumPy, Meridian, ArviZ) are not
imported at module level in the config, CLI, or validation layers.
Reasoning: TensorFlow alone takes several seconds to import. The CLI must
respond instantly for --help and --list operations. The __init__.py uses
__getattr__-based lazy loading, and the test suite verifies that
build_parser() only loads pydantic and yaml.
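For readers unfamiliar with `__getattr__`-based lazy loading, here is a minimal sketch of the general pattern (PEP 562). The mapping, the attribute name `heavy`, and the throwaway module are assumptions made for the demonstration; the package's actual `__init__.py` may be organised differently.

```python
import importlib
import types

# Hypothetical mapping of public attribute name -> module imported on demand.
# The stdlib "json" module stands in for a heavy dependency such as TensorFlow.
_LAZY_MODULES = {"heavy": "json"}


def _lazy_getattr(name: str):
    """Module-level __getattr__: import the target only on first access."""
    if name in _LAZY_MODULES:
        return importlib.import_module(_LAZY_MODULES[name])
    raise AttributeError(f"module has no attribute {name!r}")


# In a real package this function would be named __getattr__ inside
# __init__.py; here it is attached to a throwaway module for demonstration.
demo_pkg = types.ModuleType("demo_pkg")
demo_pkg.__getattr__ = _lazy_getattr

resolved = demo_pkg.heavy  # the import happens here, not when demo_pkg is created
```

With this arrangement, code paths that never touch the heavy attribute never pay its import cost, which is what keeps commands like --help fast.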
Pydantic extra="forbid" everywhere
Decision: All configuration models reject unexpected keys.
Reasoning: Silent acceptance of unknown keys is a common source of
misconfiguration in YAML-driven tools. A typo like export_pridictive_accuracy
would be silently ignored without extra="forbid", leading to unexpected
default behaviour. Strict rejection catches these errors at config load time
with clear error messages.
Relative artefact paths in manifests
Decision: All artefact paths in run_manifest.json are stored relative to
the run directory.
Reasoning: Absolute paths would tie run directories to a specific machine
or filesystem layout. Relative paths make run directories portable — they can
be copied, archived, or moved between machines without breaking the manifest
contract.
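The portability argument can be illustrated with a small round trip. The helper names `to_relative` and `to_absolute`, the artefact key, and the file name are hypothetical; the sketch does not reproduce the real normalize_artifact_paths implementation.

```python
from pathlib import PurePosixPath


def to_relative(run_dir: PurePosixPath, artefacts: dict[str, PurePosixPath]) -> dict[str, str]:
    """Store each artefact path relative to the run directory."""
    return {name: str(path.relative_to(run_dir)) for name, path in artefacts.items()}


def to_absolute(run_dir: PurePosixPath, artefacts: dict[str, str]) -> dict[str, PurePosixPath]:
    """Resolve stored relative paths against wherever the run now lives."""
    return {name: run_dir / rel for name, rel in artefacts.items()}


original = PurePosixPath("/data/runs/demo_run")
# "summary.json" is an invented artefact name used only for this example.
stored = to_relative(original, {"fit_summary": original / "20_model_fit" / "summary.json"})

moved = PurePosixPath("/archive/demo_run")  # the same run copied elsewhere
restored = to_absolute(moved, stored)
```

The stored value carries no trace of `/data/runs`, so the manifest stays valid after the directory is moved.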
Non-destructive lifecycle operations
Decision: refresh_run creates a new sibling directory rather than
overwriting the source.
Reasoning: Overwriting a validated production run would destroy the audit
trail. Creating a sibling preserves the original for comparison and rollback.
The lifecycle layer explicitly validates that source directories are not
mutated by refresh operations.
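A minimal sketch of the sibling-directory idea, assuming a hypothetical helper `create_sibling_run_dir` and an invented naming suffix (the real refresh_run naming scheme is not shown in this document):

```python
import tempfile
from pathlib import Path


def create_sibling_run_dir(source_dir: Path, suffix: str) -> Path:
    """Create a new directory next to `source_dir` without touching it."""
    sibling = source_dir.parent / f"{source_dir.name}_{suffix}"
    sibling.mkdir(exist_ok=False)  # refuse to reuse or overwrite an existing run
    return sibling


with tempfile.TemporaryDirectory() as root:
    source = Path(root) / "run_a"
    source.mkdir()
    (source / "run_manifest.json").write_text("{}")
    before = sorted(p.name for p in source.iterdir())

    refreshed = create_sibling_run_dir(source, "refresh")

    after = sorted(p.name for p in source.iterdir())
    same_parent = refreshed.parent == source.parent
    source_untouched = before == after
```

Comparing the source listing before and after is the same kind of check the lifecycle layer is described as performing.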
Manifest-per-stage persistence
Decision: The manifest is written to disk after each stage completes, not
only at the end of the pipeline.
Reasoning: MCMC sampling can run for minutes to hours. If the process
crashes or is killed during a later stage, the partial manifest on disk
reflects what completed successfully. This aids debugging and allows partial
runs to be inspected without special tooling.
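The effect of per-stage persistence can be sketched as below. The manifest layout (`status`, `stages`) and the helper names are simplified assumptions, and the temp-file-plus-rename write is one possible way to avoid half-written files rather than a description of what the runner does.

```python
import json
import os
import tempfile
from pathlib import Path


def write_manifest(run_dir: Path, manifest: dict) -> None:
    """Write to a temp file, then rename, so readers never see a partial file."""
    tmp_path = run_dir / "run_manifest.json.tmp"
    tmp_path.write_text(json.dumps(manifest, indent=2))
    os.replace(tmp_path, run_dir / "run_manifest.json")


def run_stages(run_dir: Path, stages: dict) -> dict:
    manifest = {"status": "running", "stages": {}}
    for name, stage_fn in stages.items():
        stage_fn()  # an exception here leaves earlier stages recorded on disk
        manifest["stages"][name] = "completed"
        write_manifest(run_dir, manifest)  # persist after every stage
    manifest["status"] = "completed"
    write_manifest(run_dir, manifest)
    return manifest


def failing_stage() -> None:
    raise RuntimeError("simulated crash during sampling")


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    try:
        run_stages(run_dir, {"00_run_metadata": lambda: None, "20_model_fit": failing_stage})
    except RuntimeError:
        pass
    on_disk = json.loads((run_dir / "run_manifest.json").read_text())
```

After the simulated crash, the file on disk still records the first stage as completed and the run as unfinished.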
Stage numbering with gaps
Decision: Pipeline stages use numbers 00, 10, 20, 30, 40, 60, 70 with a
gap at 50.
Reasoning: The gaps allow future stages to be inserted at natural positions
(e.g. a stage 50 for custom analysis) without renumbering existing stages.
Renumbering would break backward compatibility with stored manifests and any
downstream tooling that references stage names.
Config source vs. resolved archival
Decision: Both the verbatim source YAML and the resolved YAML are archived
in every run directory.
Reasoning: The source YAML shows what the analyst authored (including
relative paths). The resolved YAML shows the runtime interpretation (absolute
paths, defaults applied). Both are needed for reproducibility:
The source is needed to understand intent.
The resolved config is needed to reproduce the exact execution.
Runtime-only fields (output_dir, run_name, validation_spec) are
deliberately excluded from the resolved config because they are not part of
the reproducible model specification.
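As a small illustration of the exclusion rule, the sketch below drops the three runtime-only fields before a config is archived. It assumes they are top-level keys of a plain dictionary, which may not match the real Pydantic models; `strip_runtime_fields` is a hypothetical name.

```python
# Field names come from the text above; treating them as top-level keys is an assumption.
RUNTIME_ONLY_FIELDS = {"output_dir", "run_name", "validation_spec"}


def strip_runtime_fields(config: dict) -> dict:
    """Return a copy of the config without runtime-only top-level keys."""
    return {key: value for key, value in config.items() if key not in RUNTIME_ONLY_FIELDS}


runtime_config = {
    "project": {"name": "demo-timeseries"},
    "data": {"path": "/abs/path/data.csv"},  # invented resolved path
    "output_dir": "/abs/path/runs",
    "run_name": "demo-timeseries_20260402_073500",
}
resolved = strip_runtime_fields(runtime_config)
```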
Structured model selection errors
Decision: Model selection failures produce ModelSelectionError with a
machine-readable reason_code rather than generic exceptions.
Reasoning: The pipeline needs to distinguish between “model selection is
not possible for this run type” (expected) and “something is broken”
(unexpected). Structured reason codes allow:
The runner to write model_selection_status.json without failing the run.
The lifecycle layer to compare model selection availability across runs.
Downstream consumers to programmatically handle different failure modes.
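The pattern can be sketched roughly as follows. The class below is an illustrative stand-in that only borrows the ModelSelectionError name; the reason code `holdout_fit_not_supported`, the helper functions, and the status payload keys are all hypothetical.

```python
import json


class ModelSelectionError(Exception):
    """Illustrative stand-in: an exception carrying a machine-readable code."""

    def __init__(self, reason_code: str, message: str) -> None:
        super().__init__(message)
        self.reason_code = reason_code


def compute_selection(holdout_id):
    # Mirrors the holdout restriction: only full-sample fits are eligible.
    if holdout_id is not None:
        raise ModelSelectionError(
            "holdout_fit_not_supported",  # invented code for this example
            "LOO/WAIC need a full-sample fit",
        )
    return {"available": True}


def assess(holdout_id) -> dict:
    """Turn an expected failure into a status payload instead of aborting."""
    try:
        return compute_selection(holdout_id)
    except ModelSelectionError as exc:
        return {"available": False, "reason_code": exc.reason_code, "message": str(exc)}


status = assess(holdout_id=[True, False])
status_json = json.dumps(status, indent=2)  # roughly what a status file could hold
```

Any other exception type still propagates, which keeps "not possible for this run" separate from "something is broken".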
Meridian integration
This document describes how meridian-tools integrates with Google Meridian,
the boundaries of that integration, and the risks associated with different
coupling levels.
Integration philosophy
meridian-tools wraps Meridian without forking it. Meridian remains the
modelling engine; meridian-tools adds workflow orchestration, validation,
diagnostics bundling, model selection, and lifecycle management on top.
This approach means:
Meridian upgrades can be adopted without merging fork changes.
The upstream project’s API stability directly affects meridian-tools.
Any use of Meridian-internal APIs must be explicitly managed.
Coupling levels
Public API (low risk)
These are documented, versioned Meridian surfaces:
These are unlikely to break without a Meridian major version bump. The exact
google-meridian==1.5.3 pin keeps these assumptions aligned with the validated
release baseline.
Semi-public API (medium risk)
These are accessible attributes on Meridian model objects that are used but
not formally documented as stable:
| Surface | Used by | Purpose |
| --- | --- | --- |
| model.inference_data | log_likelihood.py, model_selection.py | Access ArviZ InferenceData |
| model.model_context | log_likelihood.py, exports.py | Access model structure |
| model.input_data | exports.py | Access input data for spend computation |
| model.posterior_sampler_callable | log_likelihood.py | Access posterior sampler |
These are stable in practice (they are used by Meridian’s own analysis
surfaces) but are not guaranteed to be stable across versions.
Private API (high risk)
These are _-prefixed methods on Meridian’s posterior_sampler_callable,
used exclusively in log_likelihood.py for log-likelihood reconstruction:
These methods are Meridian-internal and may change or be removed in any
Meridian release, including patch versions. They are necessary because
Meridian does not provide a public API for pointwise log-likelihood
computation.
Risk mitigation
Compatibility guard
log_likelihood.py checks for the presence of all three private methods
before attempting reconstruction:
If any method is missing, the error is caught and recorded as a
model_selection_status.json artefact with
reason_code: meridian_internal_seam_incompatible. The rest of the pipeline
continues normally.
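A presence check of this kind could look like the sketch below. The method names are placeholders, because the real private method names are not listed here; `find_missing_methods` and the status keys other than the reason code are likewise hypothetical.

```python
def find_missing_methods(sampler: object, required: tuple[str, ...]) -> list[str]:
    """Return the names in `required` that are absent or not callable."""
    return [name for name in required if not callable(getattr(sampler, name, None))]


# Placeholder names only; they do not correspond to real Meridian internals.
REQUIRED = ("_private_a", "_private_b", "_private_c")


class FakeSampler:
    """Stand-in object that lacks one of the expected methods."""

    def _private_a(self): ...

    def _private_b(self): ...


missing = find_missing_methods(FakeSampler(), REQUIRED)
status = (
    {"available": False, "reason_code": "meridian_internal_seam_incompatible", "missing": missing}
    if missing
    else {"available": True}
)
```

Checking up front keeps the failure in one predictable place, rather than surfacing later as an AttributeError partway through reconstruction.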
Graceful degradation
Model selection incompatibility is non-fatal at every level:
log_likelihood.py raises ModelSelectionError with a structured code.
model_selection.py propagates the error.
runner.py catches it, writes model_selection_status.json, and continues.
The manifest records the assessment stage as completed.
The lifecycle layer can inspect model_selection_status to understand why
model selection was unavailable.
Version pinning
The pyproject.toml pins Meridian to google-meridian[schema]==1.5.3. Any
Meridian upgrade must refresh the private log-likelihood reconstruction
baseline before the version guard is relaxed.
Integration testing
The test suite includes a gated live Meridian verification command that covers:
one reduced real pipeline run over bundled demo data, including stored-run
refresh after the original YAML is removed
the lower-level live log-likelihood reconstruction path
It is excluded from the default test suite because it requires real MCMC
sampling, but it should be run after every Meridian version upgrade.
Constants dependency
log_likelihood.py uses Meridian constants for posterior parameter names:
from meridian import constants  # constants.BETA_GM, constants.TAU_G, constants.ETA_M, etc.
These are stable string constants but are not versioned. A Meridian release
that renames these constants would cause import-time failures.
Unsaved posterior parameter recovery
Meridian does not persist all posterior parameters to InferenceData. The
_recover_unsaved_state function in log_likelihood.py reconstructs:
tau_g_excl_baseline — Recovered from the posterior’s tau_g variable
by slicing out the baseline geo index (concatenating the elements before and
after baseline_geo_idx).
Geo deviations — Recovered from the posterior by solving
deviation = (target - base) / scale for normal effects, or
deviation = (log(target) - base) / scale for log-normal effects, with a
scale == 0 guard that maps to zero.
This recovery is mathematically correct for the supported model families
(log-normal and normal media effects). It is tested against both geo-panel
and national models in test_log_likelihood.py.
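The two recovery formulas can be written out as a scalar sketch. It uses plain Python floats and lists instead of tensors, and the function names are hypothetical; it assumes the forward relationship implied by the formulas above, namely target = base + scale * deviation for normal effects and its exponentiated form for log-normal effects.

```python
import math


def drop_baseline(tau_g: list[float], baseline_geo_idx: int) -> list[float]:
    """Concatenate the elements before and after the baseline geo index."""
    return tau_g[:baseline_geo_idx] + tau_g[baseline_geo_idx + 1:]


def recover_deviation(target: float, base: float, scale: float, log_normal: bool) -> float:
    """Invert the assumed forward relationship to recover the deviation."""
    if scale == 0:
        return 0.0  # guard: a zero scale maps to a zero deviation
    value = math.log(target) if log_normal else target
    return (value - base) / scale


tau_excl = drop_baseline([0.5, 0.0, -0.2], baseline_geo_idx=1)
dev_normal = recover_deviation(target=1.5, base=1.0, scale=0.25, log_normal=False)
dev_lognormal = recover_deviation(target=math.exp(1.5), base=1.0, scale=0.25, log_normal=True)
dev_guarded = recover_deviation(target=3.0, base=1.0, scale=0.0, log_normal=False)
```

Both the normal and log-normal calls recover a deviation of 2, since each target was built from base 1.0, scale 0.25, and deviation 2.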
What breaks on a Meridian upgrade
| Change type | Impact | Detection |
| --- | --- | --- |
| Public API signature change | runner.py, exports.py break | Default test suite |
| Semi-public attribute rename | log_likelihood.py, exports.py break | Default test suite |
| Private method removal/rename | Model selection disabled | Live smoke test or model_selection_status.json |
| Constant rename | Import-time failure | Default test suite |
| New posterior parameter | Log-likelihood may be incorrect | Manual review + live smoke test |
| Changed likelihood formula | Log-likelihood may be incorrect | Live smoke test |
Recommended upgrade procedure
Pin the new Meridian version in a branch.
Run the full default test suite: pytest tests/ -v.
Run the live Meridian verification command:
MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v
If model selection breaks, check model_selection_status.json for the
reason code.
If private methods changed, update log_likelihood.py to match the new
Meridian internals or accept graceful degradation.
Update docs/project/release-baseline.md with the new verified state.
Project
Contributor-facing project documentation, release baselines, and changelog material.
Pages
Contributing — This guide covers the development setup, conventions, and workflow for contributing to meridian-tools.
Acceptance checklist — Use this page as the canonical local acceptance checklist for the current repository state. Run the commands in this order. The acceptance gate is local and command-driven. It does not depend on CI, GitHub Actions, or unpublished helper scripts.
Release baseline — This page records the current milestone release baseline for the repository. Treat it as a validated project state, not as an automated release system. The baseline uses the same local command sequence as the acceptance checklist and records the observed warning profile, the direct runtime dependency bounds, and the accepted trade-offs that still shape the package.
Changelog — All notable changes to meridian-tools are documented in this file.
Contributing
This guide covers the development setup, conventions, and workflow for
contributing to meridian-tools.
Development setup
Clone and install
git clone <repo-url> meridian-tools
cd meridian-tools
pip install -e ".[dev]"
All public functions and classes use type annotations. The codebase uses
from __future__ import annotations for forward-reference support.
Import conventions
Standard library imports first, then third-party, then local.
Heavy dependencies (Meridian, TensorFlow, ArviZ) are imported lazily inside
functions, not at module level, in the config/CLI/validation layers.
Ruff rule I enforces import sorting.
Configuration models
All Pydantic models use ConfigDict(extra="forbid"). New config fields must
be added with appropriate types, defaults, and validators.
Testing
Running tests
# Full suite
pytest tests/ -v
# Specific file
pytest tests/test_runner.py -v
# Specific test
pytest tests/test_runner.py::test_run_pipeline_writes_manifest -v
Test conventions
Tests use pytest with tmp_path for temporary directories.
monkeypatch is used extensively to mock Meridian internals and isolate
unit tests from real MCMC sampling.
Module-scoped fixtures (scope="module") are used for expensive model
construction in test_log_likelihood.py and test_model_selection.py.
Shared test infrastructure is defined inline in individual test modules.
There is no top-level conftest.py.
Live Meridian verification
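One opt-in command exercises the bounded real Meridian seam:
MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v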
This is not part of the default suite. It proves one reduced real pipeline run
over bundled demo data, one stored-run refresh after the original YAML is
removed, and the lower-level live log-likelihood seam. Run it after Meridian
version upgrades and before release-candidate handoff when you want extra
confidence beyond the fast suite.
Writing new tests
Place tests in the appropriate tests/test_<module>.py file.
Use monkeypatch to avoid real MCMC sampling in unit tests.
Test both success paths and error conditions.
Verify artefact file contents, not just their existence.
The version is defined in src/meridian_tools/version.py:
__version__ = "0.3.0"
Version bumps are manual edits. Update this file when preparing a release.
Documentation
Documentation lives in docs/. When adding new features:
Update relevant guide or reference pages.
Add API documentation for new public functions or classes.
Update the YAML schema reference if config fields changed.
Update the output schema if new artefacts are produced.
Common pitfalls
Do not import Meridian at module level in config, CLI, or validation
modules. This breaks CLI responsiveness.
Do not add extra="allow" to Pydantic models. The extra="forbid"
policy prevents silent misconfiguration.
Do not modify source run directories in lifecycle operations. Always
create new sibling directories.
Do not weaken or delete existing tests without explicit direction.
Acceptance checklist
Use this page as the canonical local acceptance checklist for the current
repository state. Run the commands in this order. The acceptance gate is local
and command-driven. It does not depend on CI, GitHub Actions, or unpublished
helper scripts.
Acceptance gate
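Run the following commands from the repository root:
python -m compileall src tests
ruff check src tests
ruff format --check src tests
mypy src
python -m pip install -e . --no-deps
meridian-tools --help
pytest tests/ -v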
The canonical acceptance-gate result for the last command is:
244 passed, 2 skipped
That result is the pass or fail line for the default local acceptance gate.
The recorded warning profile belongs to the release baseline, not to the
acceptance-gate definition itself.
What each command proves
python -m compileall src tests proves that the checked-in Python files parse
cleanly. If this step fails, you are dealing with a syntax or import-time
parse issue and you should stop there.
ruff check src tests proves that the repository still satisfies the pinned
lint rules. If this step fails, fix the reported lint violations before moving
on.
ruff format --check src tests proves that the checked-in files still match
the agreed formatting contract. If this step fails, run the formatter and then
rerun the verification sequence.
mypy src proves that the configured static typing baseline still runs
cleanly. If this step fails, either fix the reported type issue or update the
documented ratchet intentionally.
python -m pip install -e . --no-deps proves that the package still builds
and installs in editable mode from the local source tree. If this fails, treat
it as a packaging or build-metadata break rather than a test-only problem.
meridian-tools --help proves that the published CLI entrypoint still resolves
and that the lightweight command surface still imports cleanly. If this step
fails, check the package entrypoint and import boundary before continuing.
pytest tests/ -v proves the behavioural contract of the repository. This is
the broadest local validation step. If it fails, use the failing test names to
identify which package contract regressed.
How to interpret failure
If the compile step fails, fix syntax or parse problems first. The later steps
will not give you useful signal until that is resolved.
If lint, format, or type checks fail, treat that as a source-tree quality
issue, not as an optional clean-up item. Bring the tree back to the pinned
Ruff and mypy state before trusting the rest of the loop.
If editable install fails, treat the repository as not ready for contributor
handoff. The package must install cleanly before the test result matters.
If CLI help fails, assume the published command surface is broken even if the
Python modules still import manually.
If pytest tests/ -v fails, the acceptance gate is not met. A partial pass is
not enough. Fix the failing behavioural contract and rerun the full command
sequence.
Optional extra confidence
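The repository also carries one opt-in live Meridian verification command for
extra technical confidence:
MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v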
This command is not part of the default blocking acceptance gate. It exists to
provide one bounded live Meridian route that proves:
real pipeline execution over bundled demo data
manifest-backed stored-run refresh after the original YAML is removed
the lower-level live log-likelihood reconstruction seam
On the reference development environment, the recorded run finished in 185.42
seconds (0:03:05); keep a budget of roughly six minutes or less for this
extra-confidence command.
Release baseline
This page records the current milestone release baseline for the repository.
Treat it as a validated project state, not as an automated release system. The
baseline uses the same local command sequence as the acceptance checklist and
records the observed warning profile, the direct runtime dependency bounds, and
the accepted trade-offs that still shape the package.
Release-ready definition in this repository
The repository is release-ready only when the documented local acceptance
command set passes, pytest tests/ -v returns the recorded pass/skip count
below, the same validated run is recorded with the observed warning count, the
warning categories match the accepted ones below, and the accepted trade-offs
remain explicit rather than hidden.
The live Meridian verification command remains an opt-in local confidence
check, not part of the default developer loop or a silent CI policy. On the
reference development environment, the recorded run finished in 185.42 seconds
(0:03:05); keep a budget of roughly six minutes or less for ordinary local
execution.
Runtime dependency boundary
The current runtime boundary recorded from pyproject.toml is:
requires-python >=3.11
google-meridian==1.5.3
arviz>=0.18.0,<0.20.0
pandas>=2.2.0,<3
pydantic>=2.8.0,<3
PyYAML>=6.0.0,<7
These are the direct runtime dependency bounds for the milestone baseline. This
page does not imply broader environment reproducibility than the repository
currently implements.
Accepted warning profile
The recorded 60 warnings are accepted in the current milestone baseline.
They fall into two pinned categories:
Meridian model / prior warnings
ArviZ model-selection warnings
This baseline does not pretend the repository is warning-free. It records the
current observed warning profile honestly and treats those warning categories as
accepted for the present milestone.
Accepted trade-offs
The current release baseline also depends on several explicit trade-offs.
The package takes a no-fork Meridian approach. We keep Meridian as the
modelling engine and add workflow and compatibility tooling around it rather
than modifying Meridian source.
Bayesian model selection remains intentionally limited to fitted Meridian
models where holdout_id is None. Validation-fit and authored-holdout runs are
not treated as compatible LOO or WAIC candidates.
Lifecycle tooling remains Python-first. The repository does not currently ship
a broader lifecycle CLI.
Version bumping remains a manual edit rather than a fully automated release
pipeline.
Boundary of this record
This page records one validated milestone state. It does not introduce CI as
the source of truth. It does not define publish automation. It does not promise
zero warnings. It does not claim a broader release process than the repository
actually supports today.
Changelog
All notable changes to meridian-tools are documented in this file.
CLI single source of truth — runme.py now delegates directly to
meridian_tools.cli, removing duplicate root-level argument parsing.
Typed runner state — Pipeline orchestration now uses PipelineContext
for shared stage state.
Shared posterior sampling — Runner posterior sampling keyword mapping is
centralised in one helper.
Lifecycle comparison schema — Run comparison rows are generated from
declarative comparison field descriptors.
Meridian compatibility pin — The package pins
google-meridian[schema]==1.5.3, and log-likelihood reconstruction refuses
unvalidated Meridian versions.
Static analysis tooling — Development extras now include mypy, and
Ruff enables additional complexity, simplification, and Ruff-specific rule
families.
Fixed
Optimized Python safety — Validation helpers now use explicit exceptions
instead of assert for runtime invariants.
Shared confidence validation — Response curve and optimisation configs
share one confidence_level validator.
Export coercion documentation — NetCDF attribute coercion now documents
its input-to-output type mapping.
[0.2.0] — 2026-04-07
Added
Docs site build — Hugo-based website documentation under docs-site/,
generated from the repository Markdown set by
docs-site/build_content.py.
Manifest v3 provenance — Explicit input_data_provenance capture for
stored runs and lifecycle refresh or compare workflows.
Typed failure boundaries — ConfigPreflightError,
ValidationExecutionContractError, and PipelineRunFailure distinguish
wrapper-owned preflight, validation contract misuse, and post-directory
runtime failures.
Bounded live verification — An opt-in Meridian real-fit smoke route
gated behind MERIDIAN_TOOLS_ENABLE_REAL_FIT=1.
Module-path CLI contract — Explicit support and regression coverage for
python -m meridian_tools.cli ....
Changed
Shared launch flow — meridian-tools and the repo-root runme.py
launcher now share one launch flow for config loading, preflight checks,
progress reporting, and terminal success or failure output.
Packaged demo assets — Bundled demo configs and datasets are resolved
from packaged _demo_data, so demo runs work from installed wheels as well
as source checkouts.
Default demo fit mode — Bundled demos now default to full-sample fits
(validation.strategy: none), so loo_summary.json and waic_summary.json
are generated by default and 10_validation is recorded as skipped.
Refresh contract — Stored-run refresh now reloads from the saved
resolved config while preserving the original source config copy in run
metadata.
Lifecycle compare semantics — Compare now distinguishes legacy runs
without dataset provenance from real dataset changes.
Documentation layout — Public documentation is reorganised under docs/
into getting-started, guides, reference, concepts, and project sections.
Fixed
Structured public entrypoint failures — Missing or invalid config paths
in public entrypoints now produce structured failure output instead of raw
Python tracebacks unless --traceback is used.
Relative-path refresh — Refreshing a stored run with relative
data.path input no longer depends on the original source config location
remaining present on disk.
Partial-run failure reporting — Failed runs that already created an
output directory now report the concrete run directory, manifest path, and
failing stage through the CLI and runme.py.
Docs-site theme resolution — Hugo builds resolve the Relearn theme
through a pinned module dependency instead of requiring a local theme
checkout.
[0.1.0] — 2026-04-02
Added
Typed YAML configuration — Pydantic-validated config with extra="forbid"
strictness for all sections: project, data, model_spec, fit,
validation, exports, response_curves, optimisation.
Staged pipeline runner — Sequential execution through 00_run_metadata,
10_validation, 20_model_fit, 30_model_assessment, 40_decomposition,
60_response_curves, 70_optimisation with manifest persistence after each
stage.
Validation orchestration — blocked_tail and rolling_origin time-series
validation strategies with auto-generated holdout masks. Authored holdout
passthrough through model_spec.kwargs.holdout_id.
Diagnostics bundling — diagnostics_bundle.json manifest with optional
predictive_accuracy.csv and review_summary.json exports.
Bayesian model selection — Compatibility-aware LOO and WAIC computation
through ArviZ, with automatic log-likelihood reconstruction for fitted Meridian
models. Graceful degradation for incompatible runs through structured
ModelSelectionError with reason codes.
Response curves export — Configurable spend multiplier grid with NetCDF
and CSV outputs.
Optimisation export — Fixed-budget and relative-budget optimisation with
full artefact set including allocation charts.
Plot exports — PNG plot artefacts through Altair/vl-convert for model fit,
diagnostics, decomposition, response curves, and optimisation stages.
Lifecycle management — load_run_record, list_run_records,
build_refresh_run_config, compare_run_records for post-run analysis and
reproducible refresh workflows.
CLI — meridian-tools run and meridian-tools demo subcommands with
lightweight imports for fast startup.
Bundled demos — timeseries and geo_panel reference workflows with
packaged data and configs.
Manifest versioning — Support for manifest versions 0, 1, and 2 with
backward-compatible deserialisation.
Comprehensive test suite — 218 tests across 15 test files covering
configuration, validation, pipeline execution, exports, diagnostics, model
selection, lifecycle, and demos.