Configuration guide

meridian-tools is driven by one YAML configuration file. This guide explains every section, its purpose, and its constraints. For a field-level schema reference, see yaml-schema.md.

Configuration philosophy

The YAML file owns the authored project definition: project metadata, data paths, model specification, fit settings, validation strategy, and export switches. Runtime-only values — output_dir, run_name, and concrete validation_spec — belong in PipelineRunConfig or CLI flags, not in the YAML file. This separation ensures that the same YAML file can drive multiple runs with different runtime options while remaining reproducible.

Minimal valid config

project:
  name: my-project

data:
  path: ./data.csv
  coord_to_columns:
    time: week

This is the smallest config that will pass validation. It uses defaults for everything else: no validation, all exports enabled, no response curves, no optimisation.

Templates

The repository includes commented starter configs under templates/. These templates all use the same canonical schema; they differ only in which optional sections are authored.

minimal.yml — smallest valid config.
standard.yml — typical model run with common data, model, fit, validation, export, and response-curve settings.
media-priors.yml — standard run with YAML-driven media priors.
validation-blocked-tail.yml — standard run with blocked-tail validation.

Copy the closest template into your project directory, update data.path and the authored column names, then run it with meridian-tools run --config followed by the path to your copy.

Section reference

`project`

Top-level project metadata.

project:
  name: client-mmm        # Default: "meridian-project"

name — Human-readable project name. Used as the base for run directory names unless overridden by --run-name at runtime.

`data`

CSV data loader configuration. Maps directly to Meridian’s CsvDataLoader.

data:
  path: ./client_dataset.csv
  kpi_type: revenue                    # "revenue" (default) or "non-revenue"
  coord_to_columns:
    time: week
    geo: market                        # optional for national models
    kpi: revenue
    population: population
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]
    controls: [promo_flag, price_index]
  media_to_channel: null               # optional channel mapping overrides
  media_spend_to_channel: null
  reach_to_channel: null
  frequency_to_channel: null
  rf_spend_to_channel: null
  organic_reach_to_channel: null
  organic_frequency_to_channel: null

path — Path to the CSV data file. Relative paths are resolved against the directory containing the YAML config file, not the current working directory.
kpi_type — Either "revenue" or "non-revenue". Controls how Meridian interprets the KPI column.
coord_to_columns — Maps Meridian coordinate names to CSV column names. time is required. geo is optional (omit for national models).

`model_spec`

Raw keyword arguments forwarded to Meridian’s ModelSpec.

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi

kwargs — Dictionary passed through to ModelSpec(**kwargs). Supports any argument that Meridian’s ModelSpec accepts.
Special handling for holdout_id: if present in kwargs, the run is treated as an “authored holdout” validation run. See the validation guide for details.

Custom media priors

The optional priors subsection configures a focused YAML surface for media priors that would otherwise require Python code.

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi
  priors:
    roi_m:
      default:
        distribution: LogNormal
        loc: 0.2
        scale: 0.9
      channels:
        paid_search:
          distribution: TruncatedNormal
          loc: 3.0
          scale: 1.5
          low: 0.0
          high: 6.0
    alpha_m:
      distribution: Beta
      concentration0: 1.0
      concentration1: 2.0

Supported prior parameters are roi_m, mroi_m, and alpha_m. Each accepts either a scalar distribution or a default plus per-channel overrides. Channel override names must match media_to_channel values, not raw CSV column names.

Supported distributions are Normal, LogNormal, TruncatedNormal, and Beta. model_spec.priors and model_spec.kwargs.prior are mutually exclusive.
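The default-plus-overrides shape can be illustrated with a small resolver. This is only a sketch under stated assumptions: resolve_channel_priors and its return shape are hypothetical and are not part of meridian-tools, and the channel name tv is invented for the example. It applies the rules described above and nothing more.

```python
SUPPORTED_DISTRIBUTIONS = {"Normal", "LogNormal", "TruncatedNormal", "Beta"}


def resolve_channel_priors(spec: dict, channels: list[str]) -> dict[str, dict]:
    """Expand one authored prior parameter (e.g. roi_m) into one entry per channel.

    Hypothetical helper for illustration, not meridian-tools API.
    """
    if "distribution" in spec:
        # Scalar form: the same distribution applies to every channel.
        default, overrides = spec, {}
    else:
        # Default-plus-overrides form.
        default, overrides = spec["default"], spec.get("channels", {})

    # Override names must be channel names, not raw CSV column names.
    unknown = set(overrides) - set(channels)
    if unknown:
        raise ValueError(f"overrides for unknown channels: {sorted(unknown)}")

    resolved = {name: overrides.get(name, default) for name in channels}
    for name, dist in resolved.items():
        if dist["distribution"] not in SUPPORTED_DISTRIBUTIONS:
            raise ValueError(f"{name}: unsupported distribution {dist['distribution']!r}")
    return resolved


roi_m = {
    "default": {"distribution": "LogNormal", "loc": 0.2, "scale": 0.9},
    "channels": {
        "paid_search": {
            "distribution": "TruncatedNormal",
            "loc": 3.0, "scale": 1.5, "low": 0.0, "high": 6.0,
        }
    },
}
priors = resolve_channel_priors(roi_m, ["tv", "paid_search"])
print(priors["tv"]["distribution"])           # LogNormal
print(priors["paid_search"]["distribution"])  # TruncatedNormal
```

Channels without an override fall back to the default entry, which is why tv resolves to the LogNormal prior here.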

`fit`

Sampling configuration for Meridian posterior fitting.

fit:
  sample_prior_draws: null     # Optional prior-only sampling
  n_chains: 4                  # Number of MCMC chains
  n_adapt: 500                 # Adaptation steps per chain
  n_burnin: 500                # Burn-in steps per chain
  n_keep: 1000                 # Posterior samples to keep per chain
  seed: 20260331               # Reproducibility seed (int, list[int], or null)
  max_tree_depth: 10           # NUTS max tree depth
  max_energy_diff: 500.0       # NUTS max energy difference
  unrolled_leapfrog_steps: 1   # Leapfrog steps unrolled per NUTS tree expansion
  parallel_iterations: 10      # TF parallel iterations

All fields have sensible defaults. Override only what you need.

seed — Accepts a single integer, a list of integers (one per chain), or null for non-deterministic sampling.
sample_prior_draws — If set, prior predictive samples are drawn before posterior sampling. This is optional and primarily for model diagnostics.

`validation`

Validation and holdout orchestration settings. See the validation guide for strategy selection advice.

# Option 1: No validation (default)
validation:
  strategy: none

# Option 2: Blocked tail
validation:
  strategy: blocked_tail
  holdout_size: 8

# Option 3: Rolling origin
validation:
  strategy: rolling_origin
  initial_train_size: 52
  test_size: 4
  step_size: 4          # Must equal test_size
  max_splits: 3         # At least 2

strategy — One of "none", "blocked_tail", or "rolling_origin".
holdout_size — Required for blocked_tail. Number of time periods to hold out from the end of the series.
initial_train_size, test_size — Required for rolling_origin.
step_size — Optional for rolling_origin. Must equal test_size if set. Defaults to test_size.
max_splits — Optional for rolling_origin. Must be at least 2.
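To make the parameters concrete, here is a rough sketch of how the two strategies could partition a series of n_periods time steps. The function names are hypothetical. The expanding training window, and the choice to keep the earliest splits when max_splits caps the count, are assumptions made for illustration; this guide does not specify either detail.

```python
def blocked_tail_split(n_periods: int, holdout_size: int) -> tuple[range, range]:
    """Hold out the last holdout_size periods; train on everything before them."""
    cut = n_periods - holdout_size
    return range(cut), range(cut, n_periods)


def rolling_origin_splits(n_periods, initial_train_size, test_size,
                          step_size=None, max_splits=None):
    """Return (train, test) index ranges, assuming an expanding training window."""
    step = test_size if step_size is None else step_size  # documented default
    splits = []
    train_end = initial_train_size
    while train_end + test_size <= n_periods:
        if max_splits is not None and len(splits) >= max_splits:
            break  # assumption: the earliest splits are the ones kept
        splits.append((range(train_end), range(train_end, train_end + test_size)))
        train_end += step
    return splits


train, test = blocked_tail_split(n_periods=104, holdout_size=8)
print(len(train), test.start, test.stop)  # 96 96 104

for train, test in rolling_origin_splits(104, initial_train_size=52,
                                         test_size=4, max_splits=3):
    print(len(train), test.start, test.stop)
# 52 52 56
# 56 56 60
# 60 60 64
```

Because step_size equals test_size, consecutive test windows sit end to end without overlapping.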

Validation rules:

blocked_tail rejects rolling-origin parameters.
rolling_origin rejects holdout_size.
none rejects all holdout and rolling-origin parameters.
A holdout_size authored without an explicit strategy (the legacy form) is rejected.
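The rules above can be summarised as a small checker. This is a plain-Python sketch, not the Pydantic validators that meridian-tools actually uses; check_validation_section and its error messages are hypothetical.

```python
ROLLING_ORIGIN_KEYS = ("initial_train_size", "test_size", "step_size", "max_splits")


def check_validation_section(section: dict) -> list[str]:
    """Return the rule violations found in an authored validation mapping."""
    authored = {key for key, value in section.items() if value is not None}
    rolling = [key for key in ROLLING_ORIGIN_KEYS if key in authored]
    strategy = section.get("strategy")

    if strategy is None:
        if "holdout_size" in authored:
            return ["holdout_size requires an explicit strategy"]
        strategy = "none"  # documented default

    errors: list[str] = []
    if strategy == "none":
        extras = [k for k in ("holdout_size", *ROLLING_ORIGIN_KEYS) if k in authored]
        errors += [f"strategy none rejects {key}" for key in extras]
    elif strategy == "blocked_tail":
        if "holdout_size" not in authored:
            errors.append("blocked_tail requires holdout_size")
        errors += [f"blocked_tail rejects {key}" for key in rolling]
    elif strategy == "rolling_origin":
        for key in ("initial_train_size", "test_size"):
            if key not in authored:
                errors.append(f"rolling_origin requires {key}")
        if "holdout_size" in authored:
            errors.append("rolling_origin rejects holdout_size")
        if "step_size" in authored and section["step_size"] != section.get("test_size"):
            errors.append("step_size must equal test_size")
        if "max_splits" in authored and section["max_splits"] < 2:
            errors.append("max_splits must be at least 2")
    else:
        errors.append(f"unknown strategy {strategy!r}")
    return errors


print(check_validation_section({"strategy": "blocked_tail", "holdout_size": 8}))
# []
print(check_validation_section({"holdout_size": 8}))
# ['holdout_size requires an explicit strategy']
```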

`exports`

Output switches for diagnostics and model-selection artefacts.

exports:
  use_kpi: false                       # Use KPI-based metrics
  batch_size: 1000                     # Batch size for Meridian analysis
  export_predictive_accuracy: true     # Write predictive_accuracy.csv
  export_review_summary: true          # Write review_summary.json
  export_model_selection: true         # Write LOO/WAIC outputs
  export_plots: true                   # Write PNG plot artefacts

All fields have defaults. If the entire exports section is omitted, all exports are enabled with default settings.

`response_curves`

Optional. If omitted, the response curves stage is skipped.

response_curves:
  spend_multipliers: [0.0, 0.5, 1.0, 1.5, 2.0]
  use_posterior: true
  by_reach: true
  use_optimal_frequency: false
  confidence_level: 0.9

spend_multipliers — Required. Non-empty list of non-negative floats.
confidence_level — Must be strictly between 0 and 1.
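As an illustration of these two constraints, a hand-written check might look like the following. The helper name and messages are hypothetical; the real validation lives in the tool's Pydantic models.

```python
def check_response_curves(section: dict) -> list[str]:
    """Return constraint violations for an authored response_curves mapping."""
    errors: list[str] = []

    multipliers = section.get("spend_multipliers")
    if not isinstance(multipliers, list) or not multipliers:
        errors.append("spend_multipliers must be a non-empty list")
    elif any(
        isinstance(m, bool) or not isinstance(m, (int, float)) or m < 0
        for m in multipliers
    ):
        errors.append("spend_multipliers must contain only non-negative numbers")

    level = section.get("confidence_level")
    # Strictly between 0 and 1: both endpoints are rejected.
    if level is not None and not 0.0 < level < 1.0:
        errors.append("confidence_level must be strictly between 0 and 1")
    return errors


print(check_response_curves({"spend_multipliers": [0.0, 0.5, 1.0], "confidence_level": 0.9}))
# []
print(check_response_curves({"spend_multipliers": [], "confidence_level": 1.0}))
# ['spend_multipliers must be a non-empty list', 'confidence_level must be strictly between 0 and 1']
```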

`optimisation`

Optional. If omitted, the optimisation stage is skipped.

optimisation:
  start_date: "2025-01-01"
  end_date: "2025-12-31"
  budget:
    mode: fixed_total                  # or "relative_reference_window_total"
    value: 1000000.0
  use_posterior: true
  use_optimal_frequency: true
  confidence_level: 0.9

start_date, end_date — ISO format YYYY-MM-DD. end_date must be on or after start_date.
budget.mode — Either "fixed_total" (absolute budget) or "relative_reference_window_total" (multiplier against the reference window’s total spend).
budget.value — Positive float. For fixed_total, this is the absolute budget. For relative_reference_window_total, this is a multiplier (e.g. 1.1 means 110% of the reference window total).
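The two budget modes differ only in how budget.value is interpreted, which a short sketch can make explicit. The helper names and the reference_total argument are hypothetical; how meridian-tools obtains the reference window total is not covered here.

```python
from datetime import date


def resolve_budget(mode: str, value: float, reference_total: float) -> float:
    """Turn an authored budget into an absolute amount."""
    if value <= 0:
        raise ValueError("budget.value must be positive")
    if mode == "fixed_total":
        return value  # already an absolute budget
    if mode == "relative_reference_window_total":
        return value * reference_total  # value acts as a multiplier
    raise ValueError(f"unknown budget.mode {mode!r}")


def check_window(start_date: str, end_date: str) -> tuple[date, date]:
    """Parse ISO dates and enforce that end_date is on or after start_date."""
    start, end = date.fromisoformat(start_date), date.fromisoformat(end_date)
    if end < start:
        raise ValueError("end_date must be on or after start_date")
    return start, end


start, end = check_window("2025-01-01", "2025-12-31")
fixed = resolve_budget("fixed_total", 1_000_000.0, reference_total=2_000_000.0)
relative = resolve_budget("relative_reference_window_total", 1.1, reference_total=2_000_000.0)
print(fixed, round(relative))
```

With a reference window total of 2,000,000, a multiplier of 1.1 yields a budget of roughly 2,200,000, while the fixed_total mode ignores the reference total entirely.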

Validation strictness

All configuration models use Pydantic’s extra="forbid" mode. Any unexpected key in the YAML file will produce a clear validation error. This prevents silent misconfiguration from typos or outdated keys.

$ meridian-tools run --config bad.yml
# pydantic.ValidationError: 1 validation error for MeridianToolsConfig
# exports -> export_pridictive_accuracy
#   Extra inputs are not permitted
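The effect of forbidding extra keys can be mimicked with the standard library alone. The sketch below is a stand-in for the Pydantic behaviour, not the tool's code: ExportsSketch and build_strict are hypothetical names, and the field set is trimmed to the four export switches.

```python
from dataclasses import dataclass, fields


@dataclass
class ExportsSketch:
    # Trimmed, illustrative subset of the exports section.
    export_predictive_accuracy: bool = True
    export_review_summary: bool = True
    export_model_selection: bool = True
    export_plots: bool = True


def build_strict(cls, raw: dict):
    """Construct cls from raw, rejecting any key that is not a declared field."""
    allowed = {f.name for f in fields(cls)}
    extra = sorted(set(raw) - allowed)
    if extra:
        raise ValueError(f"extra inputs are not permitted: {extra}")
    return cls(**raw)


print(build_strict(ExportsSketch, {}).export_plots)  # True

try:
    build_strict(ExportsSketch, {"export_pridictive_accuracy": False})
except ValueError as exc:
    print(exc)  # extra inputs are not permitted: ['export_pridictive_accuracy']
```

Without the strict check, the misspelled key would be dropped or ignored and the export would silently stay enabled, which is the failure mode the strict mode is meant to prevent.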

Path resolution

Relative paths in data.path are resolved against the directory containing the YAML config file, not the current working directory. This means:

# If config is at /workspace/configs/project.yml
data:
  path: ../inputs/weekly.csv
# Resolves to /workspace/inputs/weekly.csv

The resolved path is written to config.resolved.yaml in the run directory. The original authored path is preserved in config.source.yaml.
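The resolution rule can be sketched in a few lines. resolve_data_path is a hypothetical helper, and the sketch makes two simplifying assumptions: it uses POSIX path semantics, and it collapses ".." lexically, which can differ from filesystem resolution when symlinks are involved.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_data_path(config_path: str, data_path: str) -> PurePosixPath:
    """Resolve data_path against the directory that contains config_path."""
    candidate = PurePosixPath(data_path)
    if candidate.is_absolute():
        return candidate  # absolute paths are used as authored
    config_dir = PurePosixPath(config_path).parent
    # normpath collapses ".." segments without touching the filesystem.
    return PurePosixPath(posixpath.normpath(str(config_dir / candidate)))


print(resolve_data_path("/workspace/configs/project.yml", "../inputs/weekly.csv"))
# /workspace/inputs/weekly.csv
```

Because the current working directory never enters the calculation, the same config resolves to the same CSV regardless of where the command is launched.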

Wrapper-owned preflight

Before meridian-tools creates a dated run directory, it runs a narrow, wrapper-owned preflight on the authored config and the resolved input CSV. Phase 10 keeps this boundary intentionally small so that the wrapper does not become a second Meridian schema layer.

The wrapper checks exactly:

the resolved data.path exists and is a regular file
the CSV header row can be read
the parsed header is non-empty
no parsed header cell is blank after trimming whitespace
every authored scalar entry in data.coord_to_columns exists in the header
every authored list member in data.coord_to_columns exists in the header
every authored key in data.media_to_channel exists in the header
every authored key in data.media_spend_to_channel exists in the header
every authored key in data.reach_to_channel exists in the header
every authored key in data.frequency_to_channel exists in the header
every authored key in data.rf_spend_to_channel exists in the header
every authored key in data.organic_reach_to_channel exists in the header
every authored key in data.organic_frequency_to_channel exists in the header
authored list-valued coord families are non-empty
authored mapping fields above are non-empty
coord_to_columns.media and media_to_channel must be authored together
coord_to_columns.media_spend and media_spend_to_channel must be authored together
coord_to_columns.reach, coord_to_columns.frequency, reach_to_channel, and frequency_to_channel must be authored together
coord_to_columns.rf_spend and rf_spend_to_channel must be authored together
coord_to_columns.organic_reach and organic_reach_to_channel must be authored together
coord_to_columns.organic_frequency and organic_frequency_to_channel must be authored together

Matching is exact and case-sensitive. The wrapper does not normalise headers, apply aliases, or use fuzzy matching.
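The header-related checks in the list above can be approximated as follows. This is an illustrative sketch, not the wrapper's implementation: authored_columns and check_header are hypothetical names, and the file-existence and authored-together rules are left out for brevity.

```python
import csv
import io

MAPPING_FIELDS = (
    "media_to_channel", "media_spend_to_channel", "reach_to_channel",
    "frequency_to_channel", "rf_spend_to_channel",
    "organic_reach_to_channel", "organic_frequency_to_channel",
)


def authored_columns(data: dict) -> list[str]:
    """Collect every CSV column name that the authored data section refers to."""
    columns: list[str] = []
    for value in data.get("coord_to_columns", {}).values():
        # Scalar entries and list members are both column names.
        columns.extend(value if isinstance(value, list) else [value])
    for field in MAPPING_FIELDS:
        # Mapping keys are column names; null mappings count as not authored.
        columns.extend((data.get(field) or {}).keys())
    return columns


def check_header(stream, data: dict) -> list[str]:
    """Check the CSV header row against the authored column names."""
    header = next(csv.reader(stream), None)
    if not header:
        return ["header row is missing or empty"]
    errors: list[str] = []
    if any(cell.strip() == "" for cell in header):
        errors.append("header contains a blank cell")
    present = set(header)  # exact, case-sensitive comparison
    errors += [
        f"column not found in header: {name}"
        for name in authored_columns(data)
        if name not in present
    ]
    return errors


csv_text = "week,market,revenue,Spend_TV\n2025-01-06,north,100.0,5.0\n"
data = {
    "coord_to_columns": {
        "time": "week", "geo": "market", "kpi": "revenue",
        "media_spend": ["spend_tv"],
    }
}
print(check_header(io.StringIO(csv_text), data))
# ['column not found in header: spend_tv']
```

The header cell Spend_TV does not satisfy the authored name spend_tv, because no case folding or alias lookup is applied.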

What remains Meridian-owned:

deep ModelSpec semantics
fit-dependent tensor or shape constraints
statistical validity checks that depend on model construction or sampling

So Phase 10 moves obvious wrapper-detectable mistakes earlier, but it does not promise to catch everything Meridian may reject later.

Full example

project:
  name: client-mmm

data:
  path: ./client_dataset.csv
  kpi_type: revenue
  coord_to_columns:
    time: week
    geo: market
    kpi: revenue
    population: population
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]
    controls: [promo_flag, price_index]

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi

fit:
  n_chains: 4
  n_adapt: 500
  n_burnin: 500
  n_keep: 1000
  seed: 20260331

validation:
  strategy: blocked_tail
  holdout_size: 8

exports:
  export_predictive_accuracy: true
  export_review_summary: true
  export_model_selection: true

response_curves:
  spend_multipliers: [0.0, 0.5, 1.0, 1.5, 2.0]
  use_posterior: true
  by_reach: true
  use_optimal_frequency: false
  confidence_level: 0.9

optimisation:
  start_date: "2025-01-01"
  end_date: "2025-12-31"
  budget:
    mode: fixed_total
    value: 1000000.0
  use_posterior: true
  use_optimal_frequency: true
  confidence_level: 0.9