Configuration guide

meridian-tools is driven by one YAML configuration file. This guide explains every section, its purpose, and its constraints. For a field-level schema reference, see yaml-schema.md.

Configuration philosophy

The YAML file owns the authored project definition: project metadata, data paths, model specification, fit settings, validation strategy, and export switches. Runtime-only values — output_dir, run_name, and concrete validation_spec — belong in PipelineRunConfig or CLI flags, not in the YAML file. This separation ensures that the same YAML file can drive multiple runs with different runtime options while remaining reproducible.

Minimal valid config

project:
  name: my-project

data:
  path: ./data.csv
  coord_to_columns:
    time: week

This is the smallest config that will pass validation. It uses defaults for everything else: no validation, all exports enabled, no response curves, no optimisation.

Section reference

project

Top-level project metadata.

project:
  name: client-mmm        # Default: "meridian-project"
  • name — Human-readable project name. Used as the base for run directory names unless overridden by --run-name at runtime.

data

CSV data loader configuration. Maps directly to Meridian’s CsvDataLoader.

data:
  path: ./client_dataset.csv
  kpi_type: revenue                    # "revenue" (default) or "non-revenue"
  coord_to_columns:
    time: week
    geo: market                        # optional for national models
    kpi: revenue
    population: population
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]
    controls: [promo_flag, price_index]
  media_to_channel: null               # optional channel mapping overrides
  media_spend_to_channel: null
  reach_to_channel: null
  frequency_to_channel: null
  rf_spend_to_channel: null
  organic_reach_to_channel: null
  organic_frequency_to_channel: null
  • path — Path to the CSV data file. Relative paths are resolved against the directory containing the YAML config file, not the current working directory.
  • kpi_type — Either "revenue" or "non-revenue". Controls how Meridian interprets the KPI column.
  • coord_to_columns — Maps Meridian coordinate names to CSV column names. time is required. geo is optional (omit for national models).

model_spec

Raw keyword arguments forwarded to Meridian’s ModelSpec.

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi
  • kwargs — Dictionary passed through to ModelSpec(**kwargs). Supports any argument that Meridian’s ModelSpec accepts.
  • Special handling for holdout_id: if present in kwargs, the run is treated as an “authored holdout” validation run. See the validation guide for details.

fit

Sampling configuration for Meridian posterior fitting.

fit:
  sample_prior_draws: null     # Optional prior-only sampling
  n_chains: 4                  # Number of MCMC chains
  n_adapt: 500                 # Adaptation steps per chain
  n_burnin: 500                # Burn-in steps per chain
  n_keep: 1000                 # Posterior samples to keep per chain
  seed: 20260331               # Reproducibility seed (int, list[int], or null)
  max_tree_depth: 10           # NUTS max tree depth
  max_energy_diff: 500.0       # NUTS max energy difference
  unrolled_leapfrog_steps: 1   # NUTS leapfrog steps
  parallel_iterations: 10      # TF parallel iterations

All fields have sensible defaults. Override only what you need.

  • seed — Accepts a single integer, a list of integers (one per chain), or null for non-deterministic sampling.
  • sample_prior_draws — If set, prior predictive samples are drawn before posterior sampling. This is optional and primarily for model diagnostics.

validation

Validation and holdout orchestration settings. See the validation guide for strategy selection advice.

# Option 1: No validation (default)
validation:
  strategy: none

# Option 2: Blocked tail
validation:
  strategy: blocked_tail
  holdout_size: 8

# Option 3: Rolling origin
validation:
  strategy: rolling_origin
  initial_train_size: 52
  test_size: 4
  step_size: 4          # Must equal test_size
  max_splits: 3         # At least 2
  • strategy — One of "none", "blocked_tail", or "rolling_origin".
  • holdout_size — Required for blocked_tail. Number of time periods to hold out from the end of the series.
  • initial_train_size, test_size — Required for rolling_origin.
  • step_size — Optional for rolling_origin. Must equal test_size if set. Defaults to test_size.
  • max_splits — Optional for rolling_origin. Must be at least 2.

Validation rules:

  • blocked_tail rejects rolling-origin parameters.
  • rolling_origin rejects holdout_size.
  • none rejects all holdout and rolling-origin parameters.
  • Legacy holdout_size without explicit strategy is rejected.

exports

Output switches for diagnostics and model-selection artefacts.

exports:
  use_kpi: false                       # Use KPI-based metrics
  batch_size: 1000                     # Batch size for Meridian analysis
  export_predictive_accuracy: true     # Write predictive_accuracy.csv
  export_review_summary: true          # Write review_summary.json
  export_model_selection: true         # Write LOO/WAIC outputs
  export_plots: true                   # Write PNG plot artefacts

All fields have defaults. If the entire exports section is omitted, all exports are enabled with default settings.

response_curves

Optional. If omitted, the response curves stage is skipped.

response_curves:
  spend_multipliers: [0.0, 0.5, 1.0, 1.5, 2.0]
  use_posterior: true
  by_reach: true
  use_optimal_frequency: false
  confidence_level: 0.9
  • spend_multipliers — Required. Non-empty list of non-negative floats.
  • confidence_level — Must be strictly between 0 and 1.

optimisation

Optional. If omitted, the optimisation stage is skipped.

optimisation:
  start_date: "2025-01-01"
  end_date: "2025-12-31"
  budget:
    mode: fixed_total                  # or "relative_reference_window_total"
    value: 1000000.0
  use_posterior: true
  use_optimal_frequency: true
  confidence_level: 0.9
  • start_date, end_date — ISO format YYYY-MM-DD. end_date must be on or after start_date.
  • budget.mode — Either "fixed_total" (absolute budget) or "relative_reference_window_total" (multiplier against the reference window’s total spend).
  • budget.value — Positive float. For fixed_total, this is the absolute budget. For relative_reference_window_total, this is a multiplier (e.g. 1.1 means 110% of the reference window total).

Validation strictness

All configuration models use Pydantic’s extra="forbid" mode. Any unexpected key in the YAML file will produce a clear validation error. This prevents silent misconfiguration from typos or outdated keys.

$ meridian-tools run --config bad.yml
# pydantic.ValidationError: 1 validation error for MeridianToolsConfig
# exports -> export_pridictive_accuracy
#   Extra inputs are not permitted

Path resolution

Relative paths in data.path are resolved against the directory containing the YAML config file, not the current working directory. This means:

# If config is at /workspace/configs/project.yml
data:
  path: ../inputs/weekly.csv
# Resolves to /workspace/inputs/weekly.csv

The resolved path is written to config.resolved.yaml in the run directory. The original authored path is preserved in config.source.yaml.

Wrapper-owned preflight

Before meridian-tools creates a dated run directory, it performs one narrow wrapper-owned preflight check on the authored config and the resolved input CSV. Phase 10 keeps this boundary intentionally small so the wrapper does not become a second Meridian schema layer.

The wrapper checks exactly:

  • the resolved data.path exists and is a regular file
  • the CSV header row can be read
  • the parsed header is non-empty
  • no parsed header cell is blank after trimming whitespace
  • every authored scalar entry in data.coord_to_columns exists in the header
  • every authored list member in data.coord_to_columns exists in the header
  • every authored key in data.media_to_channel exists in the header
  • every authored key in data.media_spend_to_channel exists in the header
  • every authored key in data.reach_to_channel exists in the header
  • every authored key in data.frequency_to_channel exists in the header
  • every authored key in data.rf_spend_to_channel exists in the header
  • every authored key in data.organic_reach_to_channel exists in the header
  • every authored key in data.organic_frequency_to_channel exists in the header
  • authored list-valued coord families are non-empty
  • authored mapping fields above are non-empty
  • coord_to_columns.media and media_to_channel must be authored together
  • coord_to_columns.media_spend and media_spend_to_channel must be authored together
  • coord_to_columns.reach, coord_to_columns.frequency, reach_to_channel, and frequency_to_channel must be authored together
  • coord_to_columns.rf_spend and rf_spend_to_channel must be authored together
  • coord_to_columns.organic_reach and organic_reach_to_channel must be authored together
  • coord_to_columns.organic_frequency and organic_frequency_to_channel must be authored together

Matching is exact and case-sensitive. The wrapper does not normalise headers, apply aliases, or use fuzzy matching.

What remains Meridian-owned:

  • deep ModelSpec semantics
  • fit-dependent tensor or shape constraints
  • statistical validity checks that depend on model construction or sampling

So Phase 10 moves obvious wrapper-detectable mistakes earlier, but it does not promise to catch everything Meridian may reject later.

Full example

project:
  name: client-mmm

data:
  path: ./client_dataset.csv
  kpi_type: revenue
  coord_to_columns:
    time: week
    geo: market
    kpi: revenue
    population: population
    media: [impressions_tv, impressions_search]
    media_spend: [spend_tv, spend_search]
    controls: [promo_flag, price_index]

model_spec:
  kwargs:
    max_lag: 8
    media_prior_type: roi

fit:
  n_chains: 4
  n_adapt: 500
  n_burnin: 500
  n_keep: 1000
  seed: 20260331

validation:
  strategy: blocked_tail
  holdout_size: 8

exports:
  export_predictive_accuracy: true
  export_review_summary: true
  export_model_selection: true

response_curves:
  spend_multipliers: [0.0, 0.5, 1.0, 1.5, 2.0]
  use_posterior: true
  by_reach: true
  use_optimal_frequency: false
  confidence_level: 0.9

optimisation:
  start_date: "2025-01-01"
  end_date: "2025-12-31"
  budget:
    mode: fixed_total
    value: 1000000.0
  use_posterior: true
  use_optimal_frequency: true
  confidence_level: 0.9