
Configuration

ImpulseConfig configures everything about a report: the silver-layer input tables, the gold-layer output location, container-level filters, the query-engine solver, incremental processing, and which container columns get surfaced into the gold-layer measurement dimension. Configuration is defined as JSON (or an equivalent Python dictionary) and validated using Pydantic models. The canonical schema lives in src/impulse_reporting/config/config_parser.py.

Quick example

{
  "source": {
    "container_metrics_table": "my_catalog.silver.container_metrics",
    "channel_metrics_table": "my_catalog.silver.channel_metrics",
    "channels_uri": "my_catalog.silver.channels",
    "container_tags_table": "my_catalog.silver.container_tags",
    "channel_tags_table": "my_catalog.silver.channel_tags"
  },
  "unity_sink": {
    "catalog": "my_catalog",
    "schema": "gold",
    "table_prefix": "my_report"
  },
  "query_engine": {
    "solver": "DeltaSolver",
    "data_type": "RAW"
  },
  "container_filters": {
    "tag_filters": [
      [
        { "tag_name": "uut_id", "comparator": "==", "value": "ABC123", "cast_type": "string" }
      ]
    ],
    "metric_filters": [
      [
        { "column_name": "start_dt", "comparator": ">=", "value": "2025-04-27T05:20:54.000Z", "value_type": "timestamp" }
      ]
    ]
  },
  "measurement_dimensions": ["container_id", "vehicle_key", "start_ts", "stop_ts"]
}

A configuration is passed to Report either as a Python dict (config=...) or as a JSON file path (config_path=...). Sinkless mode is also supported — see Sinkless reports.
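
As a rough illustration of the two entry points, the sketch below builds a trimmed-down configuration as a Python dict and writes the same content to a JSON file. The helper write_config and the file name report_config.json are hypothetical. The Report calls appear only as comments, because this section names the config= and config_path= keywords but not the import path.

```python
import json
import tempfile
from pathlib import Path

# Trimmed-down configuration mirroring the quick example above.
config = {
    "source": {
        "container_metrics_table": "my_catalog.silver.container_metrics",
        "channel_metrics_table": "my_catalog.silver.channel_metrics",
        "channels_uri": "my_catalog.silver.channels",
    },
    "unity_sink": {"catalog": "my_catalog", "schema": "gold", "table_prefix": "my_report"},
    "query_engine": {"solver": "DeltaSolver", "data_type": "RAW"},
}


def write_config(cfg: dict, directory: str) -> Path:
    """Hypothetical helper: serialize the dict so it can be passed via config_path."""
    path = Path(directory) / "report_config.json"
    path.write_text(json.dumps(cfg, indent=2), encoding="utf-8")
    return path


config_path = write_config(config, tempfile.mkdtemp())

# Both forms now carry the same content:
#   Report(config=config)                  # Python dict
#   Report(config_path=str(config_path))   # JSON file path
# (the import of Report is omitted; its module path is not shown in this section)
round_tripped = json.loads(config_path.read_text(encoding="utf-8"))
```

Keeping the dict as the single source and deriving the file from it avoids the two forms drifting apart in notebooks and tests.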


source

Maps the silver-layer input tables.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| container_metrics_table | str | Yes | Full Unity Catalog path. Container metadata (timestamps, duration). |
| channel_metrics_table | str | Yes | Full Unity Catalog path. Channel-level statistics. |
| channels_uri | str | Yes | Full Unity Catalog path. Time-series sample data. |
| container_tags_table | str | No | Full Unity Catalog path. Container EAV tags. |
| channel_tags_table | str | No | Full Unity Catalog path. Channel EAV tags. |
| channel_mapping_table | str | No | Full Unity Catalog path. Logical-to-physical channel alias table. Required when using QueryBuilder.channel_with_alias() (currently supported by KeyValueStoreSolver). |

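Tag tables are required for solvers that consume tag-based filters: DeltaSolver with tag filters, and KeyValueStoreSolver when it runs against the narrow EAV model. In the wide-only model, source.container_tags_table can be omitted (see query_engine).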


unity_sink

Defines where gold-layer tables are written.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| catalog | str | Yes | Target catalog name. |
| schema | str | Yes | Target schema name. |
| table_prefix | str | Yes | Prefix for all generated table names. |

Output tables are named {table_prefix}_{entity} (e.g. my_report_histogram_fact).
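
The naming rule can be sketched as a small helper. The function gold_table_name is hypothetical, and joining catalog, schema, and table with dots is an assumption based on the usual Unity Catalog three-level namespace rather than something this page states.

```python
def gold_table_name(unity_sink: dict, entity: str) -> str:
    """Build a fully qualified gold table name from a unity_sink section."""
    # Documented rule: {table_prefix}_{entity}
    table = f"{unity_sink['table_prefix']}_{entity}"
    # Assumption: catalog.schema.table (three-level namespace).
    return f"{unity_sink['catalog']}.{unity_sink['schema']}.{table}"


sink = {"catalog": "my_catalog", "schema": "gold", "table_prefix": "my_report"}
print(gold_table_name(sink, "histogram_fact"))
# prints: my_catalog.gold.my_report_histogram_fact
```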

Sinkless reports

unity_sink is optional. When omitted, the report runs in sinkless mode: determine_report() still computes events, aggregations, and container dimensions and exposes them on the report object, but persist_results() becomes a no-op. This is useful for ad-hoc analysis, notebooks, and tests where writing to Unity Catalog is not desired.


container_filters (optional)

Restricts the set of processed containers. Filters are expressed in disjunctive normal form (OR of ANDs): each inner list is AND-combined, the outer list is OR-combined.

Two independent filter families:

  • tag_filters: applied on container_tags_table (EAV key/value model).
  • metric_filters: applied on container_metrics_table (columnar model).

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| tag_filters | list[list[TagFilter]] | [] | Tag-based filter groups (DNF). |
| metric_filters | list[list[MetricFilter]] | [] | Metric-based filter groups (DNF). |

TagFilter

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| tag_name | str | Yes | Tag key to filter on. |
| comparator | str | Yes | One of ==, !=, >, >=, <, <=. |
| value | any | Yes | Expected value. Must match cast_type. |
| cast_type | str | No | string (default), int, double, or timestamp (ISO-format string). |
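
To make the DNF semantics concrete, here is a minimal in-memory sketch of how a list of TagFilter groups could be evaluated against one container's tags. It is not the framework's implementation (the solvers work on tables). It also assumes that an empty filter list keeps every container, that a container lacking the tag fails the filter, and that both sides are cast with cast_type before comparing. The tag names run and site are made up for the example.

```python
import operator
from datetime import datetime

COMPARATORS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}


def parse_timestamp(text):
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on.
    return datetime.fromisoformat(str(text).replace("Z", "+00:00"))


CASTS = {"string": str, "int": int, "double": float, "timestamp": parse_timestamp}


def tag_filter_matches(tags: dict, flt: dict) -> bool:
    """Evaluate one TagFilter against a container's {tag_name: raw_value} dict."""
    if flt["tag_name"] not in tags:
        return False  # assumption: a missing tag never matches
    cast = CASTS[flt.get("cast_type", "string")]
    compare = COMPARATORS[flt["comparator"]]
    return compare(cast(tags[flt["tag_name"]]), cast(flt["value"]))


def matches_dnf(tags: dict, tag_filters: list) -> bool:
    """OR over groups, AND within a group; an empty list keeps everything."""
    if not tag_filters:
        return True
    return any(all(tag_filter_matches(tags, f) for f in group) for group in tag_filters)


tag_filters = [
    # group 1: uut_id == "ABC123" AND run >= 10
    [{"tag_name": "uut_id", "comparator": "==", "value": "ABC123"},
     {"tag_name": "run", "comparator": ">=", "value": 10, "cast_type": "int"}],
    # group 2: site == "lab_2"
    [{"tag_name": "site", "comparator": "==", "value": "lab_2"}],
]

print(matches_dnf({"uut_id": "ABC123", "run": "12"}, tag_filters))                   # True (group 1)
print(matches_dnf({"uut_id": "ABC123", "run": "3", "site": "lab_2"}, tag_filters))  # True (group 2)
print(matches_dnf({"uut_id": "XYZ", "run": "12"}, tag_filters))                      # False
```

The cast matters: without cast_type "int", the string "3" would compare greater than "10" lexicographically.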

MetricFilter

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| column_name | str | Yes | Column on container_metrics_table to filter on (e.g. start_dt, stop_dt). When solver_config.container_metrics.column_name_mapping is set, this refers to the internal name (after renaming). |
| comparator | str | Yes | One of ==, !=, >, >=, <, <=. |
| value | any | Yes | Expected value. Must match value_type when provided. |
| value_type | str | No | When provided, validates/converts the value: string, int, double, timestamp. |

query_engine (optional)

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| solver | str | "KeyValueStoreSolver" | One of "DeltaSolver", "KeyValueStoreSolver". "KeyValueStoreSolver" works either against a narrow EAV container_tags table or, when source.container_tags_table is omitted, against a wide-only data model where container attributes live directly on container_metrics. |
| data_type | str | "RLE" | "RLE" (intervals [tstart, tend)) or "RAW" (raw timestamps; converted to RLE before aggregation). |
| drop_implausible_data | bool | false | When true, drops channels rows where is_plausible = false. Requires data_type = "RAW"; combining with "RLE" raises a validation error. |
| batch_size | int | 500 | Maximum number of selectors solved per batch. |
| solver_config | SolverConfig | null | Per-table column mappings, per-table equality filters, and project scoping. Set project_id to scope reads by project; it is applied to container_tags (if configured), container_metrics, and channel_mapping (if configured), so it works in both narrow EAV and wide-only data models. Omit it when you don't need project scoping. See Solver column mappings and filters. |

If query_engine is omitted, the default is KeyValueStoreSolver with data_type = "RLE".
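
The defaults and the one cross-field rule in the table can be sketched with a plain dataclass. QueryEngineSketch is a hypothetical stand-in written with the standard library; the framework itself validates with Pydantic models, and its real class names and error types are not shown here.

```python
from dataclasses import dataclass
from typing import Optional

SOLVERS = {"DeltaSolver", "KeyValueStoreSolver"}
DATA_TYPES = {"RLE", "RAW"}


@dataclass
class QueryEngineSketch:
    """Hypothetical stand-in for the query_engine section, not the framework's model."""
    solver: str = "KeyValueStoreSolver"
    data_type: str = "RLE"
    drop_implausible_data: bool = False
    batch_size: int = 500
    solver_config: Optional[dict] = None

    def __post_init__(self):
        if self.solver not in SOLVERS:
            raise ValueError(f"unknown solver: {self.solver!r}")
        if self.data_type not in DATA_TYPES:
            raise ValueError(f"unknown data_type: {self.data_type!r}")
        # Documented rule: plausibility filtering requires RAW data.
        if self.drop_implausible_data and self.data_type != "RAW":
            raise ValueError('drop_implausible_data requires data_type = "RAW"')


defaults = QueryEngineSketch()  # KeyValueStoreSolver, RLE, batch_size 500
raw = QueryEngineSketch(solver="DeltaSolver", data_type="RAW", drop_implausible_data=True)

try:
    QueryEngineSketch(drop_implausible_data=True)  # data_type defaults to "RLE"
except ValueError as err:
    print(err)
```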


Solver column mappings and filters

The framework references columns by a fixed set of internal names (e.g. container_id, channel_id, tstart, tend, value). When your silver-layer tables use different physical names, declare the mapping in solver_config so the solver renames each table's columns at read time.

SolverConfig has one section per silver table. Each section is a TableConfig with two fields:

  • column_name_mapping (dict[str, str]): { "physical_column": "internal_column" }. The mapping is applied once, when the table is read. All downstream processing (filters, joins, aggregations) uses the internal names.
  • filters (dict[str, str]): equality filters applied after renaming. Keys are internal column names; values are literals to match. Useful for project/toolbox scoping where a single value should always be enforced.
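
The order of operations (rename first, then filter on internal names) can be sketched on plain Python rows. The function read_table and the list-of-dicts representation are illustrative assumptions; the real solvers operate on tables, not dicts.

```python
def read_table(rows: list, table_config: dict) -> list:
    """Apply column_name_mapping first, then equality filters on internal names."""
    mapping = table_config.get("column_name_mapping", {})
    filters = table_config.get("filters", {})

    # Step 1: rename physical columns to internal names (unmapped columns pass through).
    renamed = [{mapping.get(col, col): val for col, val in row.items()} for row in rows]

    # Step 2: keep rows where every filtered internal column equals its literal.
    return [r for r in renamed if all(r.get(k) == v for k, v in filters.items())]


rows = [
    {"entity_id": "c1", "parent_id": "my_parent_id"},
    {"entity_id": "c2", "parent_id": "other"},
]
table_config = {
    "column_name_mapping": {"entity_id": "container_id"},
    "filters": {"parent_id": "my_parent_id"},
}
print(read_table(rows, table_config))
# prints: [{'container_id': 'c1', 'parent_id': 'my_parent_id'}]
```

Because filters run after renaming, a filter keyed on a physical name that has been mapped away would match nothing in this sketch.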

Top-level fields on SolverConfig:

  • project_id (str, optional): Project scope. When set, the solver applies an equality filter on the project_id column (after column-name mapping) of every table it reads that carries one — container_tags (if configured), container_metrics, and channel_mapping (if configured). Omit it if you don't need project-level scoping; the solver does not require it.

Per-table sections (each a TableConfig):

| Section | Used by | Typical mappings |
| --- | --- | --- |
| container_tags | DeltaSolver, KeyValueStoreSolver | entity_id → container_id, custom EAV key/value columns |
| container_metrics | All solvers | Custom container_id column, custom timestamp columns |
| channel_tags | DeltaSolver | Tag key/value column renames |
| channel_metrics | All solvers | Custom channel_id column, custom value/timestamp columns |
| channel_mapping | KeyValueStoreSolver | Alias-table column renames; priority column |
| channels | All solvers | RLE column renames (tstart/tend/value) |

Internal column names that mappings can target:

| Internal name | Description |
| --- | --- |
| container_id | Container identifier |
| channel_id | Channel identifier |
| tstart, tend | Sample interval start/end (RLE) |
| value | Sample value (or attribute value on the EAV tag table) |
| key | Attribute key on the EAV container_tags table |
| priority | Tie-breaker column on the channel_mapping table |
| project_id | Project scoping column |
| parent_id | Parent/scope identifier |

Per-solver feature support

solver_config in your JSON config is forwarded to both KeyValueStoreSolver and DeltaSolver by the Report factory. However, only the parts each solver supports are actually consumed:

  • KeyValueStoreSolver uses all sections: per-table column_name_mapping, per-table filters, and top-level project_id.
  • DeltaSolver uses only the per-table column_name_mapping entries on container_tags, container_metrics, channel_tags, channel_metrics, and channels. Per-table filters, top-level project_id, and the channel_mapping section (alias resolution) are silently ignored — the solver class does not read them.
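
Since unsupported entries are dropped without any warning, a small pre-flight check can help. The sketch below encodes the rules from the list above; ignored_by_delta_solver is a hypothetical helper and not part of the framework.

```python
# Sections whose column_name_mapping DeltaSolver consumes, per the list above.
DELTA_SOLVER_TABLES = {
    "container_tags", "container_metrics", "channel_tags", "channel_metrics", "channels",
}


def ignored_by_delta_solver(solver_config: dict) -> list:
    """List solver_config entries that DeltaSolver would not read."""
    ignored = []
    for key, section in solver_config.items():
        if key == "project_id":
            ignored.append("project_id")
        elif key not in DELTA_SOLVER_TABLES:
            # e.g. channel_mapping: the whole section is skipped
            ignored.append(key)
        elif section.get("filters"):
            # mapping is used, filters are not
            ignored.append(f"{key}.filters")
    return ignored


solver_config = {
    "project_id": "my_project",
    "container_tags": {
        "column_name_mapping": {"entity_id": "container_id"},
        "filters": {"parent_id": "my_parent_id"},
    },
    "channel_mapping": {"filters": {"toolbox_id": "my_toolbox"}},
    "channels": {"column_name_mapping": {}},
}
print(ignored_by_delta_solver(solver_config))
# prints: ['project_id', 'container_tags.filters', 'channel_mapping']
```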

Example: KeyValueStoreSolver with renamed columns and per-table filters

"query_engine": {
  "solver": "KeyValueStoreSolver",
  "solver_config": {
    "project_id": "my_project",
    "container_tags": {
      "column_name_mapping": {"entity_id": "container_id"},
      "filters": {"parent_id": "my_parent_id"}
    },
    "container_metrics": {
      "column_name_mapping": {"start_dt": "tstart", "stop_dt": "tend"}
    },
    "channel_metrics": {
      "column_name_mapping": {}
    },
    "channel_mapping": {
      "column_name_mapping": {},
      "filters": {"toolbox_id": "my_toolbox"}
    },
    "channels": {
      "column_name_mapping": {}
    }
  }
}

Sections you don't customize can be omitted; defaults are an empty mapping and no filters.

When to use what

  • solver_config.<table>.column_name_mapping — your silver-layer column is named differently from the framework's internal name (e.g. entity_id instead of container_id).
  • container_filters.tag_filters / metric_filters — choose which containers participate in this particular run (supports comparators, OR/AND combinations, and type casting). Refer to internal column names when solver_config rewrites them.

incremental (optional)

Incremental processing reuses results from prior runs for unchanged definitions and reprocesses only containers that are new or have been updated in silver. See the Report reference for mode-resolution rules and what counts as a definition change.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Turns incremental processing on. |
| silver_last_modified_column | str | "timestamp" | Silver-side column used to detect container updates. |
| gold_last_modified_column | str | "_created_at" | Gold-side column used to detect prior-run freshness. |
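
One plausible reading of how the two columns interact is sketched below: a container is reprocessed when it has no gold rows yet, or when its silver timestamp is newer than its latest gold write. This comparison is an assumption made for illustration. The authoritative mode-resolution rules are in the Report reference, and containers_to_process is a hypothetical function.

```python
from datetime import datetime


def containers_to_process(silver_rows, gold_rows,
                          silver_col="timestamp", gold_col="_created_at"):
    """Return container_ids that are new, or changed in silver since the last gold write."""
    # Latest gold write per container.
    last_gold = {}
    for row in gold_rows:
        cid = row["container_id"]
        if cid not in last_gold or row[gold_col] > last_gold[cid]:
            last_gold[cid] = row[gold_col]

    selected = []
    for row in silver_rows:
        cid = row["container_id"]
        # New container, or silver modified after the prior run wrote gold.
        if cid not in last_gold or row[silver_col] > last_gold[cid]:
            selected.append(cid)
    return selected


silver = [
    {"container_id": "c1", "timestamp": datetime(2025, 1, 1)},  # unchanged
    {"container_id": "c2", "timestamp": datetime(2025, 1, 5)},  # updated after last run
    {"container_id": "c3", "timestamp": datetime(2025, 1, 2)},  # never processed
]
gold = [
    {"container_id": "c1", "_created_at": datetime(2025, 1, 3)},
    {"container_id": "c2", "_created_at": datetime(2025, 1, 3)},
]
print(containers_to_process(silver, gold))
# prints: ['c2', 'c3']
```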

measurement_dimensions (optional)

List of container_metrics columns to surface into the gold-layer measurement_dimension table.

Allowed values:

| Value | Description |
| --- | --- |
| container_id | Container identifier. |
| uut_id | Unit-under-test identifier. |
| project_id | Project identifier. |
| vehicle_key | Vehicle identifier. |
| file_name | Source measurement file name. |
| source_file_path | Full path to the source file. |
| start_ts | Measurement start timestamp. |
| stop_ts | Measurement stop timestamp. |
| environment | Recording environment (e.g. PUMA, datalogger). |

Default:

[
  "container_id",
  "uut_id",
  "file_name",
  "source_file_path",
  "start_ts",
  "stop_ts",
  "project_id",
  "environment"
]

Pick the entries that match the columns actually present in your container_metrics_table. Columns referenced here must exist in your silver schema; columns that don't appear here are ignored even if they exist in silver.
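
A quick check before running a report can catch both kinds of mistake: values outside the allowed set, and allowed values that the silver table does not actually carry. The helper check_measurement_dimensions is hypothetical, and the column driver is invented to show a rejected value.

```python
ALLOWED_DIMENSIONS = {
    "container_id", "uut_id", "project_id", "vehicle_key", "file_name",
    "source_file_path", "start_ts", "stop_ts", "environment",
}


def check_measurement_dimensions(dimensions, silver_columns):
    """Return (unknown, missing).

    unknown: entries that are not allowed values.
    missing: allowed entries that are absent from the silver columns.
    """
    unknown = [d for d in dimensions if d not in ALLOWED_DIMENSIONS]
    missing = [d for d in dimensions
               if d in ALLOWED_DIMENSIONS and d not in silver_columns]
    return unknown, missing


dimensions = ["container_id", "vehicle_key", "start_ts", "stop_ts", "driver"]
silver_columns = {"container_id", "uut_id", "start_ts", "stop_ts"}
print(check_measurement_dimensions(dimensions, silver_columns))
# prints: (['driver'], ['vehicle_key'])
```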