Data Sources

Impulse reads measurement data from a Silver layer and writes structured analytics results to a Gold layer. Both layers are stored in Unity Catalog as Delta Lake tables.


Silver layer (input)

The Silver layer contains five tables that represent measurement data in a normalized, tag-based model. All table names follow Unity Catalog naming: catalog.schema.table.

container_tags

Key-value metadata tags for measurement containers.

| Column | Type | Nullable | Description |
|---|---|---|---|
| container_id | long | No | Unique container identifier. |
| key | string | Yes | Tag key (e.g. "vehicle_key", "project_id"). |
| value | string | Yes | Tag value. |
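
Because tags are stored one key-value pair per row, a consumer usually regroups them per container before use. The following is a minimal sketch of that regrouping in plain Python, not part of Impulse itself. The function name `pivot_tags` and the sample values are hypothetical, and it assumes that a later row wins when a container repeats a key.

```python
from collections import defaultdict

def pivot_tags(rows):
    """Group (container_id, key, value) rows into one dict per container."""
    tags = defaultdict(dict)
    for container_id, key, value in rows:
        if key is None:
            # key is nullable in the schema; a row without a key carries no usable tag
            continue
        # Assumption: if a key repeats for a container, the later row wins
        tags[container_id][key] = value
    return dict(tags)

# Hypothetical sample rows in the container_tags layout
rows = [
    (1, "vehicle_key", "V-001"),
    (1, "project_id", "P-42"),
    (2, "vehicle_key", "V-002"),
    (2, None, "orphan"),
]
container_tags = pivot_tags(rows)
print(container_tags[1])
```

The same pattern applies to channel_tags if the grouping key is the pair (container_id, channel_id).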

container_metrics

Numeric metadata for each container (timestamps, duration, channel count).

| Column | Type | Nullable | Description |
|---|---|---|---|
| container_id | long | No | Unique container identifier. |
| start_dt | timestamp | Yes | Container start datetime. |
| stop_dt | timestamp | Yes | Container stop datetime. |
| duration_ms | int | Yes | Total duration in milliseconds. |
| num_channels | int | Yes | Number of channels in the container. |

channel_tags

Key-value metadata tags for individual channels within a container.

| Column | Type | Nullable | Description |
|---|---|---|---|
| container_id | long | No | Parent container identifier. |
| channel_id | int | No | Channel identifier within the container. |
| key | string | Yes | Tag key (e.g. "channel_name", "brand", "model"). |
| value | string | Yes | Tag value. |

channel_metrics

Numeric metadata and pre-computed statistics for individual channels.

| Column | Type | Nullable | Description |
|---|---|---|---|
| container_id | long | No | Parent container identifier. |
| channel_id | int | No | Channel identifier. |
| value_type | string | Yes | Data type of channel values. |
| sample_count | int | Yes | Number of samples. |
| nan_ratio | float | Yes | Ratio of NaN values. |
| begin_s | float | Yes | Channel start time (seconds). |
| end_s | float | Yes | Channel end time (seconds). |
| duration_ms | int | Yes | Channel duration in milliseconds. |
| original_sample_count | int | Yes | Sample count before processing. |
| original_sr | float | Yes | Original sample rate. |
| min | float | Yes | Minimum value. |
| max | float | Yes | Maximum value. |
| mean | float | Yes | Mean value. |
| std | float | Yes | Standard deviation. |
| pz1 | float | Yes | 1st percentile. |
| pz10 | float | Yes | 10th percentile. |
| pz90 | float | Yes | 90th percentile. |
| pz99 | float | Yes | 99th percentile. |

channels

The actual time-series data. The table supports two format variants:

RLE format (default)

Pre-encoded with Run-Length Encoding. Each row represents one sample interval [tstart, tend) with a constant value.

| Column | Type | Nullable | Description |
|---|---|---|---|
| container_id | long | No | Parent container identifier. |
| channel_id | int | No | Channel identifier. |
| tstart | long | No | Sample start timestamp (microseconds). |
| tend | long | No | Sample end timestamp (microseconds). |
| value | double | Yes | Sample value. |
| is_plausible | boolean | Yes | Whether the data point is plausible. |
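
To illustrate the half-open interval [tstart, tend), here is a small sketch that looks up the value of one channel at a given time. It is not Impulse code: the helper `value_at` is hypothetical, and it assumes the intervals of a channel are sorted by tstart, do not overlap, and may contain gaps.

```python
from bisect import bisect_right

def value_at(intervals, t):
    """Return the value of the interval containing t, or None if t falls in a gap.

    intervals: list of (tstart, tend, value) for one channel, sorted by tstart
    and non-overlapping; each row covers the half-open range [tstart, tend).
    """
    starts = [row[0] for row in intervals]
    i = bisect_right(starts, t) - 1  # last interval starting at or before t
    if i < 0:
        return None  # t lies before the first interval
    tstart, tend, value = intervals[i]
    return value if tstart <= t < tend else None

# Hypothetical channel with timestamps in microseconds and a gap from 3 s to 4 s
channel = [
    (0, 1_000_000, 12.5),
    (1_000_000, 3_000_000, 13.0),
    (4_000_000, 5_000_000, 12.5),
]
print(value_at(channel, 1_000_000))  # boundary belongs to the second interval
```

Because tend is exclusive, a timestamp that equals the boundary between two rows resolves to the later row, and a timestamp equal to the final tend resolves to no row.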

Raw format (requires data_type: RAW)

Raw timestamp-based data without RLE encoding. Each row represents a single sample at a point in time. When data_type is set to RAW in the config, the framework derives tend for each sample from the subsequent timestamps of the same channel and converts the data into a supported format before query execution. No manual pre-processing is required.

| Column | Type | Nullable | Description |
|---|---|---|---|
| container_id | long | No | Parent container identifier. |
| channel_id | int | No | Channel identifier. |
| timestamp | long | No | Sample timestamp (microseconds). |
| value | double | Yes | Sample value. |
| is_plausible | boolean | Yes | Whether the data point is plausible. |

Consecutive rows for the same channel form a SampleSeries.
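
The derivation of tend from subsequent timestamps can be pictured with the sketch below. It is an illustration under stated assumptions, not the framework's implementation: the function `raw_to_intervals` is hypothetical, the next sample's timestamp is used as tend, and the last sample of a channel is given a zero-length interval because the source does not say how the framework closes it. The sketch does not merge consecutive equal values.

```python
from collections import defaultdict

def raw_to_intervals(rows):
    """Turn raw samples into [tstart, tend) intervals per channel.

    rows: iterable of (container_id, channel_id, timestamp, value).
    Returns a list of (container_id, channel_id, tstart, tend, value).
    """
    by_channel = defaultdict(list)
    for container_id, channel_id, timestamp, value in rows:
        by_channel[(container_id, channel_id)].append((timestamp, value))

    intervals = []
    for (container_id, channel_id), samples in sorted(by_channel.items()):
        samples.sort(key=lambda s: s[0])  # order by timestamp only; value may be None
        for i, (timestamp, value) in enumerate(samples):
            if i + 1 < len(samples):
                tend = samples[i + 1][0]  # next sample's timestamp closes this interval
            else:
                tend = timestamp  # assumption: last sample gets a zero-length interval
            intervals.append((container_id, channel_id, timestamp, tend, value))
    return intervals

# Hypothetical raw rows (timestamps in microseconds), deliberately out of order
raw = [
    (1, 0, 500, 10.0),
    (1, 0, 0, 10.0),
    (1, 0, 1500, 11.0),
    (1, 1, 0, 3.0),
]
for row in raw_to_intervals(raw):
    print(row)
```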


Gold layer (output)

The Gold layer uses a star schema to store analytics results. All table names are prefixed with the configured table_prefix. For example, with table_prefix = "my_report", tables are named my_report_histogram_fact, my_report_histogram_dimension, etc.

Schema overview

Fact tables

| Table | Key Columns | Description |
|---|---|---|
| {prefix}_histogram_fact | container_id, visual_id, event_id, bin_id | 1D histogram bin values per container. |
| {prefix}_histogram2d_fact | container_id, visual_id, event_id, x_bin_id, y_bin_id | 2D histogram bin values per container. |
| {prefix}_stats_aggregator_fact | container_id, visual_id, event_instance_id, channel_name, aggregation_label | Statistics values per signal, event instance, and container. |
| {prefix}_event_instance_fact | container_id, event_id, event_instance_id | Materialized event occurrences with start/end timestamps. |

Dimension tables

| Table | Key Columns | Description |
|---|---|---|
| {prefix}_histogram_dimension | visual_id, report_id | Histogram metadata (name, bins, signal info, units). |
| {prefix}_histogram2d_dimension | visual_id, report_id | 2D histogram metadata (axes, bins, signal info, units). |
| {prefix}_stats_aggregator_dimension | visual_id, report_id | Statistics metadata (signals, aggregation labels). |
| {prefix}_event_dimension | event_id, report_id | Event definitions (name, expression, required channels). |
| {prefix}_measurement_dimension | container_id | Container metadata from configured measurement dimensions. |

Configuration

Data source and sink are configured in the report's JSON configuration file. See the Report reference for the full configuration schema.

Source configuration

Maps Silver layer tables to Unity Catalog paths:

{
  "source": {
    "container_metrics_table": "catalog.schema.container_metrics",
    "channel_metrics_table": "catalog.schema.channel_metrics",
    "channels_uri": "catalog.schema.channels",
    "container_tags_table": "catalog.schema.container_tags",
    "channel_tags_table": "catalog.schema.channel_tags"
  }
}

| Field | Required | Description |
|---|---|---|
| container_metrics_table | Yes | Container timestamps and numeric metadata. |
| channel_metrics_table | Yes | Channel-level statistics. |
| channels_uri | Yes | Time-series data (RLE or raw format). |
| container_tags_table | No | Container key-value tags. |
| channel_tags_table | No | Channel key-value tags. |
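
A configuration like the one above can be sanity-checked before a run. The sketch below is a hypothetical pre-flight check, not an Impulse API: `check_source` and `is_three_part` are illustrative names. It assumes every value is a plain three-part catalog.schema.table name (no backtick-quoted identifiers containing dots) and treats unknown fields as errors, which the framework itself may not do.

```python
REQUIRED = ("container_metrics_table", "channel_metrics_table", "channels_uri")
OPTIONAL = ("container_tags_table", "channel_tags_table")

def is_three_part(name):
    """True if name looks like catalog.schema.table with no empty part."""
    parts = name.split(".") if isinstance(name, str) else []
    return len(parts) == 3 and all(parts)

def check_source(config):
    """Return a list of problems found in config["source"]; empty means OK."""
    source = config.get("source", {})
    problems = [f"missing required field: {f}" for f in REQUIRED if f not in source]
    for field, name in source.items():
        if field not in REQUIRED + OPTIONAL:
            # Assumption: fields not listed in the table above are mistakes
            problems.append(f"unknown field: {field}")
        elif not is_three_part(name):
            problems.append(f"{field} is not a catalog.schema.table name: {name!r}")
    return problems

good = {
    "source": {
        "container_metrics_table": "catalog.schema.container_metrics",
        "channel_metrics_table": "catalog.schema.channel_metrics",
        "channels_uri": "catalog.schema.channels",
    }
}
bad = {
    "source": {
        "container_metrics_table": "catalog.schema.container_metrics",
        "channel_metrics_table": "schema.channel_metrics",
    }
}
print(check_source(good))
print(check_source(bad))
```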

Sink configuration

Defines where Gold layer tables are written:

{
  "unity_sink": {
    "catalog": "my_catalog",
    "schema": "gold",
    "table_prefix": "my_report"
  }
}

| Field | Required | Description |
|---|---|---|
| catalog | Yes | Target Unity Catalog name. |
| schema | Yes | Target schema name. |
| table_prefix | Yes | Prefix for all generated table names. |
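
To see which tables a sink configuration would produce, the sketch below expands the three sink fields into table names. The suffixes come from the schema overview on this page; the helper `gold_table_names` is hypothetical, and the fully qualified form catalog.schema.{prefix}_{suffix} is an assumption based on the Unity Catalog naming used for the Silver layer.

```python
FACT_SUFFIXES = (
    "histogram_fact",
    "histogram2d_fact",
    "stats_aggregator_fact",
    "event_instance_fact",
)
DIMENSION_SUFFIXES = (
    "histogram_dimension",
    "histogram2d_dimension",
    "stats_aggregator_dimension",
    "event_dimension",
    "measurement_dimension",
)

def gold_table_names(sink):
    """Expand a unity_sink mapping into assumed fully qualified Gold table names."""
    base = f'{sink["catalog"]}.{sink["schema"]}.{sink["table_prefix"]}'
    return [f"{base}_{suffix}" for suffix in FACT_SUFFIXES + DIMENSION_SUFFIXES]

sink = {"catalog": "my_catalog", "schema": "gold", "table_prefix": "my_report"}
names = gold_table_names(sink)
for name in names:
    print(name)
```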

Measurement dimensions

Controls which columns from container_metrics are included in the measurement_dimension table:

{
  "measurement_dimensions": ["container_id", "vehicle_key", "start_ts", "stop_ts"]
}

Available dimension values: container_id, uut_id, vehicle_key, file_name, source_file_path, start_ts, stop_ts, environment, project_id.
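
A requested list can be compared against the available values before a report is submitted. This is a minimal sketch with a hypothetical helper, `unknown_dimensions`; how Impulse itself reacts to an unsupported value is not stated here.

```python
AVAILABLE_DIMENSIONS = (
    "container_id", "uut_id", "vehicle_key", "file_name", "source_file_path",
    "start_ts", "stop_ts", "environment", "project_id",
)

def unknown_dimensions(requested):
    """Return the requested values that are not in the documented list, in order."""
    return [d for d in requested if d not in AVAILABLE_DIMENSIONS]

config = {"measurement_dimensions": ["container_id", "vehicle_key", "start_ts", "stop_ts"]}
print(unknown_dimensions(config["measurement_dimensions"]))

# "start_dt" is a container_metrics column name, not one of the listed dimension values
print(unknown_dimensions(["container_id", "start_dt"]))
```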