Provided by Databricks Labs
Impulse is a Python-based analytics library designed for processing large-scale time-series measurement data.
Capabilities
Time-Series Query Language
Express signal arithmetic, event conditions, and aggregations in TSAL, a concise, MATLAB-style Python syntax.
Pluggable Query Engine
Compile TSAL expressions into distributed Spark execution via interchangeable solvers tuned to each silver-layer layout.
Domain-Specific Data Model
Measurement recordings are modeled as containers of channels, each enriched with container- and channel-level attributes and metrics.
Domain-Aware Aggregations
Compute histograms, 2D heatmaps, and event-scoped statistics, weighted by duration, distance, or a custom expression.
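To make "weighted by duration" concrete, here is a minimal plain-Python sketch of a duration-weighted histogram. It does not use Impulse's API; the function name `duration_weighted_histogram` and the sample data are hypothetical. It assumes each sample holds its value until the next timestamp, so the final sample (whose duration is unknown) contributes nothing.

```python
from bisect import bisect_right
from typing import List, Sequence


def duration_weighted_histogram(
    timestamps: Sequence[float],
    values: Sequence[float],
    bin_edges: Sequence[float],
) -> List[float]:
    """Accumulate time spent in each half-open bin [edge_i, edge_i+1).

    Hypothetical helper for illustration only, not part of Impulse.
    """
    totals = [0.0] * (len(bin_edges) - 1)
    for i in range(len(values) - 1):
        # Assumption: the value at sample i holds until sample i + 1.
        duration = timestamps[i + 1] - timestamps[i]
        idx = bisect_right(bin_edges, values[i]) - 1
        # Values outside the outer edges are ignored.
        if 0 <= idx < len(totals):
            totals[idx] += duration
    return totals


# Irregularly sampled channel (seconds, arbitrary units).
timestamps = [0.0, 1.0, 3.0, 3.5, 6.0]
values = [10, 25, 25, 40, 5]
histogram = duration_weighted_histogram(timestamps, values, [0, 20, 40, 60])
print(histogram)
```

A count-based histogram would give every sample equal weight; weighting by duration keeps irregular sampling rates from skewing the result.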
Event Detection
Define events from boolean signal logic and extract event instances with start/end timestamps.
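As an illustration of the idea (not of TSAL or the Impulse API), the following plain-Python sketch turns a boolean condition on two signals into event instances with start and end timestamps. The name `extract_events`, the signal names, and the thresholds are hypothetical. It assumes both signals share one time grid and that an event ends at the first sample where the condition becomes false.

```python
from typing import List, Sequence, Tuple


def extract_events(
    timestamps: Sequence[float], active: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (start, end) for each contiguous run where `active` is True.

    Hypothetical helper for illustration only, not part of Impulse.
    """
    events = []
    start = None
    for t, flag in zip(timestamps, active):
        if flag and start is None:
            start = t  # rising edge: the event begins
        elif not flag and start is not None:
            events.append((start, t))  # falling edge: the event ends
            start = None
    if start is not None:
        # Condition still true at the end of the recording.
        events.append((start, timestamps[-1]))
    return events


timestamps = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
speed = [10, 60, 70, 80, 40, 90, 95, 30]
brake = [0, 0, 5, 5, 5, 0, 3, 3]

# Boolean signal logic: braking while driving faster than 50.
active = [s > 50 and b > 0 for s, b in zip(speed, brake)]
events = extract_events(timestamps, active)
print(events)
```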
Channel Scalability
Scales to thousands of channels with different sampling rates, so diverse sensor data can be processed together.
PySpark Native
Built on Apache Spark and Delta Lake for distributed processing of petabyte-scale sensor data.
Star Schema Output
Persist results to a normalized gold layer with dimension and fact tables.
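For readers unfamiliar with the dimension/fact split, here is a minimal plain-Python sketch of the general technique. It does not reflect Impulse's actual gold-layer schema; the function `to_star_schema`, the column names, and the sample rows are all hypothetical.

```python
from typing import Dict, List, Tuple


def to_star_schema(
    rows: List[Dict], dim_column: str, measure_column: str
) -> Tuple[List[Dict], List[Dict]]:
    """Split flat result rows into one dimension table and one fact table.

    Hypothetical helper for illustration only, not part of Impulse.
    """
    surrogate_keys: Dict[str, int] = {}  # natural key -> surrogate key
    facts = []
    for row in rows:
        natural_key = row[dim_column]
        if natural_key not in surrogate_keys:
            surrogate_keys[natural_key] = len(surrogate_keys) + 1
        # Facts carry only the surrogate key plus the measure.
        facts.append(
            {"dim_id": surrogate_keys[natural_key], measure_column: row[measure_column]}
        )
    dimension = [
        {"dim_id": key, dim_column: name} for name, key in surrogate_keys.items()
    ]
    return dimension, facts


flat_results = [
    {"container": "rec_A", "mean_speed": 42.0},
    {"container": "rec_B", "mean_speed": 55.5},
    {"container": "rec_A", "mean_speed": 47.0},
]
dimension, facts = to_star_schema(flat_results, "container", "mean_speed")
print(dimension)
print(facts)
```

Each distinct container name is stored once in the dimension table, and the fact rows reference it by integer key.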
Unity Catalog Integration
Keep outputs governed and discoverable in enterprise Databricks lakehouse environments.
Config-Driven Setup
Control source tables, sink targets, and dimensions from JSON configuration files.