databricks.labs.dqx.anomaly.training_service

Anomaly training service - Main orchestration layer.

Provides the high-level API for training anomaly detection models, including context building, validation, and both global and segmented training. All training logic lives on AnomalyTrainingService (public and private methods).

AnomalyTrainingService Objects

class AnomalyTrainingService()

Service for building training context and orchestrating model training.

Provides the main entry point for training anomaly detection models. Supports both global models and segment-specific models.

Extension point: To add a new algorithm, implement AnomalyTrainingStrategy and pass an instance to the constructor.
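The extension point above follows the strategy pattern. The sketch below illustrates that pattern only: SketchStrategy, its fit method, DefaultStrategy, MeanThresholdStrategy, and SketchService are hypothetical stand-ins, since this page does not show the actual interface of AnomalyTrainingStrategy.

```python
from __future__ import annotations

from typing import Protocol


class SketchStrategy(Protocol):
    """Hypothetical stand-in for AnomalyTrainingStrategy (real interface not shown here)."""

    def fit(self, rows: list[float]) -> str:
        ...


class DefaultStrategy:
    """Assumed fallback used when no strategy is passed to the constructor."""

    def fit(self, rows: list[float]) -> str:
        return "default"


class MeanThresholdStrategy:
    """A toy custom algorithm: summarises the training data by its mean."""

    def fit(self, rows: list[float]) -> str:
        mean = sum(rows) / len(rows)
        return f"mean-threshold(center={mean:.2f})"


class SketchService:
    """Hypothetical stand-in for the service: it delegates training to the strategy."""

    def __init__(self, strategy: SketchStrategy | None = None) -> None:
        # Fall back to a default strategy when none is injected.
        self._strategy = strategy if strategy is not None else DefaultStrategy()

    def train(self, rows: list[float]) -> str:
        return self._strategy.fit(rows)
```

Injecting the strategy through the constructor keeps the orchestration code unchanged when a new algorithm is added.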

__init__

def __init__(spark: SparkSession,
             strategy: AnomalyTrainingStrategy | None = None) -> None

Initialize the training service.

apply_expected_anomaly_rate_if_default_contamination

@staticmethod
def apply_expected_anomaly_rate_if_default_contamination(
        params: AnomalyParams | None,
        expected_anomaly_rate: float) -> AnomalyParams

Apply expected_anomaly_rate to params if contamination is not explicitly set.
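One way to read "apply if not explicitly set" is sketched below. SketchParams and apply_rate_if_default are hypothetical names, and the use of None to mark an unset contamination is an assumption; this page does not say how the real AnomalyParams distinguishes a default value from an explicit one.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchParams:
    """Hypothetical stand-in for AnomalyParams; the real class is not shown here."""

    # Assumption: None means "contamination was not explicitly set".
    contamination: float | None = None


def apply_rate_if_default(params: SketchParams | None,
                          expected_anomaly_rate: float) -> SketchParams:
    """Return params with contamination filled in only when the caller left it unset."""
    if params is None:
        # No params at all: build them from the expected anomaly rate.
        return SketchParams(contamination=expected_anomaly_rate)
    if params.contamination is None:
        # Params given, but contamination left unset: fill it in on a copy.
        return replace(params, contamination=expected_anomaly_rate)
    # An explicit contamination always wins over expected_anomaly_rate.
    return params
```

In this sketch an explicit contamination takes precedence, and the input object is never mutated.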

build_context

def build_context(df: DataFrame, model_name: str, registry_table: str, *,
                  columns: list[str] | None, segment_by: list[str] | None,
                  params: AnomalyParams | None,
                  exclude_columns: list[str] | None,
                  expected_anomaly_rate: float) -> AnomalyTrainingContext

Build training context with all validated inputs.

train

def train(context: AnomalyTrainingContext) -> str

Train model(s) based on context.
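The two calls are meant to be chained: build_context validates the inputs, and train consumes the resulting context. The helper below is a minimal sketch that relies only on the signatures listed on this page. train_anomaly_model is a hypothetical name, and the 0.02 default for expected_anomaly_rate is an illustrative assumption, not a documented default. Every keyword-only argument of build_context is passed explicitly because the signature shows no defaults for them.

```python
from __future__ import annotations

from typing import Any


def train_anomaly_model(service: Any,
                        df: Any,
                        model_name: str,
                        registry_table: str,
                        *,
                        segment_by: list[str] | None = None,
                        expected_anomaly_rate: float = 0.02) -> str:
    """Build a training context and train from it (hypothetical convenience wrapper).

    `service` is expected to be an AnomalyTrainingService and `df` a Spark DataFrame.
    """
    context = service.build_context(
        df,
        model_name,
        registry_table,
        columns=None,            # None is accepted by the signature
        segment_by=segment_by,   # None for a global model, column names for segmented training
        params=None,
        exclude_columns=None,
        expected_anomaly_rate=expected_anomaly_rate,
    )
    # train returns a str; its exact meaning is not documented on this page.
    return service.train(context)
```

With a real Spark session this would be called as train_anomaly_model(AnomalyTrainingService(spark), df, "my_model", "catalog.schema.registry"), where the model and table names are placeholders.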