
databricks.labs.dqx.anomaly.scoring_run

Global and segmented anomaly model scoring.

Provides score_global_model, score_segmented, and load_segment_models. Kept in one module to avoid over-fragmentation of the scoring layer.

score_global_model

def score_global_model(df: DataFrame, record: AnomalyModelRecord,
                       config: ScoringConfig) -> DataFrame

Score using a global (non-segmented) model.

load_segment_models

def load_segment_models(registry_client: AnomalyModelRegistry,
                        config: ScoringConfig) -> list[AnomalyModelRecord]

Load all segment models for a base model from the registry.

score_single_segment

def score_single_segment(segment_df: DataFrame,
                         segment_model: AnomalyModelRecord,
                         config: ScoringConfig,
                         max_groups_override: int | None = None,
                         endpoint_reachable: bool | None = None) -> DataFrame

Score a single segment with its specific model.

max_groups_override, when set, replaces config.max_groups in the ExplanationContext for this segment only. score_segmented uses it to enforce a global cap on LLM calls across segments. Without it, config.max_groups applies independently to each segment, so the worst-case total is num_segments * max_groups.
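The difference between a per-segment cap and a global cap can be illustrated with plain integers. The sketch below is not the library's implementation: the helper names and the "pass the remaining budget to the next segment" scheme are assumptions made for illustration, since the source does not say how score_segmented computes each override.

```python
from __future__ import annotations


def independent_usage(groups_per_segment: list[int], max_groups: int) -> list[int]:
    """LLM calls per segment when max_groups applies to each segment separately."""
    return [min(groups, max_groups) for groups in groups_per_segment]


def globally_capped_usage(groups_per_segment: list[int], max_groups: int) -> list[int]:
    """LLM calls per segment when one shared budget is threaded through.

    Hypothetical scheme: each segment receives the remaining budget as its
    override, and whatever it uses is subtracted before the next segment runs.
    """
    remaining = max_groups
    usage = []
    for groups in groups_per_segment:
        override = remaining  # stands in for max_groups_override
        explained = min(groups, override)
        usage.append(explained)
        remaining -= explained
    return usage


# Three segments with 4, 7 and 3 candidate groups, and max_groups = 5.
candidate_groups = [4, 7, 3]
independent = independent_usage(candidate_groups, max_groups=5)
capped = globally_capped_usage(candidate_groups, max_groups=5)

print(independent, sum(independent))  # per-segment cap: total can approach 3 * 5
print(capped, sum(capped))            # shared cap: total never exceeds 5
```

With a per-segment cap the total is bounded only by num_segments * max_groups (15 here), while the shared budget keeps the total at or below max_groups.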

endpoint_reachable is the serving-endpoint reachability probed once by score_segmented for the whole run; threading it through avoids one billable ai_query probe per segment.
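The saving from probing once can be sketched without Spark or a real endpoint. Everything below is hypothetical: CountingProbe stands in for the billable ai_query probe, and the fallback "probe when endpoint_reachable is None" is an assumption about the per-segment behaviour, not something the source confirms.

```python
from __future__ import annotations


class CountingProbe:
    """Stand-in for a billable reachability probe; counts how often it runs."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self) -> bool:
        self.calls += 1
        return True


def score_one(segment: str, probe: CountingProbe,
              endpoint_reachable: bool | None = None) -> str:
    """Toy per-segment scorer: probes only when no result was passed in (assumed)."""
    if endpoint_reachable is None:
        endpoint_reachable = probe()
    return f"{segment}:{'explained' if endpoint_reachable else 'scored-only'}"


segments = ["eu", "us", "apac"]

# No threading: every segment triggers its own probe.
per_segment_probe = CountingProbe()
unthreaded = [score_one(s, per_segment_probe) for s in segments]

# Threading: probe once for the run, then pass the result to each segment.
shared_probe = CountingProbe()
reachable = shared_probe()
threaded = [score_one(s, shared_probe, endpoint_reachable=reachable) for s in segments]

print(per_segment_probe.calls, shared_probe.calls)
```

Both runs produce the same per-segment results; only the number of probe calls differs (one per segment versus one per run).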

score_segmented

def score_segmented(
    df: DataFrame,
    config: ScoringConfig,
    registry_client: AnomalyModelRegistry,
    all_segments: list[AnomalyModelRecord] | None = None) -> DataFrame

Score DataFrame using segment-specific models.
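As a conceptual illustration only, segment-specific scoring can be pictured as grouping rows by a segment key and applying the model that belongs to each group. The sketch uses plain dicts and callables instead of Spark DataFrames and AnomalyModelRecord objects; the function name, the toy models, and the choice to give rows without a matching model a score of None are all assumptions, not the behaviour of score_segmented.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable

Row = dict
Model = Callable[[Row], float]


def score_by_segment(rows: list[Row], models: dict[str, Model],
                     segment_key: str) -> list[Row]:
    """Score each row with the model registered for its segment value."""
    by_segment: dict[str, list[Row]] = defaultdict(list)
    for row in rows:
        by_segment[row[segment_key]].append(row)

    scored = []
    for segment, segment_rows in by_segment.items():
        model = models.get(segment)
        for row in segment_rows:
            # Assumption: rows whose segment has no model get a score of None.
            score = model(row) if model is not None else None
            scored.append({**row, "anomaly_score": score})
    return scored


# Toy models: distance from a segment-specific typical amount, scaled per segment.
models = {
    "eu": lambda row: abs(row["amount"] - 10.0) / 2.0,
    "us": lambda row: abs(row["amount"] - 100.0) / 20.0,
}
rows = [
    {"region": "eu", "amount": 10.0},
    {"region": "us", "amount": 10.0},
    {"region": "latam", "amount": 10.0},
]
scored = score_by_segment(rows, models, segment_key="region")
for row in scored:
    print(row)
```

The same amount is unremarkable in one segment and anomalous in another, which is the case a single global model handles poorly.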