databricks.labs.dqx.profiler.profile
DQProfile Objects
@dataclass(frozen=True)
class DQProfile()
Data quality profile class representing a data quality rule candidate.
DQProfileBuilder Objects
@dataclass(frozen=True)
class DQProfileBuilder()
Data quality profile builder class: a named builder that may produce a DQProfile for a column.
Attributes:
- name - Profile type identifier (e.g. "null_or_empty", "is_in", "min_max"). Used to look up the builder in the registry and in generated rule metadata.
- builder - Callable that inspects column data and options and returns a DQProfile when the column matches the profile criteria, otherwise None. Signature: (df, column_name, column_type, profiler_metrics, profiler_options) -> DQProfile | None
  - df: DataFrame for this column (non-null rows only; strings trimmed when profiler_options["trim_strings"] is True). Used for distinct/min/max etc.
  - column_name: Name of the column being profiled.
  - column_type: Spark DataType of the column (e.g. StringType(), LongType()).
  - profiler_metrics: Column-level statistics from the profiler (e.g. count, count_null, empty_count, count_non_null). Same key set as summary_stats[column_name].
  - profiler_options: Profiler options for this run (e.g. max_null_ratio, max_empty_ratio, max_in_count, trim_strings, filter).
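The builder signature described above can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the stand-in DQProfile fields (name, column, parameters), the "is_not_null" profile name, the build_is_not_null function, and the plain dict used as a registry are hypothetical and are not taken from the library. Only the two DQProfileBuilder attributes, the five-argument signature, and the metric and option keys come from the description above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


# Stand-in for DQProfile. These fields are assumptions for illustration,
# not the library's actual definition.
@dataclass(frozen=True)
class DQProfile:
    name: str
    column: str
    parameters: dict = field(default_factory=dict)


# Stand-in for DQProfileBuilder with the two documented attributes.
@dataclass(frozen=True)
class DQProfileBuilder:
    name: str
    builder: Callable[[Any, str, Any, dict, dict], Optional[DQProfile]]


def build_is_not_null(df, column_name, column_type, profiler_metrics, profiler_options):
    """Hypothetical builder: propose a rule when the null ratio is within max_null_ratio."""
    total = profiler_metrics.get("count", 0)
    if total == 0:
        return None  # no rows, so there is nothing to base a rule on
    null_ratio = profiler_metrics.get("count_null", 0) / total
    if null_ratio > profiler_options.get("max_null_ratio", 0.0):
        return None  # too many nulls: the column does not match this profile
    return DQProfile(
        name="is_not_null",
        column=column_name,
        parameters={"null_ratio": null_ratio},
    )


# Hypothetical registry: a plain dict keyed by the builder's name.
registry = {b.name: b for b in [DQProfileBuilder("is_not_null", build_is_not_null)]}

metrics = {"count": 100, "count_null": 1}
options = {"max_null_ratio": 0.05}

# df and column_type are unused by this particular builder, so placeholders suffice here.
profile = registry["is_not_null"].builder(None, "customer_id", "string", metrics, options)
rejected = registry["is_not_null"].builder(
    None, "customer_id", "string", {"count": 100, "count_null": 10}, options
)
```

In this sketch the builder decides from profiler_metrics and profiler_options alone. A builder for a profile such as "is_in" or "min_max" would presumably also query df, which per the description already holds only the non-null rows of the column.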