databricks.labs.dqx.quality_checker.quality_checker_runner

QualityCheckerRunner Objects

class QualityCheckerRunner()

Runs the DQX data quality checks on the input data and saves the generated results to Delta table(s).

run

def run(checks: list[dict],
        input_config: InputConfig,
        output_config: OutputConfig,
        quarantine_config: OutputConfig | None,
        custom_check_functions: dict[str, str] | None = None,
        reference_tables: dict[str, InputConfig] | None = None) -> None

Runs the DQX data quality job on the input data and saves the generated results to Delta table(s).

Arguments:

  • checks - The data quality checks to apply.
  • input_config - Input data configuration (e.g. table name or file location, read options).
  • output_config - Output data configuration (e.g. table name or file location, write options).
  • quarantine_config - Quarantine data configuration (e.g. table name or file location, write options).
  • custom_check_functions - A mapping where each key is the name of a function (e.g., "my_func") and each value is the file path to the Python module that defines it. The path can be absolute or relative to the installation folder, and may refer to a local filesystem location, a Databricks workspace path (e.g. /Workspace/my_repo/my_module.py), or a Unity Catalog volume (e.g. /Volumes/catalog/schema/volume/my_module.py).
  • reference_tables - Reference tables to use in the checks.
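
The shape of the custom_check_functions mapping, and the "absolute or relative to the installation folder" rule, can be illustrated with a small sketch. This is not DQX code: the helper resolve_module_path, the installation folder value, and the extra function names are hypothetical and exist only to show how such paths could be interpreted. The actual resolution logic inside DQX may differ.

```python
from pathlib import PurePosixPath


def resolve_module_path(path: str, install_folder: str) -> str:
    """Hypothetical helper: keep absolute paths, anchor relative ones
    at the installation folder."""
    candidate = PurePosixPath(path)
    if candidate.is_absolute():
        return str(candidate)
    return str(PurePosixPath(install_folder) / candidate)


# Mapping of function name -> path of the Python module that defines it,
# in the form described for `custom_check_functions`.
custom_check_functions = {
    "my_func": "/Workspace/my_repo/my_module.py",                 # workspace path
    "my_volume_func": "/Volumes/catalog/schema/volume/my_module.py",  # UC volume
    "my_relative_func": "checks/my_module.py",                    # relative path
}

# Assumed installation folder, for illustration only.
install_folder = "/Workspace/Users/someone@example.com/.dqx"

resolved = {
    name: resolve_module_path(path, install_folder)
    for name, path in custom_check_functions.items()
}

for name, path in resolved.items():
    print(f"{name}: {path}")
```

In this sketch the two absolute paths pass through unchanged, while the relative entry is joined to the assumed installation folder. A mapping in this form is what would be passed as the custom_check_functions argument of run.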