databricks.labs.dqx.checks_storage

ChecksStorageHandler Objects

class ChecksStorageHandler(ABC, Generic[T])

Abstract base class for handling storage of quality rules (checks).

load

@abstractmethod
def load(config: T) -> list[dict]

Load quality rules from the source. The returned checks can be used as input for apply_checks_by_metadata or apply_checks_by_metadata_and_split functions.

Arguments:

  • config - configuration for loading checks, including the table location and run configuration name.

Returns:

A list of dq rules. Raises an error if the checks file is missing or invalid.

save

@abstractmethod
def save(checks: list[dict], config: T) -> None

Save quality rules to the target.
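
To illustrate the contract described above, here is a minimal, self-contained sketch of a custom handler. It does not import DQX: the base class is re-declared as a stand-in, and `InMemoryChecksStorageConfig`, `InMemoryChecksStorageHandler`, and the `location` field are hypothetical names introduced only for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


class ChecksStorageHandler(ABC, Generic[T]):
    """Stand-in for the abstract base class documented above."""

    @abstractmethod
    def load(self, config: T) -> list[dict]:
        """Load quality rules from the source."""

    @abstractmethod
    def save(self, checks: list[dict], config: T) -> None:
        """Save quality rules to the target."""


@dataclass
class InMemoryChecksStorageConfig:
    # Hypothetical config: a key identifying where the checks are kept.
    location: str


class InMemoryChecksStorageHandler(ChecksStorageHandler[InMemoryChecksStorageConfig]):
    """Hypothetical handler that keeps checks in a dict, for illustration only."""

    def __init__(self) -> None:
        self._store: dict[str, list[dict]] = {}

    def load(self, config: InMemoryChecksStorageConfig) -> list[dict]:
        # Mirror the documented behaviour: fail loudly when nothing is stored.
        if config.location not in self._store:
            raise FileNotFoundError(f"No checks stored at {config.location!r}")
        return list(self._store[config.location])

    def save(self, checks: list[dict], config: InMemoryChecksStorageConfig) -> None:
        # Store a copy so later mutation of the caller's list has no effect.
        self._store[config.location] = list(checks)
```

Binding the type parameter `T` to one config class per handler is what lets each concrete handler below declare a narrower `config` type in `load` and `save`.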

TableChecksStorageHandler Objects

class TableChecksStorageHandler(
ChecksStorageHandler[TableChecksStorageConfig])

Handler for storing quality rules (checks) in a Delta table in the workspace.

load

@telemetry_logger("load_checks", "table")
def load(config: TableChecksStorageConfig) -> list[dict]

Load checks (dq rules) from a Delta table in the workspace.

Arguments:

  • config - configuration for loading checks, including the table location and run configuration name.

Returns:

A list of dq rules. Raises an error if the checks table is missing or invalid.

Raises:

  • NotFound - if the table does not exist in the workspace

save

@telemetry_logger("save_checks", "table")
def save(checks: list[dict], config: TableChecksStorageConfig) -> None

Save checks to a Delta table in the workspace.

Arguments:

  • checks - list of dq rules to save
  • config - configuration for saving checks, including the table location and run configuration name.

Raises:

  • InvalidCheckError - If any check is invalid or unsupported.

LakebaseChecksStorageHandler Objects

class LakebaseChecksStorageHandler(
ChecksStorageHandler[LakebaseChecksStorageConfig])

Handler for storing dq rules (checks) in a Lakebase table.

get_table_definition

@staticmethod
def get_table_definition(schema_name: str, table_name: str) -> Table

Create a SQLAlchemy table definition for storing DQ rules (checks) in Lakebase.

Arguments:

  • schema_name - The schema where the checks table is located.
  • table_name - The table where the checks are stored.

Returns:

SQLAlchemy table definition for the Lakebase instance.

load

@telemetry_logger("load_checks", "lakebase")
def load(config: LakebaseChecksStorageConfig) -> list[dict]

Load dq rules (checks) from a Lakebase table.

Arguments:

  • config - Configuration for saving and loading checks to Lakebase.

Returns:

A list of dq rules. Raises an error if loading the checks fails.

Raises:

  • NotFound - If the table does not exist in the Lakebase instance.
  • ProgrammingError - If there are SQL syntax errors or missing objects (converted to NotFound for missing tables).
  • DatabaseError - If other database operations fail (includes OperationalError, IntegrityError, etc.).

save

@telemetry_logger("save_checks", "lakebase")
def save(checks: list[dict], config: LakebaseChecksStorageConfig) -> None

Save dq rules (checks) to a Lakebase table.

Arguments:

  • checks - List of dq rules (checks) to save.
  • config - Configuration for saving and loading checks to Lakebase.

Returns:

None

Raises:

  • InvalidCheckError - If any check is invalid or unsupported.
  • IntegrityError - If constraint violations occur (e.g., duplicate keys).
  • ProgrammingError - If there are SQL syntax errors or missing objects.
  • DatabaseError - If other database operations fail (includes OperationalError, DataError, etc.).

WorkspaceFileChecksStorageHandler Objects

class WorkspaceFileChecksStorageHandler(
ChecksStorageHandler[WorkspaceFileChecksStorageConfig])

Handler for storing quality rules (checks) in a file (json or yaml) in the workspace.

load

@telemetry_logger("load_checks", "workspace_file")
def load(config: WorkspaceFileChecksStorageConfig) -> list[dict]

Load checks (dq rules) from a file (json or yaml) in the workspace. This does not require installation of DQX in the workspace.

Arguments:

  • config - configuration for loading checks, including the file location and storage type.

Returns:

A list of dq rules. Raises an error if the checks file is missing or invalid.

Raises:

  • NotFound - if the checks file is not found in the workspace.
  • InvalidCheckError - if the checks file cannot be parsed.

save

@telemetry_logger("save_checks", "workspace_file")
def save(checks: list[dict], config: WorkspaceFileChecksStorageConfig) -> None

Save checks (dq rules) to a yaml file in the workspace. This does not require installation of DQX in the workspace.

Arguments:

  • checks - list of dq rules to save
  • config - configuration for saving checks, including the file location and storage type.

FileChecksStorageHandler Objects

class FileChecksStorageHandler(ChecksStorageHandler[FileChecksStorageConfig])

Handler for storing quality rules (checks) in a file (json or yaml) in the local filesystem.

load

def load(config: FileChecksStorageConfig) -> list[dict]

Load checks (dq rules) from a file (json or yaml) in the local filesystem.

Arguments:

  • config - configuration for loading checks, including the file location.

Returns:

A list of dq rules. Raises an error if the checks file is missing or invalid.

Raises:

  • FileNotFoundError - if the file path does not exist
  • InvalidCheckError - if the checks file cannot be parsed

save

def save(checks: list[dict], config: FileChecksStorageConfig) -> None

Save checks (dq rules) to a file (json or yaml) in the local filesystem.

Arguments:

  • checks - list of dq rules to save
  • config - configuration for saving checks, including the file location.

Raises:

  • FileNotFoundError - if the file path does not exist
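
The load and save behaviour described above can be sketched with the standard library alone. This is not the DQX implementation: it covers only the JSON case, the function names are hypothetical, `ValueError` stands in for `InvalidCheckError`, and treating a missing parent directory as the "path does not exist" case on save is an assumption.

```python
import json
from pathlib import Path


def load_checks_from_json(path: str) -> list[dict]:
    """Read a list of checks from a local JSON file (hypothetical helper)."""
    file_path = Path(path)
    if not file_path.exists():
        raise FileNotFoundError(f"Checks file {path} does not exist")
    try:
        return json.loads(file_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as e:
        # Stand-in for InvalidCheckError: the file exists but cannot be parsed.
        raise ValueError(f"Checks file {path} cannot be parsed: {e}") from e


def save_checks_to_json(checks: list[dict], path: str) -> None:
    """Write a list of checks to a local JSON file (hypothetical helper)."""
    file_path = Path(path)
    # Assumption: a missing parent directory is reported as FileNotFoundError.
    if not file_path.parent.exists():
        raise FileNotFoundError(f"Directory {file_path.parent} does not exist")
    file_path.write_text(json.dumps(checks, indent=2), encoding="utf-8")
```

Checking for the file before parsing keeps the two failure modes distinct, so a caller can tell a missing file from a malformed one.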

InstallationChecksStorageHandler Objects

class InstallationChecksStorageHandler(
ChecksStorageHandler[InstallationChecksStorageConfig],
InstallationMixin)

Handler for storing quality rules (checks) defined in the installation configuration.

load

@telemetry_logger("load_checks", "installation")
def load(config: InstallationChecksStorageConfig) -> list[dict]

Load checks (dq rules) from the installation configuration.

Arguments:

  • config - configuration for loading checks, including the run configuration name and method.

Returns:

A list of dq rules. Raises an error if the checks file is missing or invalid.

Raises:

  • NotFound - if the checks file or table is not found in the installation.
  • InvalidCheckError - if the checks file cannot be parsed.

save

@telemetry_logger("save_checks", "installation")
def save(checks: list[dict], config: InstallationChecksStorageConfig) -> None

Save checks (dq rules) to a yaml file or table in the installation folder. This overwrites the existing checks file or table.

Arguments:

  • checks - list of dq rules to save
  • config - configuration for saving checks, including the run configuration name, method, and table location.

VolumeFileChecksStorageHandler Objects

class VolumeFileChecksStorageHandler(
ChecksStorageHandler[VolumeFileChecksStorageConfig])

Handler for storing quality rules (checks) in a file (json or yaml) in a Unity Catalog volume.

load

@telemetry_logger("load_checks", "volume")
def load(config: VolumeFileChecksStorageConfig) -> list[dict]

Load checks (dq rules) from a file (json or yaml) in a Unity Catalog volume.

Arguments:

  • config - configuration for loading checks, including the file location and storage type.

Returns:

A list of dq rules. Raises an error if the checks file is missing or invalid.

Raises:

  • NotFound - if the checks file is not found in the volume.
  • InvalidCheckError - if the checks file cannot be parsed.
  • CheckDownloadError - if there is an error downloading the file from the volume.

save

@telemetry_logger("save_checks", "volume")
def save(checks: list[dict], config: VolumeFileChecksStorageConfig) -> None

Save checks (dq rules) to a yaml file in a Unity Catalog volume. This does not require installation of DQX in the workspace.

Arguments:

  • checks - list of dq rules to save
  • config - configuration for saving checks, including the file location and storage type.

BaseChecksStorageHandlerFactory Objects

class BaseChecksStorageHandlerFactory(ABC)

Abstract base class for factories that create storage handlers for checks.

create

@abstractmethod
def create(config: BaseChecksStorageConfig) -> ChecksStorageHandler

Abstract method to create a handler based on the type of the provided configuration object.

Arguments:

  • config - Configuration object for loading or saving checks.

Returns:

An instance of the corresponding ChecksStorageHandler.

create_for_location

@abstractmethod
def create_for_location(
location: str,
run_config_name: str = "default"
) -> tuple[ChecksStorageHandler, BaseChecksStorageConfig]

Abstract method to create a handler and config based on checks location.

Arguments:

  • location - location of the checks (file path, table name, volume, etc.).
  • run_config_name - the name of the run configuration to use for checks, e.g. input table or job name (use "default" if not provided).

Returns:

A tuple of (ChecksStorageHandler, BaseChecksStorageConfig).

create_for_run_config

@abstractmethod
def create_for_run_config(
run_config: RunConfig
) -> tuple[ChecksStorageHandler, BaseChecksStorageConfig]

Abstract method to create a handler and config based on a RunConfig.

This method inspects the RunConfig to determine the appropriate storage handler. If Lakebase connection parameters are present (lakebase_instance_name), it creates a LakebaseChecksStorageHandler. Otherwise, it delegates to create_for_location to infer the handler from the checks location string.

Arguments:

  • run_config - RunConfig containing checks location and optional Lakebase parameters.

Returns:

A tuple of (ChecksStorageHandler, BaseChecksStorageConfig).

ChecksStorageHandlerFactory Objects

class ChecksStorageHandlerFactory(BaseChecksStorageHandlerFactory)

create

def create(config: BaseChecksStorageConfig) -> ChecksStorageHandler

Factory method to create a handler based on the type of the provided configuration object.

Arguments:

  • config - Configuration object for loading or saving checks.

Returns:

An instance of the corresponding ChecksStorageHandler.

Raises:

  • InvalidConfigError - If the configuration type is unsupported.
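
Dispatching on the type of the configuration object, as described above, can be sketched as follows. Every class here is a hypothetical stand-in rather than a DQX type, and the registry-based lookup is one possible approach, not necessarily how the library does it.

```python
from dataclasses import dataclass


class InvalidConfigError(Exception):
    """Stand-in for the DQX error of the same name."""


@dataclass
class FileConfig:
    location: str


@dataclass
class TableConfig:
    location: str


class FileHandler:
    """Hypothetical handler for FileConfig."""


class TableHandler:
    """Hypothetical handler for TableConfig."""


# Registry mapping each config type to the handler type that serves it.
_HANDLER_BY_CONFIG = {
    FileConfig: FileHandler,
    TableConfig: TableHandler,
}


def create(config: object) -> object:
    """Return a handler instance matching the type of the given config."""
    for config_type, handler_type in _HANDLER_BY_CONFIG.items():
        if isinstance(config, config_type):
            return handler_type()
    raise InvalidConfigError(f"Unsupported storage config type: {type(config).__name__}")
```

Keeping the mapping in one table means that supporting a new storage backend only requires registering one more config and handler pair.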

create_for_run_config

def create_for_run_config(
run_config: RunConfig
) -> tuple[ChecksStorageHandler, BaseChecksStorageConfig]

Factory method to create a handler and config based on a RunConfig.

This method inspects the RunConfig to determine the appropriate storage handler. If Lakebase connection parameters are present (lakebase_instance_name), it creates a LakebaseChecksStorageHandler. Otherwise, it delegates to create_for_location to infer the handler from the checks location string.

Arguments:

  • run_config - RunConfig containing checks location and optional Lakebase parameters.

Returns:

A tuple of (ChecksStorageHandler, BaseChecksStorageConfig).

Raises:

  • InvalidConfigError - If the configuration is invalid or unsupported.
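
The branching described above can be sketched with stand-in types. The `RunConfig` below is a reduced, hypothetical version with only two fields (`checks_location` is an assumed field name), and handlers are represented by plain strings so that the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunConfig:
    # Stand-in with only the fields this sketch needs; names are assumptions.
    checks_location: str
    lakebase_instance_name: Optional[str] = None


def create_for_location(location: str) -> tuple[str, dict]:
    # Placeholder for inferring the handler from the location string.
    return "location-based handler", {"location": location}


def create_for_run_config(run_config: RunConfig) -> tuple[str, dict]:
    """Pick the Lakebase handler when Lakebase parameters are present."""
    if run_config.lakebase_instance_name:
        config = {
            "instance_name": run_config.lakebase_instance_name,
            "location": run_config.checks_location,
        }
        return "lakebase handler", config
    # Otherwise fall back to inference from the checks location.
    return create_for_location(run_config.checks_location)
```

Checking the Lakebase parameter first matters because a Lakebase table location and a Delta table location can look alike as strings.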

is_table_location

def is_table_location(location: str) -> bool

True if location points to a Delta table (catalog.schema.table) and is not a file path with a known checks serializer extension.

Arguments:

  • location (str) - The checks location to validate.

Returns:

  • bool - True if the location is a valid table name and not a file path, False otherwise.
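
A rough sketch of the rule stated above follows. The extension list and the exact matching logic are assumptions made for illustration; the library's real check may differ.

```python
# Assumed set of checks serializer extensions.
FILE_EXTENSIONS = (".json", ".yml", ".yaml")


def is_table_location(location: str) -> bool:
    """Return True for a three-part table name that is not a checks file path."""
    # A known serializer extension means the location is a file, not a table.
    if location.lower().endswith(FILE_EXTENSIONS):
        return False
    # Path separators rule out a table name.
    if "/" in location or "\\" in location:
        return False
    # A table name has exactly three non-empty parts: catalog.schema.table.
    parts = location.split(".")
    return len(parts) == 3 and all(parts)
```

The extension check runs first because a name such as `catalog.schema.yaml` would otherwise pass the three-part test.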