DQX - Data Quality Framework

Provided by Databricks Labs

DQX is a data quality framework for Apache Spark that enables you to define data quality checks, monitor their results, and react to issues in your data pipelines.

Capabilities

Details on Failed Checks

Get detailed insights into why a check has failed.

Data Format Agnostic

Works on Spark DataFrames, regardless of the underlying data format.

Spark Batch & Streaming Support

Includes Delta Live Tables (DLT) integration.

Custom Reactions to Failed Checks

Drop, mark, or quarantine invalid data flexibly.

Check Levels

Assign a warning or error level to each check.

Row & Column Level Rules

Define quality rules at both row and column levels.

Profiling & Rule Generation

Automatically profile and generate data quality rule candidates.

Code or Config Checks

Define checks as code or configuration.

Validation Summary & Dashboard

Track and identify data quality issues effectively.

Improve your data quality now 🚀

Follow our comprehensive guide to get up and running with DQX in no time.