databricks.labs.dqx.io
read_input_data
def read_input_data(spark: SparkSession,
input_config: InputConfig) -> DataFrame
Reads input data from the specified location and format.
Arguments:
spark
- SparkSession
input_config
- InputConfig with source location/table name, format, and options
Returns:
DataFrame with values read from the input data
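The description above implies a config-driven read. The following is a minimal sketch of that idea, not the DQX implementation: `InputConfigSketch` and `read_with_config` are hypothetical names introduced for illustration, and the rule that a location containing `/` is a storage path (and anything else a table name) is an assumption. Only standard PySpark reader calls (`format`, `options`, `load`, `table`) are used, and `spark` is duck-typed so the sketch does not import PySpark itself.

```python
from dataclasses import dataclass, field


@dataclass
class InputConfigSketch:
    """Hypothetical stand-in for InputConfig; not the DQX class itself."""
    location: str
    format: str = "delta"
    options: dict = field(default_factory=dict)


def read_with_config(spark, config):
    """Read a DataFrame using the location, format, and options in config."""
    # Assumption: a location with a path separator is a storage path,
    # anything else is treated as a table name.
    if "/" in config.location:
        reader = spark.read.format(config.format).options(**config.options)
        return reader.load(config.location)
    # Table reads resolve their format from the catalog, so only options apply.
    return spark.read.options(**config.options).table(config.location)
```

With a real session, `read_with_config(spark, InputConfigSketch("db.schema.table1"))` would go through `spark.read.table`, while a path such as `/Volumes/...` would go through `load` with the configured format.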
save_dataframe_as_table
def save_dataframe_as_table(df: DataFrame, output_config: OutputConfig)
Helper function to save a DataFrame to a Delta table.
Arguments:
df
- The DataFrame to save
output_config
- Output table name, write mode, and options
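As a rough illustration of what saving with a table name, write mode, and options can look like, here is a minimal sketch built on the standard PySpark writer calls (`format`, `mode`, `options`, `saveAsTable`). It is not the DQX implementation: `OutputConfigSketch` and `save_as_delta_table` are hypothetical names, and the `"append"` default mode is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class OutputConfigSketch:
    """Hypothetical stand-in for OutputConfig; not the DQX class itself."""
    location: str           # output table name
    mode: str = "append"    # assumed default write mode
    options: dict = field(default_factory=dict)


def save_as_delta_table(df, config):
    """Write df to the Delta table named in config, honoring mode and options."""
    (
        df.write.format("delta")
        .mode(config.mode)
        .options(**config.options)
        .saveAsTable(config.location)
    )
```

For example, `save_as_delta_table(df, OutputConfigSketch("db.schema.checked", mode="overwrite"))` would replace the table contents rather than append to them.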
get_reference_dataframes
def get_reference_dataframes(
spark: SparkSession,
reference_tables: dict[str, InputConfig] | None = None
) -> dict[str, DataFrame] | None
Get reference DataFrames from the provided reference tables configuration.
Arguments:
spark
- SparkSession
reference_tables
- A dictionary mapping reference table names to their input configurations.
Examples:
reference_tables = {
"reference_table_1": InputConfig(location="db.schema.table1", format="delta"),
"reference_table_2": InputConfig(location="db.schema.table2", format="delta")
}
Returns:
A dictionary mapping reference table names to their DataFrames.
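The optional return type suggests a simple contract: no configuration in, nothing out; otherwise one DataFrame per configured name. The sketch below illustrates that contract under stated assumptions and is not the DQX implementation: `load_reference_frames` and its `read_fn` parameter are hypothetical, and treating an empty dictionary the same as `None` is an assumption.

```python
def load_reference_frames(read_fn, reference_tables=None):
    """Map each reference table name to the result of reading its config.

    read_fn is any callable that turns one input configuration into a
    DataFrame (for instance, a function wrapping a Spark read).
    """
    # Assumption: a missing or empty configuration yields None,
    # mirroring the optional return type in the signature above.
    if not reference_tables:
        return None
    return {name: read_fn(config) for name, config in reference_tables.items()}
```

Passing the reader in as a callable keeps the mapping logic testable without a running Spark session.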