data_flow_id |
Unique identifier for the pipeline |
data_flow_group |
Group identifier for launching multiple pipelines under a single DLT pipeline |
source_format |
Source format, e.g. cloudFiles, eventhub, kafka, delta |
source_details |
Map type that captures all source details. For cloudFiles: source_schema_path, source_path_{env}, source_database. For eventhub: source_schema_path, eventhub.accessKeyName, eventhub.accessKeySecretName, eventhub.name, eventhub.secretsScopeName, kafka.sasl.mechanism, kafka.security.protocol, eventhub.namespace, eventhub.port. For the source schema file, parsing of the Spark DDL schema format is supported. For a custom schema format, write a schema parsing function bronze_schema_mapper(schema_file_path, spark):Schema and pass it when initializing OnboardDataflowspec, e.g. onboardDataFlowSpecs = OnboardDataflowspec(spark, dict_obj, bronze_schema_mapper).onboardDataFlowSpecs() |
bronze_database_{env} |
Delta lake bronze database name. |
bronze_table |
Delta lake bronze table name |
bronze_reader_options |
Reader options passed to the Spark reader, given in JSON format, e.g. multiline=true, header=true |
bronze_partition_columns |
List of bronze table partition columns |
bronze_cdc_apply_changes |
Bronze CDC apply-changes configuration, as JSON |
bronze_table_path_{env} |
Bronze table storage path. |
bronze_table_properties |
DLT table properties map, e.g. {"pipelines.autoOptimize.managed": "false", "pipelines.autoOptimize.zOrderCols": "year,month", "pipelines.reset.allowed": "false"} |
bronze_data_quality_expectations_json |
Bronze table data quality expectations |
bronze_database_quarantine_{env} |
Bronze database for quarantine data which fails expectations. |
bronze_quarantine_table |
Bronze table for quarantine data which fails expectations |
bronze_quarantine_table_path_{env} |
Storage path of the bronze quarantine table, which holds data that fails expectations. |
bronze_quarantine_table_partitions |
List of bronze quarantine table partition columns |
bronze_quarantine_table_properties |
DLT table properties map, e.g. {"pipelines.autoOptimize.managed": "false", "pipelines.autoOptimize.zOrderCols": "year,month", "pipelines.reset.allowed": "false"} |
silver_database_{env} |
Silver database name. |
silver_table |
Silver table name |
silver_partition_columns |
Silver table partition columns list |
silver_cdc_apply_changes |
Silver CDC apply-changes configuration, as JSON |
silver_table_path_{env} |
Silver table storage path. |
silver_table_properties |
DLT table properties map, e.g. {"pipelines.autoOptimize.managed": "false", "pipelines.autoOptimize.zOrderCols": "year,month", "pipelines.reset.allowed": "false"} |
silver_transformation_json |
Path to the silver table SQL transformation JSON file |
silver_data_quality_expectations_json_{env} |
Path to the silver table data quality expectations JSON file |
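
As a rough illustration of how the fields above fit together, the sketch below builds one onboarding entry for a cloudFiles source and a custom schema mapper matching the bronze_schema_mapper(schema_file_path, spark) signature. Only the key names come from the table. Every value, the `dev` environment name, the helpers `resolve_env` and `custom_schema_to_ddl`, and the `name:type` schema file format are assumptions made for this example, and the mapper assumes the schema file is readable from the driver's local filesystem.

```python
import json

# Hypothetical environment name and onboarding entry. All values are
# placeholders; only the key names are taken from the table above.
ENV = "dev"
entry = {
    "data_flow_id": "100",
    "data_flow_group": "A1",
    "source_format": "cloudFiles",
    "source_details": {
        "source_database": "example_db",
        "source_path_dev": "/tmp/example/landing/customers",
        "source_schema_path": "/tmp/example/schemas/customers.txt",
    },
    "bronze_database_dev": "bronze_example",
    "bronze_table": "customers",
    "bronze_reader_options": {"multiline": "true", "header": "true"},
    "bronze_table_path_dev": "/tmp/example/bronze/customers",
    "bronze_table_properties": {"pipelines.reset.allowed": "false"},
}

# Serialized form of the entry, wrapped in a list of entries (assumed layout).
onboarding_json = json.dumps([entry], indent=2)


def resolve_env(row, key_template, env):
    """Look up a '{env}'-templated key such as 'bronze_database_{env}'."""
    return row[key_template.format(env=env)]


def custom_schema_to_ddl(text):
    """Convert the assumed 'name:type' lines into a Spark DDL string."""
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, dtype = line.partition(":")
        fields.append(f"{name.strip()} {dtype.strip().upper()}")
    return ", ".join(fields)


def bronze_schema_mapper(schema_file_path, spark):
    """Read a custom schema file and return the matching Spark schema."""
    with open(schema_file_path) as handle:
        ddl = custom_schema_to_ddl(handle.read())
    # An empty DataFrame created with the DDL string exposes the parsed schema.
    return spark.createDataFrame([], ddl).schema
```

A mapper like this would be handed over as the third argument, as in OnboardDataflowspec(spark, dict_obj, bronze_schema_mapper).onboardDataFlowSpecs(). Keeping the text-to-DDL conversion in its own function means it can be checked without a Spark session.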