Q. How do I get started?
Please refer to the Getting Started guide.
Q. How do I add databases and catalogs to be monitored by DeltaOMS?
You can add a database name to the DeltaOMS configuration table (by default called sourceconfig) using a simple SQL INSERT statement.
Example:
INSERT INTO <omsCatalogName>.<omsDBName>.sourceconfig VALUES('<Database Name>',false)
You can also add a single catalog or a pattern of catalogs to be monitored by DeltaOMS:
INSERT INTO <omsCatalogName>.<omsDBName>.sourceconfig VALUES('demo',false)
INSERT INTO <omsCatalogName>.<omsDBName>.sourceconfig VALUES('bu1*|bu2*',false)
For more details on the configurations and parameters, refer to the Getting Started guide and the Developer Guide.
Q. What components will be deployed for DeltaOMS?
DeltaOMS deploys two primary components. The first ingests the Delta logs from the configured table. It is a streaming component and can run either on a schedule or as an always-running streaming job, depending on your SLA requirements for retrieving operational metrics.
The second is a batch component that processes and enriches the Delta actions into different OMS-specific tables.
Q. How many ingestion jobs do we need to run?
DeltaOMS supports 50 streams by default for each Databricks job. This range is configurable through the command line parameters --startingStream and --endingStream, which default to 1 and 50 respectively.
We recommend setting up your jobs to support groups of 40-50 streams / wildcard paths. For example, if you have 75 unique wildcard paths to process, we recommend creating 2 Databricks jobs for DeltaOMS ingestion.
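The arithmetic behind that recommendation can be sketched in a few lines of Python. This is only an illustration: the helper name plan_ingestion_jobs is hypothetical, and it assumes that consecutive jobs take contiguous stream ranges (1-50, 51-100, and so on) and that the parameters are passed in `--name=value` form, neither of which is confirmed here.

```python
import math


def plan_ingestion_jobs(num_paths: int, streams_per_job: int = 50) -> list[tuple[int, int]]:
    """Return one (startingStream, endingStream) pair per ingestion job.

    Assumption: jobs cover contiguous, non-overlapping stream ranges.
    """
    if num_paths < 0 or streams_per_job < 1:
        raise ValueError("num_paths must be >= 0 and streams_per_job >= 1")
    num_jobs = math.ceil(num_paths / streams_per_job)
    return [
        (job * streams_per_job + 1, (job + 1) * streams_per_job)
        for job in range(num_jobs)
    ]


# 75 wildcard paths at 50 streams per job need two jobs.
for start, end in plan_ingestion_jobs(75):
    # The "--name=value" form is an assumed way of passing the parameters.
    print(f"--startingStream={start} --endingStream={end}")
```

Lowering streams_per_job to 40 keeps each job within the lower end of the recommended group size.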
Q. What is the process flow for adding new input sources for DeltaOMS tracking?
Assuming you already have DeltaOMS running in your environment, new input sources can be added by:
1. Adding the new sources to the sourceconfig table. Example usage is in the Developer Guide.
2. Running com.databricks.labs.deltaoms.init.ConfigurePaths.
Q. (Advanced) Is there an option to add all Delta tables under a path using wildcard expressions to be tracked by DeltaOMS?
You can add a special wildcard expression, PATH/**, to the sourceconfig table so that all Delta tables under that path are tracked by DeltaOMS.
Example syntax for adding such a path:
INSERT INTO <omsDBName>.sourceconfig VALUES('dbfs:/user/warehouse/**',false, Map('wildCardLevel','0'))
DeltaOMS will discover all Delta tables under the path 'dbfs:/user/warehouse/' and add them to the internal pathconfig table for tracking.
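To make the idea of discovery concrete, here is a minimal Python sketch that walks a local directory tree and reports every directory containing a _delta_log subdirectory, which is how a Delta table is recognised on disk. It is not DeltaOMS's implementation: the function name find_delta_tables is hypothetical, and it works on a local filesystem rather than on DBFS or cloud storage.

```python
import os
import tempfile


def find_delta_tables(root: str) -> list[str]:
    """Return directories under root that contain a _delta_log subdirectory."""
    tables = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if "_delta_log" in dirnames:
            tables.append(dirpath)
            # A table root was found; do not descend into its data files.
            dirnames.clear()
    return sorted(tables)


# Build a throwaway layout with two Delta-like tables and one plain directory.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "db1", "sales", "_delta_log"))
    os.makedirs(os.path.join(root, "db2", "events", "_delta_log"))
    os.makedirs(os.path.join(root, "db2", "not_delta"))
    found = [os.path.relpath(path, root) for path in find_delta_tables(root)]
    print(found)
```

Only the two directories holding a _delta_log are reported; db2/not_delta is skipped.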