Column | Type | IsRequired | Description |
---|---|---|---|
workspace_name | String | True | Name of the workspace. |
workspace_id | String | True | ID of the workspace. This MUST be the value after o= in the URL bar. To make sure you get the right value, run Initializer.getOrgId on the target workspace. |
workspace_url | String | True | URL of the workspace, in the format https://*.com or https://*.net. Don’t include anything after the .com or .net suffix. |
api_url | String | True | API URL for the workspace. To get it, execute dbutils.notebook.getContext().apiUrl.get in Scala ON THE TARGET WORKSPACE, NOT THE DEPLOYMENT WORKSPACE. NOTE: workspace_url and api_url can be different for a workspace, and the API URL may be the same for multiple workspaces. You can also use the workspace_url here. |
cloud | String | True | Cloud provider (Azure/AWS/GCP). |
primordial_date | String | True | The date from which Overwatch will capture details, in yyyy-MM-dd format (e.g. 2022-05-20 == May 20, 2022). **IMPORTANT NOTE:** Set the primordial date only on the initial run of Overwatch and never change it afterwards, as Overwatch will progress the dates using its own calculations and checkpoints. |
storage_prefix | String | True | CASE SENSITIVE - Lower Case. The location in which Overwatch will store the data. You can think of this as the Overwatch working directory. dbfs:/mnt/path/… or abfss://container@myStorageAccount.dfs.core.windows.net/… or s3://myBucket/… or gs://myBucket/… |
etl_database_name | String | True | The name of the ETL database for Overwatch (e.g. overwatch_etl or custom). |
consumer_database_name | String | True | The name of the Consumer database for Overwatch (e.g. overwatch or custom). |
secret_scope | String | True | Name of the secret scope. This must be created on the workspace on which the Overwatch job will execute. |
secret_key_dbpat | String | True | Name of the key that holds the PAT token of the workspace. The key must be present in the secret_scope, and the token should start with the letters dapi. |
auditlogprefix_source_path | String | True | For all clouds, use the keyword system to fetch data from System Tables (system.access.audit). See System Table Configuration Details for details. If you are not using System Tables, enter the location of the audit logs (AWS/GCP only). The contents of this directory must be folders with date partitions such as date=2022-12-01. |
interactive_dbu_price | Double | True | Contract (or list) Price for interactive DBUs. The provided template has the list prices by default. |
automated_dbu_price | Double | True | Contract (or list) Price for automated DBUs. The provided template has the list prices by default. |
sql_compute_dbu_price | Double | True | Contract (or list) Price for DBSQL DBUs. This should be the closest average price across your DBSQL Skus (classic / Pro / Serverless) for now. See Custom Costs for more details. The provided template has the DBSQL Classic list prices by default. |
jobs_light_dbu_price | Double | True | Contract (or list) Price for jobs light DBUs. The provided template has the list prices by default. |
max_days | Integer | True | The maximum number of incremental days that will be loaded in a run. Usually only relevant for historical loading and rebuilds. Recommendation == 30 |
excluded_scopes | String | False | Scopes that should be excluded from the pipelines. Because the config file is a CSV, it’s critical that these are colon-delimited rather than comma-delimited. Leave blank if you’d like to load all Overwatch scopes. |
active | Boolean | True | Whether or not the workspace should be validated / deployed. |
proxy_host | String | False | Proxy URL for the workspace. |
proxy_port | String | False | Proxy port for the workspace. |
proxy_user_name | String | False | Proxy user name for the workspace. |
proxy_password_scope | String | False | Scope which contains the proxy password key. |
proxy_password_key | String | False | Key which contains proxy password. |
success_batch_size | Integer | False | API Tunable - Size of the result buffer; when the buffer fills, the results are written to a temp location. This is used to tune performance in certain circumstances. Leave default except for special circumstances. Default == 200 |
error_batch_size | Integer | False | API Tunable - Size of the error writer buffer containing API call errors. This is used to tune performance in certain circumstances. Leave default except for special circumstances. Default == 500 |
enable_unsafe_SSL | Boolean | False | API Tunable - Enables unsafe SSL. Default == False |
thread_pool_size | Integer | False | API Tunable - Max number of API calls Overwatch is allowed to make in parallel. Default == 4. Increasing it speeds up bronze, but on a busy workspace it risks API endpoint saturation. Overwatch will detect saturation and back off, but for safety never go over 8 without testing. |
api_waiting_time | Long | False | API Tunable - Overwatch makes async API calls in parallel; api_waiting_time is the max time, in milliseconds, to wait when no response is received from an API call. Default == 300000 (5 minutes) |
mount_mapping_path | String | False | Path to local CSV holding details of all mounts on remote workspaces (only necessary for remote workspaces with more than 50 mounts). Click here for more details. |
temp_dir_path | String | False | Custom temporary working directory. The directory gets cleaned up after each run. |
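The format rules in the table above can be checked before a deployment is attempted. The following is a minimal sketch, not part of Overwatch: the function name `check_row` and the idea of passing one config row as a plain dictionary are assumptions made for illustration, and the scope names in the usage note are placeholders.

```python
import re
from datetime import datetime

# workspace_url: https://*.com or https://*.net with nothing after the suffix.
URL_PATTERN = re.compile(r"https://[^/\s]+\.(com|net)")
# primordial_date: strictly yyyy-MM-dd (two-digit month and day).
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_row(row):
    """Return a list of problems found in one config row (hypothetical helper)."""
    problems = []

    if not URL_PATTERN.fullmatch(row.get("workspace_url", "")):
        problems.append("workspace_url must look like https://*.com or https://*.net")

    date_text = row.get("primordial_date", "")
    if not DATE_PATTERN.fullmatch(date_text):
        problems.append("primordial_date must be in yyyy-MM-dd format")
    else:
        try:
            # Reject well-formed but impossible dates such as 2022-13-40.
            datetime.strptime(date_text, "%Y-%m-%d")
        except ValueError:
            problems.append("primordial_date is not a valid calendar date")

    # The config file is a CSV, so scopes are separated by colons, not commas.
    if "," in row.get("excluded_scopes", ""):
        problems.append("excluded_scopes must be colon-delimited")

    return problems
```

For example, a row with `workspace_url` set to a bare `https://….net` address, `primordial_date` set to `2022-05-20`, and `excluded_scopes` set to `scopeA:scopeB` produces an empty list, while a URL that still carries `/?o=…` is reported.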
When configuring the Azure Event Hub (EH), users can authenticate to the EH with EITHER a shared access key OR, as of 072x, an AAD service principal (SP). The required configurations for each auth method are listed below. Azure deployments must use one of these options, as EH is required for Azure.
Shared Access Key Requirements
Review Authorizing Access Via SAS Policy for more details.
Column | Type | IsRequired | Description |
---|---|---|---|
eh_name | String | True (AZURE) | Event hub name (Azure only). The event hub will contain the audit logs of the workspace. |
eh_scope_key | String | True (AZURE) | Name of the key in the <secret_scope> that holds the connection string to the Event Hub WITH THE SHARED ACCESS KEY IN IT. See EH Configuration for details. |
AAD Requirements
Review Authorizing Access Via AAD SPN for more details.
Ensure the dependent library for AAD Auth is attached: com.microsoft.azure:msal4j:1.10.1
Column | Type | IsRequired | Description |
---|---|---|---|
eh_name | String | True (AZURE) | Event hub name. The event hub will contain the audit logs of the workspace. |
eh_conn_string | String | True (AZURE) | Event hub connection string without shared access key. ex: “Endpoint=sb://evhub-ns.servicebus.windows.net” |
aad_tenant_id | String | True (AZURE) | Tenant ID for the service principal. |
aad_client_id | String | True (AZURE) | Client ID for the service principal. |
aad_client_secret_key | String | True (AZURE) | Name of the key in the <secret_scope> that holds the SPN secret for the service principal. |
aad_authority_endpoint | String | True (AZURE) | Endpoint of the authority. Default value is “https://login.microsoftonline.com/” |
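Because an Azure row must satisfy one of the two Event Hub auth methods, it can help to classify each row up front. The sketch below is illustrative only: `eh_auth_method` is a hypothetical helper, the row is assumed to be a dictionary of column name to string, and checking the shared access key first when both sets of columns are filled is an arbitrary choice, since the precedence is not stated here.

```python
# Columns the AAD table marks as required, in addition to eh_name.
AAD_COLUMNS = (
    "eh_conn_string",
    "aad_tenant_id",
    "aad_client_id",
    "aad_client_secret_key",
    "aad_authority_endpoint",
)


def eh_auth_method(row):
    """Return 'shared_access_key' or 'aad' for an Azure config row.

    Raises ValueError when neither method is fully configured.
    """
    def filled(column):
        return bool((row.get(column) or "").strip())

    if not filled("eh_name"):
        raise ValueError("eh_name is required for Azure deployments")

    # Assumption: a filled eh_scope_key means the shared access key method.
    if filled("eh_scope_key"):
        return "shared_access_key"

    missing = [column for column in AAD_COLUMNS if not filled(column)]
    if not missing:
        return "aad"

    raise ValueError(
        "set eh_scope_key, or fill the AAD columns: " + ", ".join(missing)
    )
```

Running a check like this on every Azure row before deployment surfaces a half-filled AAD configuration as a list of the missing columns.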
Column | Type | IsRequired | Description |
---|---|---|---|
workspace_name | String | True | Name of the workspace. |
workspace_id | String | True | Id of the workspace. |
workspace_url | String | True | URL of the workspace. |
api_url | String | True | API URL for the workspace. To get it, execute dbutils.notebook.getContext().apiUrl.get in Scala ON THE TARGET WORKSPACE, NOT THE DEPLOYMENT WORKSPACE. NOTE: workspace_url and api_url can be different for a workspace, and the API URL may be the same for multiple workspaces. |
cloud | String | True | Cloud provider (Azure or AWS). |
primordial_date | String | True | The date from which Overwatch will capture details, in yyyy-MM-dd format (e.g. 2022-05-20 == May 20, 2022). |
storage_prefix | String | True | The location in which Overwatch will store the data. You can think of this as the Overwatch working directory. dbfs:/mnt/path/… or abfss://container@myStorageAccount.dfs.core.windows.net/… or s3://myBucket/… or gs://myBucket/… |
etl_database_name | String | True | The name of the ETL database for Overwatch (e.g. overwatch_etl or custom). |
consumer_database_name | String | True | The name of the Consumer database for Overwatch (e.g. overwatch or custom). |
secret_scope | String | True | Name of the secret scope. This must be created on the workspace on which the Overwatch job will execute. |
secret_key_dbpat | String | True | Name of the key that holds the PAT token of the workspace. The key must be present in the secret_scope, and the token should start with dapi. |
auditlogprefix_source_path | String | True (AWS/GCP) | Location of the audit logs (AWS/GCP only). The contents of this directory must be folders with date partitions such as date=2022-12-01. |
interactive_dbu_price | Double | True | Contract (or list) Price for interactive DBUs. The provided template has the list prices by default. |
automated_dbu_price | Double | True | Contract (or list) Price for automated DBUs. The provided template has the list prices by default. |
sql_compute_dbu_price | Double | True | Contract (or list) Price for DBSQL DBUs. This should be the closest average price across your DBSQL Skus (classic / Pro / Serverless) for now. See Custom Costs for more details. The provided template has the DBSQL Classic list prices by default. |
jobs_light_dbu_price | Double | True | Contract (or list) Price for jobs light DBUs. The provided template has the list prices by default. |
max_days | Integer | True | The maximum number of incremental days that will be loaded in a run. Usually only relevant for historical loading and rebuilds. Recommendation == 30 |
excluded_scopes | String | False | Scopes that should be excluded from the pipelines. Because the config file is a CSV, it’s critical that these are colon-delimited rather than comma-delimited. Leave blank if you’d like to load all Overwatch scopes. |
active | Boolean | True | Whether or not the workspace should be validated / deployed. |
proxy_host | String | False | Proxy URL for the workspace. |
proxy_port | String | False | Proxy port for the workspace. |
proxy_user_name | String | False | Proxy user name for the workspace. |
proxy_password_scope | String | False | Scope which contains the proxy password key. |
proxy_password_key | String | False | Key which contains proxy password. |
success_batch_size | Integer | False | API Tunable - Size of the result buffer; when the buffer fills, the results are written to a temp location. This is used to tune performance in certain circumstances. Leave default except for special circumstances. Default == 200 |
error_batch_size | Integer | False | API Tunable - Size of the error writer buffer containing API call errors. This is used to tune performance in certain circumstances. Leave default except for special circumstances. Default == 500 |
enable_unsafe_SSL | Boolean | False | API Tunable - Enables unsafe SSL. Default == False |
thread_pool_size | Integer | False | API Tunable - Max number of API calls Overwatch is allowed to make in parallel. Default == 4. Increasing it speeds up bronze, but on a busy workspace it risks API endpoint saturation. Overwatch will detect saturation and back off, but for safety never go over 8 without testing. |
api_waiting_time | Long | False | API Tunable - Overwatch makes async API calls in parallel; api_waiting_time is the max time, in milliseconds, to wait when no response is received from an API call. Default == 300000 (5 minutes) |
mount_mapping_path | String | False | Path to local CSV holding details of all mounts on remote workspaces (only necessary for remote workspaces with more than 50 mounts). Click here for more details. |
temp_dir_path | String | False | Custom temporary working directory. The directory gets cleaned up after each run. |
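The API tunables are optional, so a blank cell falls back to the default listed in the table. As a rough sketch of that behaviour (the helper names `resolve_tunables` and `tunable_warnings` are hypothetical, and reading the row as a dictionary of strings is an assumption), the defaults can be applied and the thread pool guidance checked like this:

```python
# Defaults as listed in the table above.
API_TUNABLE_DEFAULTS = {
    "success_batch_size": 200,
    "error_batch_size": 500,
    "enable_unsafe_SSL": False,
    "thread_pool_size": 4,
    "api_waiting_time": 300000,  # 300000 ms == 5 minutes
}


def resolve_tunables(row):
    """Fill blank tunables with their defaults and convert the rest from text."""
    resolved = {}
    for name, default in API_TUNABLE_DEFAULTS.items():
        raw = (row.get(name) or "").strip()
        if raw == "":
            resolved[name] = default
        elif isinstance(default, bool):
            # Boolean column: accept any casing of "true".
            resolved[name] = raw.lower() == "true"
        else:
            resolved[name] = int(raw)
    return resolved


def tunable_warnings(resolved):
    """Flag settings the table advises against using without testing."""
    notes = []
    if resolved["thread_pool_size"] > 8:
        notes.append("thread_pool_size above 8 should be tested first")
    return notes
```

An empty row resolves to the defaults unchanged, and a row with `thread_pool_size` set to `12` yields one warning.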
When configuring the Azure Event Hub (EH), users can authenticate to the EH with EITHER a shared access key OR, as of 072x, an AAD service principal (SP). The required configurations for each auth method are listed below. Azure deployments must use one of these options, as EH is required for Azure.
Shared Access Key Requirements
Column | Type | IsRequired | Description |
---|---|---|---|
eh_name | String | True (AZURE) | Event hub name (Azure only). The event hub will contain the audit logs of the workspace. |
eh_scope_key | String | True (AZURE) | Name of the key in the <secret_scope> that holds the connection string to the Event Hub WITH THE SHARED ACCESS KEY IN IT. See EH Configuration for details. |
AAD Requirements
Review Authorizing Access Via AAD SPN for more details.
Ensure the dependent library for AAD Auth is attached: com.microsoft.azure:msal4j:1.10.1
Column | Type | IsRequired | Description |
---|---|---|---|
eh_name | String | True (AZURE) | Event hub name. The event hub will contain the audit logs of the workspace. |
eh_conn_string | String | True (AZURE) | Event hub connection string without shared access key. ex: “Endpoint=sb://evhub-ns.servicebus.windows.net” |
aad_tenant_id | String | True (AZURE) | Tenant ID for the service principal. |
aad_client_id | String | True (AZURE) | Client ID for the service principal. |
aad_client_secret_key | String | True (AZURE) | Client secret key for the service principal. |
aad_authority_endpoint | String | True (AZURE) | Endpoint of the authority. Default value is “https://login.microsoftonline.com/” |
Column | Type | IsRequired | Description |
---|---|---|---|
workspace_name | String | True | Name of the workspace. |
workspace_id | String | True | Id of the workspace. |
workspace_url | String | True | URL of the workspace. |
api_url | String | True | API URL for the workspace. To get it, execute dbutils.notebook.getContext().apiUrl.get in Scala ON THE TARGET WORKSPACE, NOT THE DEPLOYMENT WORKSPACE. NOTE: workspace_url and api_url can be different for a workspace, and the API URL may be the same for multiple workspaces. |
cloud | String | True | Cloud provider (Azure or AWS). |
primordial_date | String | True | The date from which Overwatch will capture details, in yyyy-MM-dd format (e.g. 2022-05-20 == May 20, 2022). |
etl_storage_prefix | String | True | The location in which Overwatch will store the data. You can think of this as the Overwatch working directory. dbfs:/mnt/path/… or abfss://container@myStorageAccount.dfs.core.windows.net/… or s3://myBucket/… or gs://myBucket/… |
etl_database_name | String | True | The name of the ETL database for Overwatch (e.g. overwatch_etl or custom). |
consumer_database_name | String | True | The name of the Consumer database for Overwatch (e.g. overwatch or custom). |
secret_scope | String | True | Name of the secret scope. This must be created on the workspace on which the Overwatch job will execute. |
secret_key_dbpat | String | True | Name of the key that holds the PAT token of the workspace. The key must be present in the secret_scope, and the token should start with dapi. |
auditlogprefix_source_aws | String | True (AWS/GCP) | Location of the audit logs (AWS/GCP only). The contents of this directory must be folders with date partitions such as date=2022-12-01. |
eh_name | String | True (AZURE) | Event hub name (Azure only). The event hub will contain the audit logs of the workspace. |
eh_scope_key | String | True for non-AAD connection (AZURE) | (Azure only) Key that holds the connection string to the Event Hub. See EH Configuration for details. |
interactive_dbu_price | Double | True | Contract (or list) Price for interactive DBUs. The provided template has the list prices by default. |
automated_dbu_price | Double | True | Contract (or list) Price for automated DBUs. The provided template has the list prices by default. |
sql_compute_dbu_price | Double | True | Contract (or list) Price for DBSQL DBUs. This should be the closest average price across your DBSQL Skus (classic / Pro / Serverless) for now. See Custom Costs for more details. The provided template has the DBSQL Classic list prices by default. |
jobs_light_dbu_price | Double | True | Contract (or list) Price for jobs light DBUs. The provided template has the list prices by default. |
max_days | Integer | True | The maximum number of incremental days that will be loaded in a run. Usually only relevant for historical loading and rebuilds. Recommendation == 30 |
excluded_scopes | String | False | Scopes that should be excluded from the pipelines. Because the config file is a CSV, it’s critical that these are colon-delimited rather than comma-delimited. Leave blank if you’d like to load all Overwatch scopes. |
active | Boolean | True | Whether or not the workspace should be validated / deployed. |
proxy_host | String | False | Proxy URL for the workspace. |
proxy_port | String | False | Proxy port for the workspace. |
proxy_user_name | String | False | Proxy user name for the workspace. |
proxy_password_scope | String | False | Scope which contains the proxy password key. |
proxy_password_key | String | False | Key which contains proxy password. |
success_batch_size | Integer | False | API Tunable - Size of the result buffer; when the buffer fills, the results are written to a temp location. This is used to tune performance in certain circumstances. Leave default except for special circumstances. Default == 200 |
error_batch_size | Integer | False | API Tunable - Size of the error writer buffer containing API call errors. This is used to tune performance in certain circumstances. Leave default except for special circumstances. Default == 500 |
enable_unsafe_SSL | Boolean | False | API Tunable - Enables unsafe SSL. Default == False |
thread_pool_size | Integer | False | API Tunable - Max number of API calls Overwatch is allowed to make in parallel. Default == 4. Increasing it speeds up bronze, but on a busy workspace it risks API endpoint saturation. Overwatch will detect saturation and back off, but for safety never go over 8 without testing. |
api_waiting_time | Long | False | API Tunable - Overwatch makes async API calls in parallel; api_waiting_time is the max time, in milliseconds, to wait when no response is received from an API call. Default == 300000 (5 minutes) |
mount_mapping_path | String | False | Path to local CSV holding details of all mounts on remote workspaces (only necessary for remote workspaces with more than 50 mounts). Click here for more details. |
temp_dir_path | String | False | Custom temporary working directory. The directory gets cleaned up after each run. |
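In this layout some columns are required only for a particular cloud: auditlogprefix_source_aws for AWS/GCP and eh_name for Azure. A small pre-check along the following lines can catch a missing value early. It is a sketch under assumptions: `missing_cloud_columns` is a hypothetical helper, the row is a dictionary of strings, and comparing the cloud value case-insensitively is a convenience of the sketch rather than documented behaviour.

```python
def missing_cloud_columns(row):
    """Return the cloud-specific required columns that are blank in this row."""
    cloud = (row.get("cloud") or "").strip().lower()

    if cloud == "azure":
        # eh_scope_key is additionally required for non-AAD connections.
        required = ["eh_name"]
    elif cloud in ("aws", "gcp"):
        required = ["auditlogprefix_source_aws"]
    else:
        raise ValueError("unrecognised cloud value: " + repr(row.get("cloud")))

    return [c for c in required if not (row.get(c) or "").strip()]
```

An Azure row without eh_name returns `["eh_name"]`, and an AWS row with the audit log location filled in returns an empty list.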