Inventory
List of all UCX objects and their respective metadata.
Overview
Table utilization per workflow:
| Table | Generate Assessment | Update Migration Progress | Migrate Groups | Migrate External Tables | Upgrade Jobs | Migrate tables | Migrate Data Reconciliation |
|---|---|---|---|---|---|---|---|
| tables | RW | RW | RW | RW | |||
| grants | RW | RW | RW | RW | |||
| mounts | RW | RO | RO | RO | |||
| permissions | RW | RW | RO | RO | |||
| jobs | RW | RW | RO | ||||
| clusters | RW | RW | |||||
| directfs_in_paths | RW | RW | |||||
| directfs_in_queries | RW | RW | |||||
| external_locations | RW | RO | |||||
| workspace | RW | RO | RO | ||||
| workspace_objects | RW | ||||||
| azure_service_principals | RW | ||||||
| global_init_scripts | RW | ||||||
| pipelines | RW | RW | |||||
| groups | RW | RO | |||||
| table_size | RW | ||||||
| submit_runs | RW | ||||||
| policies | RW | RW | |||||
| migration_status | RW | RW | RW | ||||
| query_problems | RW | RW | |||||
| workflow_problems | RW | RW | |||||
| udfs | RW | RW | RO | ||||
| logs | RW | RW | RW | RW | RW | RW | RW |
| recon_results | RW | ||||||
| redash_dashboards | RW | RW | |||||
| lakeview_dashboards | RW | RW |
- RW - Read/Write, the job generates or updates the table.
- RO - Read Only
Inventory Database
_$inventory_.tables
Holds Inventory of all tables in all databases and their relevant metadata.
| Column | Datatype | Description | Comments |
|---|---|---|---|
| catalog | string | Original catalog of the table. hive_metastore by default | |
| database | string | Original schema of the table | |
| name | string | Name of the table | |
| object_type | string | MANAGED, EXTERNAL, or VIEW | |
| table_format | string | Table provider. Like delta or json or parquet. | |
| location | string | Location of the data for table | |
| view_text | nullable string | If the table is the view, then this column holds the definition of the view | |
| upgraded_to | string | Upgrade Target (3 level namespace) |
_$inventory_.grants
Inventory of all Table ACLs for tables indexed in tables table.
| Column | Datatype | Description | Comments |
|---|---|---|---|
| principal | string | User name, group name, or service principal name | |
| action_type | string | Name of GRANT action | |
| catalog | string | Original catalog of the table. hive_metastore by default | |
| database | Nullable string | Original schema of the table | |
| table | Nullable string | Name of the table | |
| view | Nullable string | Name of the view | |
| any_file | bool | Any file | |
| anonymous_function | string | Grant for the anonymous function |
_$inventory_.mounts
List of DBFS mount points.
| Column | Datatype | Description | Comments |
|---|---|---|---|
| name | string | Name of the mount point | |
| source | string | Location of the backing dataset | |
| instance_profile | Nullable string | This mount point is accessible only with this AWS IAM instance profile |
_$inventory_.permissions
Workspace object level permissions
| Column | Datatype | Description | Comments |
|---|---|---|---|
| object_id | string | Either: Group ID Workspace Object ID Redash Object ID Scope name | |
| supports | string | One of: AUTHORIZATION CLUSTERS CLUSTER_POLICIES DIRECTORIES EXPERIMENTS FILES INSTANCE_POOLS JOBS NOTEBOOKS PIPELINES REGISTERED_MODELS REPOS SERVING_ENDPOINTS SQL_WAREHOUSES | |
| raw_object_permissions | JSON | JSON-serialized response of: Generic Permissions Secret ACL Group roles and entitlements Redash permissions |
_$inventory_.jobs
Holds a list of all jobs with a notation of potential issues.
| Column | Datatype | Description | Comments |
|---|---|---|---|
| job_id | string | Job ID | |
| job_name | string | Job Name | |
| job_creator | string | UserID of the Job Creator | |
| compatible | int | 1 or 0, used for percentage reporting | |
| failures | string | List of issues identified by the assessment in JSON format |
_$inventory_.clusters
Holds a list of all clusters with a notation of potential issues.
| Column | Datatype | Description | Comments |
|---|---|---|---|
| cluster_id | string | Cluster Id | |
| cluster_name | string | Cluster Name | |
| cluster_creator | string | UserID of the Cluster Creator | |
| compatible | int | 1 or 0, used for percentage reporting | |
| failures | string | List of issues identified by the assessment in JSON format |
_$inventory_.external_locations
Holds a list of all external locations that will be required for the migration.
| Column | Datatype | Description | Comments |
|---|---|---|---|
| external_location | string | External Location URL |
_$inventory_.workspace_objects
Holds a list of all workspace objects (notebooks, directories, files, repos and libraries) from the workspace. This is used by the permission crawler.
| Column | Datatype | Description | Comments |
|---|---|---|---|
| object_id | string | unique id of the object | |
| object_type | string | type of object (NOTEBOOK, DIRECTORY, REPO, FILE, LIBRARY) | |
| path | string | full path of the object in the workspace | |
| language | string | language of the object (applicable for notebooks only) |
_$inventory_.redash_dashboards and _$inventory_.lakeview_dashboards
Holds a list of all Redash or Lakeview dashboards. This is used by the QueryLinter and Redash migration.
| Column | Datatype | Description | Comments |
|---|---|---|---|
| id | string | The ID for this dashboard. | |
| name | string | The title of the dashboard that appears in list views and at the top of the dashboard page. | |
| parent | string | The identifier of the workspace folder containing the object. | |
| query_ids | list[string] | The IDs of the queries referenced by this dashboard. | |
| tags | list[string] | The tags set on this dashboard. |