Default Name : rawactions
Stores the raw actions captured by ingesting the `_delta_log` JSON files of all tracked tables.
The schema has columns matching the Actions defined [here](https://github.com/delta-io/delta/blob/master/core/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala#L515)
and in the Delta Log Protocol, with the following additional fields:
Column | Type | Description |
---|---|---|
file_name | String | Name of the Delta log transaction json file. Eg. - dbfs:/user/hive/warehouse/sample.db/table1/_delta_log/00000000000000000025.json |
path | String | Path to the Delta table. Eg. - dbfs:/user/hive/warehouse/sample.db/table1 |
puid | String | Path Unique Identifier (Partition column) |
commit_version | Long | Transaction Commit Version of the data. Eg. - 25 |
commit_ts | Timestamp | Transaction Commit Timestamp. Eg. - 2021-06-16T18:08:20.000+0000 |
update_ts | Timestamp | Last update timestamp |
commit_date | Date | Transaction Commit Date (Partition column) |
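
The `file_name`, `path`, and `commit_version` examples above are related: the table path and the commit version can both be read off the log file name. The following is a minimal sketch of that relationship in plain Python, not DeltaOMS's actual implementation; the helper `parse_commit_file_name` and the `CommitFileInfo` type are hypothetical names introduced for illustration only.

```python
import re
from typing import NamedTuple

# Assumption: a commit file lives at <table path>/_delta_log/<zero-padded version>.json,
# as in the example values shown in the table above.
_COMMIT_FILE = re.compile(r"^(?P<path>.+)/_delta_log/(?P<version>\d+)\.json$")


class CommitFileInfo(NamedTuple):
    """Hypothetical container for the values derived from a log file name."""
    path: str
    commit_version: int


def parse_commit_file_name(file_name: str) -> CommitFileInfo:
    """Split a Delta log file name into the table path and the commit version."""
    match = _COMMIT_FILE.match(file_name)
    if match is None:
        raise ValueError(f"Not a Delta commit file: {file_name}")
    # int() drops the zero padding: "00000000000000000025" becomes 25.
    return CommitFileInfo(match.group("path"), int(match.group("version")))


info = parse_commit_file_name(
    "dbfs:/user/hive/warehouse/sample.db/table1/_delta_log/00000000000000000025.json"
)
print(info.path)            # dbfs:/user/hive/warehouse/sample.db/table1
print(info.commit_version)  # 25
```

Checkpoint files and other non-commit files under `_delta_log` do not match this pattern, so the sketch rejects them with a `ValueError`.
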
Default Name : commitinfosnapshots
DeltaOMS processes the raw actions from the Delta logs and creates a separate table for the commit information. The schema matches the history schema, with the following additional columns:
Column | Type | Description |
---|---|---|
file_name | String | Name of the Delta log transaction json file. Eg. - dbfs:/user/hive/warehouse/sample.db/table1/_delta_log/00000000000000000025.json |
path | String | Path to the Delta table. Eg. - dbfs:/user/hive/warehouse/sample.db/table1 |
puid | String | Path Unique Identifier (Partition column) |
commit_version | Long | Transaction Commit Version of the data. Eg. - 25 |
commit_ts | Timestamp | Transaction Commit Timestamp. Eg. - 2021-06-16T18:08:20.000+0000 |
update_ts | Timestamp | Last update timestamp |
commit_date | Date | Transaction Commit Date (Partition column) |
Default Name : actionsnapshots
DeltaOMS ingests all the actions from the tracked Delta transaction logs. During processing, DeltaOMS extracts the Add/Remove actions, reconciles the AddFile and RemoveFile actions to build the snapshot at each commit_version for a table/path, and populates the action snapshots table.
This table lets you query the file snapshot of any tracked Delta path at a given point in time or commit version and get the data file details.
Column | Type | Description |
---|---|---|
data_path | String | Path to the Delta table. Eg. - dbfs:/user/hive/warehouse/sample.db/table1 |
puid | String | Path Unique Identifier (Partition column) |
commit_version | Long | Transaction Commit Version of the data. Eg. - 25 |
commit_ts | Timestamp | Transaction Commit Timestamp. Eg. - 2021-06-16T18:08:20.000+0000 |
commit_date | Date | Transaction Commit Date (Partition column) |
add_file | Struct | AddFile action describing a data file that is part of the snapshot |
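
The reconciliation of AddFile and RemoveFile actions described above can be illustrated with a small, self-contained sketch. This is a simplified model in plain Python, not the DeltaOMS implementation: actions are reduced to hypothetical `(commit_version, kind, file_path)` tuples, whereas the real `add_file` struct carries more fields, and the function name `build_snapshots` is an assumption made for this example.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical simplified action: (commit_version, "add" or "remove", data file path).
Action = Tuple[int, str, str]


def build_snapshots(actions: Iterable[Action]) -> Dict[int, List[str]]:
    """Return the sorted list of live data files at every commit version."""
    by_version: Dict[int, List[Action]] = {}
    for action in actions:
        by_version.setdefault(action[0], []).append(action)

    live: Set[str] = set()
    snapshots: Dict[int, List[str]] = {}
    # Replay commits in version order, carrying the live file set forward.
    for version in sorted(by_version):
        commit = by_version[version]
        # Assumption: a path is not both added and removed within one commit,
        # so applying removes before adds is safe.
        for _, kind, file_path in commit:
            if kind == "remove":
                live.discard(file_path)
        for _, kind, file_path in commit:
            if kind == "add":
                live.add(file_path)
        snapshots[version] = sorted(live)
    return snapshots


sample_actions: List[Action] = [
    (0, "add", "part-0.parquet"),
    (0, "add", "part-1.parquet"),
    (1, "remove", "part-0.parquet"),
    (1, "add", "part-2.parquet"),
]

for version, files in build_snapshots(sample_actions).items():
    print(version, files)
# 0 ['part-0.parquet', 'part-1.parquet']
# 1 ['part-1.parquet', 'part-2.parquet']
```

Each entry in the result corresponds conceptually to the set of `add_file` rows that share one `commit_version` for a given path in the action snapshots table.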