DLT-META CLI
Prerequisites:
- Python 3.8.0+
- Databricks CLI
Steps:
Install and authenticate Databricks CLI:
databricks auth login --host WORKSPACE_HOST
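To confirm authentication succeeded, you can list the configured profiles and fetch the current user (a quick sanity check using standard Databricks CLI commands):
# List configured authentication profiles
databricks auth profiles
# Verify workspace connectivity by fetching the authenticated user
databricks current-user me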
Install dlt-meta via Databricks CLI:
databricks labs install dlt-meta
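To verify the installation, you can list the Labs projects installed for your profile (assuming a recent CLI version with the labs command group):
# Show installed Databricks Labs projects; dlt-meta should appear in the list
databricks labs installed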
Clone dlt-meta repository:
git clone https://github.com/databrickslabs/dlt-meta.git
Navigate to project directory:
cd dlt-meta
Create Python virtual environment:
python -m venv .venv
Activate virtual environment:
source .venv/bin/activate
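On Windows, the activation command differs; for example:
# Command Prompt
.venv\Scripts\activate.bat
# PowerShell
.venv\Scripts\Activate.ps1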
Install required packages:
# Core requirements
pip install "PyYAML>=6.0" setuptools databricks-sdk
# Development requirements
pip install flake8==6.0 delta-spark==3.0.0 "pytest>=7.0.0" "coverage>=7.0.0" pyspark==3.5.5
# Integration test requirements
pip install "typer[all]==0.6.1"
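If you prefer to keep these pins in one place, a minimal sketch that installs the same packages from a local file (the file name requirements-local.txt is only an example, not part of the repository):
# Example pin file (name is illustrative)
cat > requirements-local.txt <<'EOF'
PyYAML>=6.0
setuptools
databricks-sdk
flake8==6.0
delta-spark==3.0.0
pytest>=7.0.0
coverage>=7.0.0
pyspark==3.5.5
typer[all]==0.6.1
EOF
# Install everything in one pass
pip install -r requirements-local.txt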
Set environment variables:
dlt_meta_home=$(pwd)
export PYTHONPATH=$dlt_meta_home
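To confirm the variable is set and visible to Python (a quick check, no project imports assumed):
# PYTHONPATH should point at the repository root
echo $PYTHONPATH
# The repository root should appear on Python's module search path
python -c "import sys; print([p for p in sys.path if 'dlt-meta' in p])"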
Onboard Job
Run onboarding using the dlt-meta CLI command:
databricks labs dlt-meta onboard
- The command will prompt you for onboarding details. If you have cloned the dlt-meta repository, you can accept the default values, which use the configuration from the demo folder.
The onboard CLI command will:
- Push code and data to your Databricks workspace
- Create an onboarding job
- Display a success message:
Job created successfully. job_id={job_id}, url=https://{databricks workspace url}/jobs/{job_id}
- The job URL will automatically open in your default browser. You can also check the job from the CLI, as sketched below.
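For example, to inspect the onboarding job and monitor its runs from the terminal (a sketch using standard Databricks CLI commands; substitute the job_id from the success message):
# Show the onboarding job definition
databricks jobs get <job_id>
# List runs for the job to monitor progress
databricks jobs list-runs --job-id <job_id>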
Once the onboarding job has finished, deploy the bronze and silver Lakeflow Declarative Pipeline using the command below.
DLT-META Lakeflow Declarative Pipeline:
Deploy the bronze and silver layers into a single pipeline:
databricks labs dlt-meta deploy
- The command will prompt you to provide pipeline details. Provide the details for the schema you specified in the steps above.
- The deploy CLI command will:
- Deploy a Lakeflow Declarative Pipeline to your Databricks workspace with the dlt-meta configuration (layer, group, dataflowSpec table details, etc.)
- Display a success message:
dlt-meta pipeline={pipeline_id} created and launched with update_id={pipeline_update_id}, url=https://{databricks workspace url}/#joblist/pipelines/{pipeline_id}
- The pipeline URL will automatically open in your default browser. You can also check the pipeline from the CLI, as sketched below.
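For example, to inspect the deployed pipeline and its updates from the terminal (a sketch using standard Databricks CLI commands; substitute the pipeline_id from the success message):
# Show the pipeline configuration and current state
databricks pipelines get <pipeline_id>
# List recent updates for the pipeline
databricks pipelines list-updates <pipeline_id>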