Analyzer Guide

Analyzer Insights

The Lakebridge Analyzer is built to scan and interpret metadata from ETL pipelines and SQL assets. Its analysis provides several key insights:

  • Job Complexity Assessment
    Analyzer evaluates the complexity of your ETL and SQL jobs. These metrics are fed into the Conversion Calculator to help estimate both software licensing costs and the engineering hours required for the migration.

  • Comprehensive Job Inventory
    It generates a full inventory of components such as mappings, programs, transformations, functions, and dynamic variables—giving you a clear picture of what exists in your legacy environment.

  • Cross-System Interdependency Mapping
    Analyzer identifies interdependencies between jobs, systems, and components — surfacing how different parts of your codebase interact. This is crucial for sequencing migration efforts, minimizing risk, and avoiding disruption during cutover planning.

Verify Installation

Verify the installation by running the command below; the installation succeeded if it prints the usage/help text for the analyze command:

 databricks labs lakebridge analyze --help

Preparation Step

Before using Analyzer, extract the metadata from the legacy system(s) by exporting it to a file system and placing it in a folder that Analyzer can access. For a SQL database, this typically means exporting the SQL files from the database platform. For ETL/ELT solutions, it means exporting their repository objects, usually in some kind of XML or JSON format. For examples and instructions for the metadata export process, please refer to the page Exporting Legacy Metadata.
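As a sketch, the exported files might be staged like this (the paths and the `informatica` subfolder are hypothetical; use whatever layout matches your sources):

```shell
# Stage exported legacy metadata in one folder tree that Analyzer can scan.
# Paths are hypothetical -- adjust to your environment.
mkdir -p ./legacy_extracts/informatica

# Copy the repository export (typically XML or JSON) into the staging folder,
# e.g.: cp /path/to/exports/*.xml ./legacy_extracts/informatica/
```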

Execution Pre-Set Up

Analyzer takes as inputs a folder path containing the source files to scan and the source technology type, then generates an Excel report with the analysis results for all files and subfolders.

Analyzer accepts the following optional inputs. If you pass any of them as command-line arguments, all three are required; if you pass none, Analyzer prompts for each one interactively.

  • Source directory [Optional] - Absolute path of the folder containing the legacy artifacts.
  • Report file name [Optional] - A custom report name or path, if you want to save the analysis report somewhere other than the source directory. IMPORTANT: provide the full path plus the filename without an extension, and ensure the target directory exists beforehand.
  • Source technology [Optional] - The underlying technology platform of the source files being analyzed.
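If you script Analyzer runs, the input rules above can be captured in a small pre-flight check before shelling out to the CLI. This is an illustrative sketch, not part of Lakebridge itself; the helper name and paths are hypothetical:

```python
from pathlib import Path

def build_analyze_args(source_dir: str, report_file: str, source_tech: str) -> list[str]:
    """Validate the three inputs (all required together) and assemble the CLI argv."""
    src = Path(source_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"source directory does not exist: {src}")
    report = Path(report_file)
    # The report's target directory must exist beforehand ...
    if not report.parent.is_dir():
        raise FileNotFoundError(f"report directory must exist beforehand: {report.parent}")
    # ... and the report file name is given without an extension.
    if report.suffix:
        raise ValueError("give the report file name without an extension")
    return [
        "databricks", "labs", "lakebridge", "analyze",
        "--source-directory", str(src),
        "--report-file", str(report),
        "--source-tech", source_tech,
    ]
```

The returned list can be handed to a process runner such as `subprocess.run`; keeping validation separate surfaces path mistakes before the CLI is invoked.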

Execution

Run the command below to start the analysis:

 databricks labs lakebridge analyze [--source-directory <absolute-path>] [--report-file <absolute-path>] [--source-tech <string>]
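For example, a concrete invocation might look like the following (the paths are hypothetical placeholders; the source technology is taken from the supported list below):

```shell
databricks labs lakebridge analyze \
  --source-directory /data/exports/teradata \
  --report-file /data/reports/teradata_analysis \
  --source-tech "Teradata"
```

Note that the report file is given without an extension and its directory must already exist.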

Supported dialects

  • ABInitio
  • ADF
  • Alteryx
  • Athena
  • BigQuery
  • Cloudera (Impala)
  • Datastage
  • Greenplum
  • Hive
  • IBM DB2
  • Informatica - Big Data Edition
  • Informatica - PC
  • Informatica Cloud
  • MS SQL Server
  • Netezza
  • Oozie
  • Oracle
  • Oracle Data Integrator
  • PentahoDI
  • PIG
  • Presto
  • PySpark
  • Redshift
  • SAPHANA - CalcViews
  • SAS
  • Snowflake
  • SPSS
  • SQOOP
  • SSIS
  • SSRS
  • Synapse
  • Talend
  • Teradata
  • Vertica