Analyzer Guide

Analyzer Insights

The Lakebridge Analyzer is built to scan and interpret metadata from ETL pipelines and SQL assets. Its analysis provides several key insights:

  • Job Complexity Assessment
    Analyzer evaluates the complexity of your ETL and SQL jobs. These metrics are fed into the Conversion Calculator to help estimate both software licensing costs and the engineering hours required for the migration.

  • Comprehensive Job Inventory
    It generates a full inventory of components such as mappings, programs, transformations, functions, and dynamic variables—giving you a clear picture of what exists in your legacy environment.

  • Cross-System Interdependency Mapping
    Analyzer identifies interdependencies between jobs, systems, and components — surfacing how different parts of your codebase interact. This is crucial for sequencing migration efforts, minimizing risk, and avoiding disruption during cutover planning.

Verify Installation

Verify the installation by running the command below; the installation succeeded if it prints the usage/help text for the analyze command:

 databricks labs lakebridge analyze --help

Preparation Step

Before using Analyzer, extract the metadata from the legacy system(s) by exporting it to a file system and placing it in a folder that Analyzer can access. For a SQL database, this typically means exporting the SQL files from the database platform. For ETL/ELT solutions, it means exporting their repository objects, usually in some kind of XML or JSON format. For examples and instructions for the metadata export process, please refer to the page Exporting Legacy Metadata.
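As a sketch, the exported files might be staged like this (the paths and the `informatica` subfolder are hypothetical; use whatever layout matches your sources):

```shell
# Stage exported legacy metadata in one folder tree that Analyzer can scan.
# Paths are hypothetical -- adjust to your environment.
mkdir -p ./legacy_extracts/informatica

# Copy the repository export (typically XML or JSON) into the staging folder,
# e.g.: cp /path/to/exports/*.xml ./legacy_extracts/informatica/
```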

Execution Pre-Set Up

Analyzer takes as inputs a folder path containing the source files to scan and the source technology type, then generates an Excel report with the analysis results for all files and subfolders.

Analyzer accepts the following optional inputs. If you pass any of them as command-line arguments, all three are required; if you pass none, Analyzer prompts for each one interactively.

  • Source directory [Optional] - Absolute path of the folder containing the legacy artifacts.
  • Report file name [Optional] - A custom report name or path, if you want to save the analysis report somewhere other than the source directory. IMPORTANT: provide the full path plus the filename without an extension, and ensure the target directory exists beforehand.
  • Source technology [Optional] - The underlying technology platform of the source files being analyzed.
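If you script Analyzer runs, the input rules above can be captured in a small pre-flight check before shelling out to the CLI. This is an illustrative sketch, not part of Lakebridge itself; the helper name and paths are hypothetical:

```python
from pathlib import Path

def build_analyze_args(source_dir: str, report_file: str, source_tech: str) -> list[str]:
    """Validate the three inputs (all required together) and assemble the CLI argv."""
    src = Path(source_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"source directory does not exist: {src}")
    report = Path(report_file)
    # The report's target directory must exist beforehand ...
    if not report.parent.is_dir():
        raise FileNotFoundError(f"report directory must exist beforehand: {report.parent}")
    # ... and the report file name is given without an extension.
    if report.suffix:
        raise ValueError("give the report file name without an extension")
    return [
        "databricks", "labs", "lakebridge", "analyze",
        "--source-directory", str(src),
        "--report-file", str(report),
        "--source-tech", source_tech,
    ]
```

The returned list can be handed to a process runner such as `subprocess.run`; keeping validation separate surfaces path mistakes before the CLI is invoked.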

Execution

Run the command below to start the analysis:

 databricks labs lakebridge analyze [--source-directory <absolute-path>] [--report-file <absolute-path>] [--source-tech <string>]
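For example, a concrete invocation might look like the following (the paths are hypothetical placeholders; the source technology is taken from the supported list below):

```shell
databricks labs lakebridge analyze \
  --source-directory /data/exports/teradata \
  --report-file /data/reports/teradata_analysis \
  --source-tech "Teradata"
```

Note that the report file is given without an extension and its directory must already exist.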

Supported dialects

  • ABInitio
  • ADF
  • Alteryx
  • Athena
  • BigQuery
  • Cloudera (Impala)
  • Datastage
  • Greenplum
  • Hive
  • IBM DB2
  • Informatica - Big Data Edition
  • Informatica - PC
  • Informatica Cloud
  • MS SQL Server
  • Netezza
  • Oozie
  • Oracle
  • Oracle Data Integrator
  • PentahoDI
  • PIG
  • Presto
  • PySpark
  • Redshift
  • SAPHANA - CalcViews
  • SAS
  • Snowflake
  • SPSS
  • SQOOP
  • SSIS
  • SSRS
  • Synapse
  • Talend
  • Teradata
  • Vertica