Installation
Databricks Labs Remorph

Remorph
Remorph is a toolkit for migrating to Databricks. It provides two functions, Transpile and Reconcile: Transpile translates source code into Databricks SQL, and Reconcile compares migrated data against its source to surface discrepancies.
Transpile
Transpile is a self-contained SQL parser, transpiler, and validator that interprets a range of SQL inputs and generates syntactically and semantically correct SQL in the Databricks SQL dialect. It automates the migration of SQL scripts from other platforms to Databricks SQL. Currently, Snowflake is the only supported source platform. Transpile builds on the open-source SQLGlot library.
Transpile is written entirely in Python and backed by a robust test suite. It highlights syntax errors and, depending on configuration, either warns about dialect incompatibilities or raises alerts for them.
Transpiler Design Flow:
Reconcile
Reconcile is an automated tool for reconciling source data with target data residing on Databricks. Currently, the supported data sources are Snowflake, Oracle, and other Databricks tables. Reconcile helps users identify discrepancies between the source and the Databricks target.
Environment Setup
Pre-requisites
Databricks CLI
- Ensure that the Databricks Command-Line Interface (CLI) is installed on your machine. Installation instructions are available for Linux, macOS, and Windows.
- Install Databricks CLI on Linux without brew
#!/usr/bin/env bash
# Install dependencies
apt update && apt install -y curl sudo unzip
# Install the Databricks CLI
curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/v0.242.0/install.sh | sudo sh
Databricks CLI
- Configure the Databricks CLI by executing the following command with the appropriate host and cluster details. profile_name is optional; if it is not provided, the DEFAULT profile is used.
databricks configure --host <host> --configure-cluster --profile <profile_name>
The --configure-cluster flag prompts you to select the cluster_id from the clusters available in your workspace.
Alternatively, set the DATABRICKS_CLUSTER_ID environment variable to the cluster ID you want your profile to use before running the databricks configure command.
export DATABRICKS_CLUSTER_ID=<cluster_id>
databricks configure --host <host> --profile <profile_name>
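The choice between the two configuration paths above can be sketched as a small shell helper. This is only an illustration: the function name print_configure_command, the example host, and the profile name my-profile are hypothetical, and the helper prints the command instead of running it so that it can be reviewed first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the "databricks configure" command that matches
# the current environment.
#   $1 = workspace host (required)
#   $2 = profile name (optional; DEFAULT is used when omitted)
print_configure_command() {
  if [ -n "${DATABRICKS_CLUSTER_ID:-}" ]; then
    # The cluster ID comes from the environment, so no cluster prompt is needed.
    echo "databricks configure --host $1 --profile ${2:-DEFAULT}"
  else
    # No cluster ID in the environment: ask the CLI to prompt for a cluster.
    echo "databricks configure --host $1 --configure-cluster --profile ${2:-DEFAULT}"
  fi
}

# Placeholder host and profile, for illustration only.
print_configure_command "https://example.cloud.databricks.com" "my-profile"
```

Passing the profile explicitly as DEFAULT has the same effect as omitting --profile, since DEFAULT is the profile used when none is given.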
Python
- Verify that Python 3.10 or later is installed on your machine.
Windows
- Install Python from here. Your Windows computer will need a shell environment (Git Bash or WSL).
macOS/Unix
- Use brew to install Python on macOS/Unix machines.
Installing Databricks CLI on macOS

Install Databricks CLI via curl on Windows

Check Python version on Windows, macOS, and Unix
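As a rough sketch of that check, the script below compares the interpreter's version against the 3.10 minimum. The function name python_version_ok is hypothetical, and the script assumes the interpreter is available as python3; on some Windows shells it may be named python instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given version (e.g. "3.11" or "3.10.4")
# is 3.10 or later.
python_version_ok() {
  pv_major="${1%%.*}"       # text before the first dot
  pv_rest="${1#*.}"         # text after the first dot
  pv_minor="${pv_rest%%.*}" # minor version, ignoring any patch level
  [ "$pv_major" -gt 3 ] || { [ "$pv_major" -eq 3 ] && [ "$pv_minor" -ge 10 ]; }
}

# Ask the interpreter for its own major.minor version, if it is on PATH.
if command -v python3 >/dev/null 2>&1; then
  detected="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
  if python_version_ok "$detected"; then
    echo "Python $detected meets the 3.10+ requirement"
  else
    echo "Python $detected is too old; install 3.10 or later" >&2
  fi
else
  echo "python3 not found on PATH" >&2
fi
```

Comparing the major and minor numbers as integers avoids the pitfall of a plain string comparison, which would rank 3.9 above 3.10.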

Install Transpile
Installation
Upon completing the environment setup, install Remorph by executing the following command:
databricks labs install remorph

Verify Installation
Verify the installation by running the following command. The installation succeeded if the command prints its help text:
databricks labs remorph transpile --help

Install Reconcile
Installation
Install Reconcile with the Databricks Labs CLI:
databricks labs install remorph

Verify Installation
Verify the installation by running the following command. The installation succeeded if the command prints its help text:
databricks labs remorph reconcile --help
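Both verification commands can be run in one pass with a small script. This is a minimal sketch, assuming that a zero exit status from --help is a good enough signal of a working installation; the function name check_help is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run "<command...> --help" quietly and report whether it
# exited with status 0.
check_help() {
  if "$@" --help >/dev/null 2>&1; then
    echo "OK: $*"
  else
    echo "FAILED: $*" >&2
    return 1
  fi
}

# Only attempt the checks when the Databricks CLI is on PATH.
if command -v databricks >/dev/null 2>&1; then
  status=0
  check_help databricks labs remorph transpile || status=1
  check_help databricks labs remorph reconcile || status=1
  [ "$status" -eq 0 ] || echo "One or more Remorph checks failed" >&2
else
  echo "databricks CLI not found on PATH" >&2
fi
```

The script discards the help output, so run the individual commands directly if you want to read the available options.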
