
Guidance for Oracle as a source

Driver

Option 1

  • Download ojdbc8.jar from Oracle: Download the ojdbc8.jar file from the official Oracle website. This driver is required for connectivity between Databricks and Oracle databases.

  • Install the JAR file on Databricks: After downloading, install the JAR file on your Databricks cluster. Refer to this page for instructions on uploading a JAR file, Python egg, or Python wheel to your Databricks workspace; a programmatic install is sketched below.
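
As a minimal sketch, assuming the databricks-sdk Python package with default authentication, a placeholder cluster ID, and a hypothetical DBFS path for the uploaded JAR (the Libraries UI achieves the same result):

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import Library

# Cluster ID and JAR path below are placeholders; substitute your own values.
w = WorkspaceClient()
w.libraries.install(
    cluster_id="0123-456789-example",
    libraries=[Library(jar="dbfs:/FileStore/jars/ojdbc8.jar")],
)
```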

Option 2

  • Install ojdbc8 library from Maven: Follow this guide to install the Maven library on a cluster. Refer to this document to obtain the Maven coordinates; an example coordinate is shown in the sketch below.
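
Under the same assumptions as the sketch above (databricks-sdk, placeholder cluster ID), the Maven variant only differs in the library definition; the version in the coordinate is illustrative:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import Library, MavenLibrary

# The ojdbc8 version shown is an example; pick the version appropriate for your Oracle database.
w = WorkspaceClient()
w.libraries.install(
    cluster_id="0123-456789-example",
    libraries=[Library(maven=MavenLibrary(coordinates="com.oracle.database.jdbc:ojdbc8:19.3.0.0"))],
)
```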

Installing the driver is a necessary step for comparing Oracle and Databricks data: it makes the required Oracle JDBC functionality available within the Databricks environment. A quick connectivity check is sketched below.
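
Once the driver is installed (via either option), a simple JDBC read confirms connectivity. This is a minimal sketch; the host, service name, table, and secret scope are placeholders:

```python
# Hypothetical Oracle connection details; replace the host, service, table, and secrets with your own.
oracle_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//oracle-host.example.com:1521/ORCLPDB1")
    .option("dbtable", "HR.EMPLOYEES")
    .option("user", dbutils.secrets.get(scope="oracle", key="user"))
    .option("password", dbutils.secrets.get(scope="oracle", key="password"))
    .option("driver", "oracle.jdbc.OracleDriver")
    .load()
)
oracle_df.limit(5).show()
```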

Commonly Used Custom Transformations

| source_type | data_type | source_transformation | target_transformation | source_value_example | target_value_example | comments |
|-------------|-----------|-----------------------|-----------------------|----------------------|----------------------|----------|
| Oracle | number(10,5) | `trim(to_char(coalesce(col_name,0.0), '99990.99999'))` | `cast(coalesce(col_name,0.0) as decimal(10,5))` | 1.00 | 1.00000 | Works for any precision and scale; adjust the format mask and decimal type accordingly. |
| Snowflake | array | `array_to_string(array_compact(col_name),',')` | `concat_ws(',', col_name)` | [1,undefined,2] | [1,2] | Use when "undefined" elements should be removed during migration (converts a sparse array to a dense array). |
| Snowflake | array | `array_to_string(array_sort(array_compact(col_name), true, true),',')` | `concat_ws(',', col_name)` | [2,undefined,1] | [1,2] | Use when "undefined" elements should be removed during migration and the remaining elements sorted. |
| Snowflake | timestamp_ntz | `date_part(epoch_second,col_name)` | `unix_timestamp(col_name)` | 2020-01-01 00:00:00.000 | 2020-01-01 00:00:00.000 | Converts timestamp_ntz to epoch seconds so that values match between Snowflake and Databricks. |