Installation
GeoBrix currently offers heavy-weight, distributed APIs, primarily written in Scala for Spark with additional language bindings for PySpark and Spark SQL.
Prerequisites
- Databricks Runtime 17.3 LTS. The init script's pinned GDAL package targets the Ubuntu 24.04 (noble) base used by DBR 17.3 LTS; other runtimes use different bases and will either fail to install or silently fall back to an unpinned GDAL.
- Classic Databricks cluster (not Serverless)
- GDAL native libraries (installed by the init script below)
Installation Steps
1. Download GeoBrix Artifacts
GeoBrix requires the following artifacts:
- JAR file: geobrix-*-jar-with-dependencies.jar
- Shared Object: libgdalalljni.so (GDAL native library)
- Python Wheel: geobrix-*-py3-none-any.whl
These are currently delivered via Releases artifacts.
2. Upload to Databricks Volume
- Create or use an existing Databricks Volume
- Upload the following files to your Volume (* stands for the version):
  - geobrix-*-jar-with-dependencies.jar
  - libgdalalljni.so
  - geobrix-*-py3-none-any.whl
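Before moving on, it can help to confirm that all three artifacts actually landed in the Volume. The following is a minimal sketch, not part of GeoBrix: missing_artifacts is a hypothetical helper, and the wheel pattern is deliberately loosened with a leading * because this page refers to the wheel under two different prefixes.

```python
from pathlib import Path

# Glob patterns taken from the artifact list above; "*" stands for the version.
# Assumption: the wheel pattern starts with "*" to tolerate a name prefix.
REQUIRED_PATTERNS = [
    "geobrix-*-jar-with-dependencies.jar",
    "libgdalalljni.so",
    "*geobrix-*-py3-none-any.whl",
]


def missing_artifacts(volume_dir):
    """Return the patterns that match no file directly inside volume_dir."""
    root = Path(volume_dir)
    return [pattern for pattern in REQUIRED_PATTERNS if not any(root.glob(pattern))]
```

On a cluster with the Volume mounted, calling missing_artifacts with your own Volume path should return an empty list once the upload is complete.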
3. Create Init Script
GeoBrix requires GDAL natives, which are currently best installed via an init script on a classic cluster.
- Use the init script from the repo: scripts/geobrix-gdal-init.sh (shown below)
- Modify the VOL_DIR variable to point to your Volume location where you uploaded the artifacts
- Upload the modified init script to your Databricks Volume
What the script does (see Security for the full rationale):
- Adds the UbuntuGIS PPA using an inline-embedded signing key, verified against the expected fingerprint (UBUNTUGIS_FPR) before any package install.
- Installs the pinned GDAL package version (GDAL_PPA_VERSION) from the verified PPA so the native version matches the JNI binding shipped in the JAR.
- Installs the Python GDAL wheel from source against the pinned headers (no opportunistic wheel from PyPI).
- Copies libgdalalljni.so and geobrix-*-jar-with-dependencies.jar from VOL_DIR into place.
VOL_DIR is the only line you should edit.
#!/bin/bash
#
# Databricks cluster init script. This file is uploaded to a Workspace
# volume and run by the cluster on boot — the ubuntugis PPA signing key is embedded
# inline below. Keep this file self-contained.
set -euo pipefail
sudo add-apt-repository -y "deb http://archive.ubuntu.com/ubuntu $(lsb_release -sc)-backports main universe multiverse restricted"
sudo add-apt-repository -y "deb http://archive.ubuntu.com/ubuntu $(lsb_release -sc)-updates main universe multiverse restricted"
sudo add-apt-repository -y "deb http://archive.ubuntu.com/ubuntu $(lsb_release -sc)-security main multiverse restricted universe"
sudo add-apt-repository -y "deb http://archive.ubuntu.com/ubuntu $(lsb_release -sc) main multiverse restricted universe"
# - add ubuntugis PPA with fingerprint-pinned GPG key.
# We do NOT call `add-apt-repository ppa:ubuntugis/ubuntugis-unstable`:
# that helper auto-installs whatever key Launchpad serves (TOFU). Instead
# the signing key is embedded below and rejected unless its fingerprint
# matches UBUNTUGIS_FPR — so a tampered cluster image, a swapped key
# block in this script, or a Launchpad MITM all fail closed before any
# GDAL package gets pulled through the PPA's signing chain.
#
# Expected fingerprint sourced from Launchpad's signing_key_fingerprint API:
# curl https://launchpad.net/api/1.0/~ubuntugis/+archive/ubuntu/ubuntugis-unstable \
# | jq -r .signing_key_fingerprint
# Re-verify on key bump and update the embedded block below in lockstep.
UBUNTUGIS_FPR="2EC86B48E6A9F326623CD22FFF0E7BBEC491C6A1"
UBUNTUGIS_KEYRING="/etc/apt/keyrings/ubuntugis.gpg"
UBUNTUGIS_LIST="/etc/apt/sources.list.d/ubuntugis-unstable.list"
sudo apt-get install -y software-properties-common gpg
UBUNTUGIS_KEY_ASC="$(mktemp)"
trap 'rm -f "$UBUNTUGIS_KEY_ASC"' EXIT
cat > "$UBUNTUGIS_KEY_ASC" <<'UBUNTUGIS_KEY_EOF'
-----BEGIN PGP PUBLIC KEY BLOCK-----
Comment: Hostname:
Version: Hockeypuck 2.2

xsFNBGYzcWcBEACZy6Cs/d6xE5dYOX7MY9nMNGALohNGal+lT/gvuU16NYrXV/qs
7NyOLjUmFuEflrbMbOuqW6XaK8FRCkOCMbJAGcxlieLK7e2oV472rw/fMVJYk9du
ebQoYcNfB4Pylb4xpZvG9+zwWWICMZG8JlcV+hLWAC5L9WY/6GycRZMarukPntY5
f9r6KMohMtcpiqjtpIccTKbxLwB/wPRTri2+clSG1PABhIhLzQqQv2qIlsVGjt0r
eP1DjoNin0yrBrsNZysVSEQW4/3KEW4PN4VqhoGwrNPygN0dwCyQ/yn+ulFhwzgI
KTGlDkEEn+ozONMIccWjGxck3SCjCCH2QO3UwX10AifChgFoms5mKuE0MLYRqgWK
wPGly5n5yBOhz8ctXRQ7L0613hJ6GiBkZMqOTIdXY4NT52e6tsXTaJ/Jx4VwFg64
j0qJZ5TE1Z//kSTpEmEELsq0rl3Iz9gxeMqalVhoJXBRKb7MMwJn4p0rjbhp9jWj
4tN26LqwLfCNVPrEomUG7ERG6Rs45CfPOh3bLCm9yd3++bcAGN8ne3F1YABY/kyf
bXtjQ/ihhpFMbqUtcUkEIS8xfbnwdORvH+wmaBbSpaMW1JCJNmM3KsdzY16PsckO
Z7YHAqZacirlNN/dZbsFLow958ssjwgGYquVNhiBckE2vIzObdrcHqsx8QARAQAB
zRtMYXVuY2hwYWQgUFBBIGZvciBVYnVudHVHSVPCwY4EEwEKADgWIQQuyGtI5qnz
JmI80i//Dnu+xJHGoQUCZjNxZwIbAwULCQgHAgYVCgkICwIEFgIDAQIeAQIXgAAK
CRD/Dnu+xJHGoY8RD/9nviKd8w55J7MxUhI3s6ka15BXqKamZ7zmVn+nYNU9QY3V
HK3gh1Z1SytNcS572AZuym1dTGe779zfIchQ6VN8aFwhLTKMyg4FBGP0opYCPEG1
y2wwcSTNeOyiwPBECYae0tXi9btYB3GswO30GaQXTpKAy0LDaHSm4zfUkKfnofAQ
lZdznTXgxUJqSn8fzFMIY4bDEImgRp1TS5sIavKQKpFLNJKP1bnCl1/YSTm67SOx
rH1Q0URKJIRsgfj/L4Rt1SW8EZqFb9tDHfcfGSpdvD7LWe7NMVYHBn9CUsSMbfW8
SwBkUAw/6l0ODeKmUNqSbYTia0GBhX/LwsFrc3cydSlX8NZSKwGztM9F+tOHXaS9
eVap7Ow6dTuaw/fyJIf57PAVSAkmJ41nSAygr4XaleDTJXHE4T0tHWusb3AXdKUR
4bSthlSQKrFnYnLTBKuN5ijQ5TLzFbMjD22JvFpSQeQeGYkjNfmLOcLU1p4pWCM+
z5EgjOJcGPbjFqlEkMraUPONJuzFdAnx6d7OdGY9TWserSuI8+392mXhU+9SiS8T
nrbb0Y/WYJmcqkQRmwe6eCs7G+3UJhulUKWEYm37255aNiHKJl+FZEgZ9Zh5tsN/
RrcIov5r9ncdNv8VP6c6IkOCbH9bOo4jto02TV/WMACEcXCVU7nZCdbCYpHCqA==
=cYNc
-----END PGP PUBLIC KEY BLOCK-----
UBUNTUGIS_KEY_EOF
actual_fpr=$(gpg --show-keys --with-fingerprint --with-colons "$UBUNTUGIS_KEY_ASC" \
  | awk -F: '/^fpr:/ {print $10; exit}')
if [ -z "$actual_fpr" ] || [ "$actual_fpr" != "$UBUNTUGIS_FPR" ]; then
  echo "ubuntugis key fingerprint mismatch: got='${actual_fpr}' expected='${UBUNTUGIS_FPR}'" >&2
  exit 1
fi
sudo install -d -m 0755 /etc/apt/keyrings
sudo gpg --dearmor --yes -o "$UBUNTUGIS_KEYRING" < "$UBUNTUGIS_KEY_ASC"
sudo chmod 0644 "$UBUNTUGIS_KEYRING"
CODENAME="$(lsb_release -sc)"
{
  echo "deb [signed-by=${UBUNTUGIS_KEYRING}] https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu ${CODENAME} main"
  echo "deb-src [signed-by=${UBUNTUGIS_KEYRING}] https://ppa.launchpadcontent.net/ubuntugis/ubuntugis-unstable/ubuntu ${CODENAME} main"
} | sudo tee "$UBUNTUGIS_LIST" >/dev/null
sudo apt-get update -y
# Update VOL_DIR to point at the Unity Catalog volume where you've staged
# libgdalalljni.so + geobrix-*-jar-with-dependencies.jar before deploying
# this script to a cluster.
VOL_DIR="/Volumes/geospatial_docs/gdal_artifacts/noble/geobrix"
if [ ! -d "$VOL_DIR" ]; then
  echo "VOL_DIR not found: $VOL_DIR" >&2
  echo "Edit this script and set VOL_DIR to the volume containing the GeoBrix native + JAR artifacts before re-running." >&2
  exit 1
fi
# install natives — keep GDAL_PPA_VERSION in sync with CI (.github/actions/*/action.yml).
# https://gdal.org/en/stable/api/python/python_bindings.html
# https://medium.com/@felipempfreelancer/install-gdal-for-python-on-ubuntu-24-04-9ed65dd39cac
GDAL_PPA_VERSION="3.11.4+dfsg-1~noble0"
sudo apt-get -o DPkg::Lock::Timeout=-1 install -y unixodbc libcurl3-gnutls libsnappy-dev libopenjp2-7
sudo apt-get -o DPkg::Lock::Timeout=-1 install -y \
  "libgdal-dev=${GDAL_PPA_VERSION}" \
  "gdal-bin=${GDAL_PPA_VERSION}" \
  "python3-gdal=${GDAL_PPA_VERSION}"
# pip install GDAL (match deps to DBR 17.3 LTS — see release notes for the runtime).
# Bootstrap pins must match .github/actions/{scala,python}_build/action.yml — keep these in sync.
pip install --upgrade pip==25.0.1 setuptools==74.0.0 wheel==0.45.1 cython==3.0.12
pip install numpy==2.1.3
export GDAL_CONFIG=/usr/bin/gdal-config
# --no-binary :all: forces sdist compile against the apt-installed libgdal
# headers above (signed by the fingerprint-pinned ubuntugis key), rather
# than accepting whatever pre-built wheel PyPI happens to serve.
pip install --no-cache-dir --no-binary :all: --force-reinstall GDAL[numpy]=="$(gdal-config --version).*"
# copy JNI and JAR. Quote VOL_DIR so paths with spaces don't break under
# `set -u`; the glob expands after substitution.
cp "$VOL_DIR/libgdalalljni.so" /usr/lib/libgdalalljni.so
cp "$VOL_DIR"/geobrix-*-jar-with-dependencies.jar /databricks/jars
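Step 3 asks for a single change to the script: the VOL_DIR assignment. If you prefer to make that edit programmatically rather than by hand, a minimal Python sketch could look like this. set_vol_dir is a hypothetical helper, not something GeoBrix ships, and it assumes the script contains exactly one line starting with VOL_DIR=, as the listing above does.

```python
import re

# Matches the assignment line only; comment lines mentioning VOL_DIR start
# with "#" and the later "$VOL_DIR" uses are not at the start of a line.
_VOL_DIR_LINE = re.compile(r"^VOL_DIR=.*$", re.MULTILINE)


def set_vol_dir(script_text, volume_path):
    """Return script_text with its single VOL_DIR assignment pointing at volume_path."""
    if any(ch in volume_path for ch in '"$`\\\n'):
        raise ValueError("volume_path contains characters unsafe inside double quotes")
    if len(_VOL_DIR_LINE.findall(script_text)) != 1:
        raise ValueError("expected exactly one VOL_DIR= assignment in the script")
    # A function replacement avoids any backslash interpretation in the path.
    return _VOL_DIR_LINE.sub(lambda _m: f'VOL_DIR="{volume_path}"', script_text)
```

Read the script text from your local copy, pass your own Volume path, and write the result back before uploading it. The helper refuses to guess if the assignment is missing or duplicated, so an unexpected script layout fails loudly rather than producing a half-edited file.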
4. Configure Cluster
Add Init Script
- Go to your cluster configuration
- Navigate to Advanced Options > Init Scripts
- Add the init script path from your Volume
Add Libraries
- In your cluster configuration, go to Libraries
- Click Install new
- Install the Python wheel:
  - Select Upload > Python Whl
  - Select the dblabs_geobrix-*-py3-none-any.whl file (version should match the JAR) from your Volume, e.g. the VOL_DIR location
- Install the JAR:
  - The JAR is installed via the init script, so nothing further is needed
Cluster Configuration Example
{
  "cluster_name": "geobrix-cluster",
  "spark_version": "17.3.x-scala2.13",
  "node_type_id": "Standard_DS3_v2",
  "num_workers": 2,
  "init_scripts": [
    {
      "volumes": {
        "destination": "/Volumes/catalog/schema/volume_name/geobrix-gdal-init.sh"
      }
    }
  ]
}
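If you generate cluster definitions from code, the same configuration can be built as a plain dictionary and serialized to JSON. This is only a sketch: cluster_spec is a hypothetical helper, the default values are copied from the example above, and the node type should be swapped for one available in your workspace.

```python
import json


def cluster_spec(init_script_path, cluster_name="geobrix-cluster",
                 node_type_id="Standard_DS3_v2", num_workers=2):
    """Build a cluster definition mirroring the JSON example on this page."""
    if not init_script_path.startswith("/Volumes/"):
        raise ValueError("init script is expected to live in a Volume")
    return {
        "cluster_name": cluster_name,
        "spark_version": "17.3.x-scala2.13",  # DBR 17.3 LTS, as required above
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        "init_scripts": [{"volumes": {"destination": init_script_path}}],
    }


if __name__ == "__main__":
    spec = cluster_spec("/Volumes/catalog/schema/volume_name/geobrix-gdal-init.sh")
    print(json.dumps(spec, indent=2))
```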
5. Start the Cluster
Start or restart your cluster to ensure the init script runs and all libraries are loaded.
Verification
To verify that GeoBrix is installed correctly (the Scala API is similar to the Python one shown here):
Python
from databricks.labs.gbx.rasterx import functions as rx
# Register functions
rx.register(spark)
# List registered functions
spark.sql("SHOW FUNCTIONS LIKE 'gbx_rst_*'").show()
+--------------------+
|function |
+--------------------+
|gbx_rst_asformat |
|gbx_rst_avg |
|gbx_rst_bandmetadata|
|gbx_rst_boundingbox |
|... |
+--------------------+
SQL
-- List all GeoBrix functions
SHOW FUNCTIONS LIKE 'gbx_*';
-- Describe a specific function
DESCRIBE FUNCTION EXTENDED gbx_rst_boundingbox;
+--------------------+
|function |
+--------------------+
|gbx_rst_asformat |
|gbx_rst_avg |
|gbx_rst_boundingbox |
|gbx_bng_cellarea |
|... |
+--------------------+
-- DESCRIBE FUNCTION EXTENDED gbx_rst_boundingbox
Function: gbx_rst_boundingbox
Type: ...
If you see the GeoBrix functions listed, your installation is successful!
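Rather than eyeballing the listing, you could compare it against the functions you expect. The following sketch uses a hypothetical helper, missing_functions, and only the function names that appear in the sample output above; the full set of GeoBrix functions is not enumerated here.

```python
# Names taken from the sample output on this page, not an exhaustive list.
EXPECTED = [
    "gbx_rst_asformat",
    "gbx_rst_avg",
    "gbx_rst_bandmetadata",
    "gbx_rst_boundingbox",
]


def missing_functions(registered, expected=EXPECTED):
    """Return the expected names absent from registered (case-insensitive)."""
    present = {name.lower() for name in registered}
    return sorted(name for name in expected if name.lower() not in present)


# In a notebook (assumes `spark` exists and rx.register(spark) has been called):
# names = [row["function"] for row in spark.sql("SHOW FUNCTIONS LIKE 'gbx_*'").collect()]
# print(missing_functions(names))
```

An empty result means every expected function is registered.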
Troubleshooting
GDAL Library Issues
The init script installs GDAL from the UbuntuGIS PPA using a pinned package version (GDAL_PPA_VERSION) and a fingerprint-verified signing key (UBUNTUGIS_FPR). Common failure modes:
- Fingerprint mismatch: the script exits with "ubuntugis key fingerprint mismatch: got='…' expected='…'". Either the embedded key block or UBUNTUGIS_FPR has been tampered with, or Launchpad rotated its signing key. Re-download the init script from the release artifacts and re-verify the fingerprint at Launchpad's API. See the Security page for the rationale.
- Pinned package version unavailable: apt-get install fails with "Version…not found". The PPA has retired that build. Use a newer GeoBrix release whose init script pins a still-available version, rather than locally editing the pin (the JNI binding in the JAR must match).
- DBR base image mismatch: the script targets the Ubuntu 24.04 (noble) base used by DBR 17.3 LTS. Other runtimes use different bases, so run the script on DBR 17.3 LTS.
- General checks: confirm the init script ran successfully (driver logs) and that libgdalalljni.so is in /usr/lib on the driver and executors. Adding /usr/lib to LD_LIBRARY_PATH is rarely needed.
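To make the fingerprint check and the native-library check concrete, here is a small Python sketch that mirrors what the init script does in shell. The function names are hypothetical. The parsing follows the script's awk expression (field 10 of the first fpr: record in gpg's colon-delimited output), and the fingerprint and library path are copied from the script on this page.

```python
from pathlib import Path

# Values copied from the init script on this page.
EXPECTED_FPR = "2EC86B48E6A9F326623CD22FFF0E7BBEC491C6A1"
JNI_PATH = "/usr/lib/libgdalalljni.so"


def first_fingerprint(colon_output):
    """Return field 10 of the first 'fpr:' record, or '' if there is none."""
    for line in colon_output.splitlines():
        if line.startswith("fpr:"):
            fields = line.split(":")
            return fields[9] if len(fields) > 9 else ""
    return ""


def fingerprint_ok(colon_output, expected=EXPECTED_FPR):
    """True only when a fingerprint was found and it equals the expected one."""
    actual = first_fingerprint(colon_output)
    return actual != "" and actual == expected


def jni_present(path=JNI_PATH):
    """True when the GDAL JNI shared object exists at the given path."""
    return Path(path).is_file()
```

Feed fingerprint_ok the text printed by gpg --show-keys --with-fingerprint --with-colons on the key file, and run jni_present on the driver and on an executor to confirm the init script copied the library on every node.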
Function Registration Issues
If functions are not executing (uncommon):
- Verify the JAR is in the VOL_DIR used in the init script
- Check that you've called the .register(spark) method for Spark SQL bindings
- Restart the Python kernel or re-attach to cluster in the SQL notebook
Permission Issues
If you encounter permission errors:
- Ensure you have read access to the Volume
- Check that the init script has execute permissions
- Verify cluster policies allow init scripts
Next Steps
- Follow the Quick Start Guide to begin using GeoBrix
- Explore the Packages documentation