
Installation

GeoBrix offers two execution tiers with different installation paths. See Choosing an Execution Tier for the tradeoffs.

Supported Databricks Runtimes

GeoBrix supports both current Databricks Runtime LTS releases:

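| DBR LTS  | Ubuntu | Spark | Python | Scala   | Java | Serverless env | GeoBrix      |
|----------|--------|-------|--------|---------|------|----------------|--------------|
| 17.3 LTS | 24.04  | 4.0.0 | 3.12.3 | 2.13.16 | 17   | 5+ (Py 3.12)   | ✅ Supported |
| 18 LTS   | 24.04  | 4.1.0 | 3.12.3 | 2.13.16 | 21   | 5+ (Py 3.12)   | ✅ Supported |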

A single wheel + single JAR runs on both: Scala 2.13.16 matches both runtimes, the JAR is compiled to Java-17 bytecode so it loads on both JVMs, and Spark is a provided dependency.
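
The bytecode claim can be checked against any JAR. The sketch below reads the class-file major version of each class in an archive. Per the JVM class-file format, 61 corresponds to Java 17 and 65 to Java 21, and a JVM loads classes whose major version is at or below its own. The helper names are illustrative and not part of GeoBrix, and version-specific classes under META-INF/versions/ (multi-release JARs) are skipped on purpose.

```python
import struct
import zipfile

# Class-file major versions for the JVMs of the two supported runtimes.
JAVA_17_MAJOR = 61
JAVA_21_MAJOR = 65


def max_class_major_version(jar_file):
    """Return the highest class-file major version in a JAR (0 if none found).

    `jar_file` is a path or a binary file-like object.
    Hypothetical helper for illustration, not a GeoBrix API.
    """
    highest = 0
    with zipfile.ZipFile(jar_file) as jar:
        for name in jar.namelist():
            # Skip non-classes and multi-release variants.
            if not name.endswith(".class") or name.startswith("META-INF/versions/"):
                continue
            header = jar.read(name)[:8]
            if len(header) < 8:
                continue
            # Header layout: u4 magic, u2 minor version, u2 major version (big-endian).
            magic, _minor, major = struct.unpack(">IHH", header)
            if magic == 0xCAFEBABE:
                highest = max(highest, major)
    return highest


def loads_on(jvm_major, jar_file):
    """True if the JAR contains classes and all of them target <= jvm_major."""
    return 0 < max_class_major_version(jar_file) <= jvm_major
```

Calling `loads_on(JAVA_17_MAJOR, path_to_jar)` and `loads_on(JAVA_21_MAJOR, path_to_jar)` on a JAR compiled to Java-17 bytecode should both return True, which is why one artifact can serve both runtimes.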

The Serverless env column is the minimum Serverless environment version for the lightweight tier: version 5+ provides Python 3.12, which the [light] dependencies require (Python ≥ 3.11). Older environment versions (Python 3.10) can't install geobrix[light]. Release notes for env v5: AWS · Azure · GCP.
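
A quick preflight in the notebook can confirm the interpreter meets the Python ≥ 3.11 floor before attempting the install. This is a minimal sketch using only the standard library; the function name is illustrative and not part of GeoBrix.

```python
import sys

# Floor stated above for the [light] dependencies.
MIN_PYTHON = (3, 11)


def python_supports_light(version_info=None):
    """True if the given (or current) interpreter version is >= 3.11."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_supports_light():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} is below 3.11; "
        "switch to Serverless environment version 5+ before installing geobrix[light]."
    )
```

Running the check first gives a clearer message than the dependency-resolution error pip would otherwise report.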

DBR 19 LTS is coming soon

DBR 19 LTS will be built on Ubuntu 26.04. The lightweight tier (pure Python, using rasterio's bundled GDAL) will be unaffected. The heavyweight tier's native GDAL/OGR libraries are compiled against the cluster OS, so they will need to be rebuilt for the new base image.
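
Because the heavyweight tier depends on the base image, it can help to confirm which Ubuntu release a cluster is actually running. The sketch below parses the standard /etc/os-release file; the helper names are illustrative, and it assumes the file is readable from the driver.

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        fields[key.strip()] = value.strip().strip('"').strip("'")
    return fields


def ubuntu_version_from(text):
    """Return VERSION_ID (e.g. '24.04') if the OS is Ubuntu, else None."""
    fields = parse_os_release(text)
    if fields.get("ID") != "ubuntu":
        return None
    return fields.get("VERSION_ID")


# Usage on a cluster (assumes /etc/os-release exists and is readable):
# with open("/etc/os-release") as f:
#     print("Ubuntu", ubuntu_version_from(f.read()))
```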

The lightweight tier is a single Python wheel installed with the light extra — no init script, no JAR, no native GDAL bundle (rasterio's bundled GDAL does the work). It runs on serverless compute, standard (shared) clusters, Lakeflow declarative pipelines, and ARM. One wheel covers the whole lightweight tier: RasterX (the full rst_* set, via databricks.labs.gbx.pyrx), VectorX (via databricks.labs.gbx.pyvx), and GridX quadbin (via databricks.labs.gbx.pygx).

The wheel ships as a GitHub release artifact for GeoBrix 0.4.0+ — it is not published to PyPI. Install it from a Unity Catalog Volume so the same path works on both Serverless and Classic compute:

  1. Download geobrix-<version>-py3-none-any.whl from the GeoBrix releases (0.4.0 or later).
  2. Stage it in a Unity Catalog Volume your compute can read, e.g. /Volumes/<catalog>/<schema>/<volume>/geobrix/.
  3. Install it — either notebook-scoped with the %pip magic (installs across the whole cluster for the notebook session; plain pip installs only on the driver), or as a cluster-scoped library pointing at the same Volume path (works on Serverless and Classic):
%pip install "geobrix[light] @ file:///Volumes/<catalog>/<schema>/<volume>/geobrix/geobrix-<version>-py3-none-any.whl"
Serverless install requirements
  • Use the quoted geobrix[light] @ file://… form. Install with the PEP 508 named form above, wrapped in quotes as a single argument. Do not put the extra on the path ('/Volumes/…/geobrix-<version>-py3-none-any.whl[light]'): on Serverless, %pip writes the requirement to a file including the surrounding quotes, so pip reads [light] as part of the filename and fails with "Expected package name at the start of dependency specifier." The named form installs cleanly on Serverless, standard/shared, and ARM.
  • Use environment version 5+ (Python 3.11+). Set the notebook/job environment version to 5 or later (Python 3.12). The [light] dependencies — notably rio-tiler (≥ 9) — require Python ≥ 3.11, so older default Serverless environments (Python 3.10) cannot resolve them and the install fails with "Could not find a version that satisfies the requirement rio-tiler…". Set it in the Environment side panel, or with environment_version: "5" on a serverless job task. Release notes: AWS · Azure · GCP.
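
The named requirement can also be assembled programmatically, which avoids the misplaced-extra mistake described above. This is a minimal sketch: the helper name is hypothetical, the filename pattern assumes a plain three-part version number, and the catalog, schema, and volume names in the usage note are placeholders.

```python
import re
from pathlib import PurePosixPath

# Wheel filename pattern from step 1; assumes a MAJOR.MINOR.PATCH version.
WHEEL_RE = re.compile(r"^geobrix-(\d+)\.(\d+)\.(\d+)-py3-none-any\.whl$")
MIN_VERSION = (0, 4, 0)  # first release that ships the wheel


def light_requirement(wheel_path):
    """Build the PEP 508 direct reference for a staged GeoBrix wheel.

    Hypothetical helper, not part of GeoBrix.
    """
    path = PurePosixPath(wheel_path)
    if not path.is_absolute():
        raise ValueError("wheel_path must be absolute, e.g. under /Volumes/")
    match = WHEEL_RE.match(path.name)
    if match is None:
        raise ValueError(f"unexpected wheel filename: {path.name}")
    version = tuple(int(part) for part in match.groups())
    if version < MIN_VERSION:
        raise ValueError("the light wheel ships with GeoBrix 0.4.0 or later")
    # The extra goes on the package name, never on the file path.
    return f"geobrix[light] @ file://{path}"
```

For example, `light_requirement("/Volumes/main/geo/libs/geobrix/geobrix-0.4.0-py3-none-any.whl")` yields a string that can be pasted, in double quotes, after `%pip install`.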

Then import the package(s) you need and (optionally) register their SQL functions. Each light package exposes the same functions / register(spark) pattern:

from databricks.labs.gbx.pyrx import functions as rx  # RasterX (rst_*)
# from databricks.labs.gbx.pyvx import functions as vx # VectorX (st_*)
# from databricks.labs.gbx.pygx import functions as gx # GridX quadbin (quadbin_*)

# optional — only needed to call the SQL functions:
rx.register(spark)

# Light DataSource readers (raster_gbx, gtiff_gbx, etc.) are Python DataSource
# V2 and aren't auto-discovered — register them explicitly when you need them:
# from databricks.labs.gbx.ds import register
# register.register(spark)

Verify the lightweight tier

Confirm the wheel imports, rasterio's bundled GDAL is present, and a function runs end to end (no JAR, no init script). The check below exercises RasterX (rst_*); VectorX (pyvx, st_*) and GridX quadbin (pygx, quadbin_*) ship in the same wheel and register and run the same way:

import rasterio
from databricks.labs.gbx.pyrx import functions as rx

print("rasterio", rasterio.__version__, "| bundled GDAL", rasterio.__gdal_version__)

# Functional check: load a GeoTIFF and read its properties through pyrx.
tiles = (
    spark.read.format("binaryFile")
    .load("/Volumes/<catalog>/<schema>/<volume>/path/to/*.tif")
    .select(rx.rst_fromcontent("content", "GTiff").alias("tile"))
)
tiles.select(
    rx.rst_width("tile").alias("width"),
    rx.rst_height("tile").alias("height"),
    rx.rst_srid("tile").alias("srid"),
).show()

If the import succeeds and the query returns raster dimensions, the lightweight tier is ready. The rest of this page covers the heavyweight tier.