Introduction to GeoBrix

GeoBrix is a high-performance spatial processing library for Databricks. It ships two interchangeable execution tiers — a lightweight pure-Python/PySpark tier (no JAR, no init script, no native GDAL; runs on Serverless, standard/shared, Lakeflow, and ARM compute) and a heavyweight Scala/GDAL tier (GDAL on Apache Spark, for distributed processing on classic x86 clusters). Both register the same function names, so moving between them is a one-line import swap — and GeoBrix is progressively bringing the lightweight tier to full parity with the heavyweight one. Today the lightweight tier covers all of RasterX, all of VectorX (MVT, TIN surfaces, legacy-geometry migration), and all of GridX (CARTO quadbin, British National Grid (BNG), and custom grids). See Choosing an Execution Tier.

GeoBrix: 154 functions

  • RasterX: 107
  • GridX: 40
  • VectorX: 6
  • PMTiles: 1

GridX's 40 functions break down as BNG (23), Quadbin (10), and custom grids (7).
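The counts above can be cross-checked with a few lines of Python. This is only a bookkeeping sketch: the dictionary names are illustrative and are not part of any GeoBrix API, and the numbers are copied from this page.

```python
# Function counts as listed on this page (illustrative names, not a GeoBrix API).
package_counts = {"RasterX": 107, "GridX": 40, "VectorX": 6, "PMTiles": 1}
gridx_breakdown = {"BNG": 23, "Quadbin": 10, "custom grids": 7}

# The per-package counts should add up to the headline total.
total_functions = sum(package_counts.values())

# The GridX breakdown should add up to the GridX package count.
gridx_total = sum(gridx_breakdown.values())

print(total_functions, gridx_total)
```

Both sums agree with the figures quoted on this page (154 functions overall, 40 of them in GridX).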

Supported Databricks Runtimes

GeoBrix supports both current Databricks Runtime LTS releases — a single wheel + single JAR runs on both (Scala 2.13.16 matches both, the JAR is Java-17 bytecode that loads on both JVMs, and Spark is a provided dependency):

DBR LTS    Ubuntu   Spark   Python   Scala     Java   GeoBrix
17.3 LTS   24.04    4.0.0   3.12.3   2.13.16   17     ✅ Supported
18 LTS     24.04    4.1.0   3.12.3   2.13.16   21     ✅ Supported
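The "single JAR runs on both" claim rests on JVM backward compatibility: a JVM loads class files whose major version is no higher than its own. The sketch below illustrates that rule using the Java versions from the runtime rows above. The helper names are hypothetical, and the `release + 44` mapping is the standard class-file numbering (Java 17 is major version 61, Java 21 is 65), not anything GeoBrix-specific.

```python
def class_file_major(java_release: int) -> int:
    """Class-file major version for a Java release (Java 17 -> 61, Java 21 -> 65)."""
    return java_release + 44


def jar_loads_on(jar_target: int, jvm_release: int) -> bool:
    """A JVM accepts class files up to and including its own major version."""
    return class_file_major(jar_target) <= class_file_major(jvm_release)


# Java version per runtime, copied from the rows above.
RUNTIME_JAVA = {"17.3 LTS": 17, "18 LTS": 21}

# The page states that the GeoBrix JAR is Java-17 bytecode.
JAR_TARGET = 17

compat = {dbr: jar_loads_on(JAR_TARGET, java) for dbr, java in RUNTIME_JAVA.items()}
print(compat)
```

The reverse does not hold: a JAR compiled for Java 21 would not load on the Java 17 JVM, which is presumably why the older bytecode level is the common target.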

DBR 19 LTS is coming soon, built on Ubuntu 26.04. The lightweight tier (pure-Python, rasterio's bundled GDAL) will be unaffected; the heavyweight tier's native GDAL/OGR libraries are compiled against the cluster OS, so they will need to be rebuilt for the new base image.

Background

Now that the product's built-in Spatial SQL functions have reached public preview as of DBR 17.1, we are seeking to deliver the next generation of product-augmenting capabilities to help our customers. The GeoBrix project is a streamlined iteration of the existing, and quite popular, DBLabs Mosaic project.

Beyond just porting existing Mosaic code, GeoBrix is modernized with expressions designed to work with our Data Intelligence Platform. GeoBrix combines heavyweight (e.g., JAR) and lightweight (e.g., Python, SQL) code artifacts. It will also focus on techniques for using the Databricks platform more widely.

Why GeoBrix?

Databricks acquired MosaicML and has since built a product line, Mosaic AI, around it. Because the DBLabs Mosaic project shares that name, it has become clear that the project needs a new name, and that any existing Mosaic capabilities that compete with product investments need to be revamped as well.

If this were not the case, we would have simply iterated on DBLabs Mosaic in place, keeping the same name for what is now called GeoBrix. DBLabs Mosaic is in maintenance mode. The last version of Mosaic targets DBR 13.3 LTS, because the product introduced ST functions starting with private preview work in DBR 14. As a result, Mosaic has no awareness of advancements in recent runtimes, including product support for Spatial SQL and types, and it will be retired when DBR 13.3 reaches end of support (EoS) in August 2026.

GeoBrix Vision

Key Features

  • High Performance: Built on Apache Spark for distributed processing
  • GDAL-Powered: Leverages GDAL for heavyweight spatial operations
  • Databricks Native: Designed specifically for Databricks Runtime
  • Multi-Language Support: APIs available in Scala, Python, and SQL
  • Comprehensive Readers: Support for various geospatial file formats
  • Three Specialized Packages: RasterX, GridX, and VectorX for different spatial needs

What's Next?