Functions Overview

GeoBrix provides four specialized spatial-processing packages (RasterX, GridX, VectorX, and PMTiles) plus VizX, a Python-only visualization layer. The four processing packages expose Scala, Python, and SQL APIs backed by the same Spark columnar expressions.

GeoBrix Vision

Available Packages

RasterX

Full-spectrum raster processing for Databricks: the successor to Mosaic raster, plus terrain analysis, spectral indices, tile publishing, and vector↔raster bridging.

Process GeoTIFF and other GDAL-supported raster formats
Raster algebra, transformations, clipping, reprojection
Metadata extraction, band operations, NoData handling
Resample, IDW interpolation, build overviews
Terrain analysis (slope, aspect, hillshade, TRI, TPI, roughness, color-relief)
Spectral indices (EVI, SAVI, NDWI, NBR, NDVI, plus a generic dispatcher)
Vector↔raster bridge (rasterize / polygonize)
Web-mercator XYZ tile output (to_webmercator, tilexyz, xyzpyramid)
Grid aggregations to H3 or CARTO quadbin v0 cells
COG output, proximity, contour, viewshed
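The web-mercator XYZ tile grid mentioned in the list above can be illustrated with the standard slippy-map tile formula. This is a minimal standalone sketch, not GeoBrix code: the function name `lonlat_to_xyz` is hypothetical, and it only shows how a longitude/latitude pair maps to a tile index at a given zoom.

```python
import math

# Latitude limit of the web-mercator projection (beyond this the map is cut off).
MAX_MERCATOR_LAT = 85.05112878


def lonlat_to_xyz(lon, lat, zoom):
    """Return the (x, y) index of the XYZ tile containing a lon/lat point.

    Hypothetical helper for illustration; uses the standard slippy-map formula.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    lat = max(min(lat, MAX_MERCATOR_LAT), -MAX_MERCATOR_LAT)
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or the extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


print(lonlat_to_xyz(0.0, 0.0, 1))
```

At zoom 0 the whole world is one tile, and each additional zoom level splits every tile into four.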

Raster Function Reference →

GridX

Spatial indexing across multiple discrete-global-grid systems: BNG (British National Grid) for Great Britain workloads and CARTO quadbin v0 for web-mercator-aligned analytics.

British National Grid (BNG) — 21 functions covering cell math, kring/kloop, polyfill, tessellation, aggregators, and generators
CARTO Quadbin v0 — 9 functions (pointascell, aswkb, centroid, resolution, polyfill, kring, tessellate, cellunion, distance); cell IDs are 64-bit Long, aligned with the web-mercator XYZ tile grid
Cell area calculations, k-ring / k-loop neighborhoods, geometry-to-cell tessellation
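To make the k-ring / k-loop distinction concrete, here is a small sketch on a plain square grid of integer `(x, y)` cells. It is not the GeoBrix implementation: the function names are hypothetical, and it assumes an 8-connected (Chebyshev-distance) neighborhood with no clamping at the grid edge, which may differ from how the BNG or quadbin functions define neighbors.

```python
def k_ring(x, y, k):
    """All cells within Chebyshev distance k of (x, y), including the center."""
    return {(x + dx, y + dy)
            for dx in range(-k, k + 1)
            for dy in range(-k, k + 1)}


def k_loop(x, y, k):
    """Only the cells at exactly Chebyshev distance k (the hollow outer ring)."""
    return {(cx, cy) for (cx, cy) in k_ring(x, y, k)
            if max(abs(cx - x), abs(cy - y)) == k}


# Under these assumptions a k-ring holds (2k + 1)^2 cells and a k-loop 8k cells.
print(len(k_ring(0, 0, 1)), len(k_loop(0, 0, 1)))
```

The k-ring is the filled neighborhood, while the k-loop is the difference between the k-ring and the (k-1)-ring.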

GridX Function Reference →

VectorX

Augments Databricks built-in ST_* functions with vector-tile encoding, TIN surface modeling, and legacy-Mosaic migration helpers.

Mapbox Vector Tile (MVT) encoding via st_asmvt aggregator
Vector tile pyramid via st_asmvt_pyramid generator — composes with pmtiles_agg for end-to-end publishing
TIN surface modeling — st_triangulate plus st_interpolateelevationbbox / st_interpolateelevationgeom (constrained-Delaunay triangulation and grid elevation interpolation from Z-valued points)
Legacy Mosaic geometry conversion (migrate without installing Mosaic)
OGR-based reader data sources (Shapefile, GeoJSON, GeoPackage, FileGDB)

VectorX Function Reference →

VizX

Tier-agnostic visualization helpers for inspecting GeoBrix outputs in a notebook. Python-only, with no SQL functions.

Raster plotting (plot_raster / plot_file) with auto-decimation and percentile stretch, including coverage-depth and mask-layer composites for multi-band presence stacks
Spark DataFrame → GeoPandas adapters (as_gdf / cells_as_gdf / grid_as_gdf) for .plot() / .explore() maps
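The percentile stretch mentioned above is a common contrast-enhancement step for raster display. The following is a generic sketch of the idea in plain Python, not the code behind `plot_raster`: the function names and the 2% / 98% defaults are assumptions chosen for illustration.

```python
def percentile(sorted_vals, pct):
    """Percentile of an already-sorted list, with linear interpolation."""
    if not sorted_vals:
        raise ValueError("percentile of empty sequence")
    k = (len(sorted_vals) - 1) * pct / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def percentile_stretch(values, low=2.0, high=98.0):
    """Rescale values to [0, 1], clipping everything outside the percentiles.

    The 2 / 98 defaults are illustrative assumptions, not GeoBrix defaults.
    """
    ordered = sorted(values)
    lo = percentile(ordered, low)
    hi = percentile(ordered, high)
    if hi == lo:  # flat input: avoid division by zero
        return [0.0 for _ in values]
    return [min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in values]


stretched = percentile_stretch(list(range(101)))
print(stretched[0], stretched[50], stretched[100])
```

Clipping to percentiles rather than to the raw minimum and maximum keeps a handful of outlier pixels from flattening the contrast of the rest of the image.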

VizX Reference →

PMTiles

Container format for serving raster (PNG / JPEG / WebP) or vector (MVT) tile pyramids from a single static file via HTTP range requests. Native Scala v3 encoder — no GDAL/OGR dependency.

gbx_pmtiles_agg UDAF — aggregator returning a BINARY PMTile blob; fits tilesets up to ~100 MiB tile payload / 2 GiB cell limit
.write.format("pmtiles").save(path) DataSource — streams larger pyramids via a partitioned commit protocol
Auto-detects tile_type from magic bytes (PNG / JPEG / WebP / otherwise MVT)
Composes with gbx_rst_xyzpyramid (raster) and gbx_st_asmvt_pyramid (vector) upstream
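Detecting the tile type from magic bytes can be sketched as follows. This is an illustration based on the well-known file signatures for PNG, JPEG, and WebP, with everything else falling through to MVT as described above. The function name `detect_tile_type` and its return strings are hypothetical and do not reflect the GeoBrix encoder's internals.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # 8-byte PNG signature
JPEG_MAGIC = b"\xff\xd8\xff"      # JPEG start-of-image marker


def detect_tile_type(payload: bytes) -> str:
    """Classify a tile payload by its leading bytes (illustrative only)."""
    if payload.startswith(PNG_MAGIC):
        return "png"
    if payload.startswith(JPEG_MAGIC):
        return "jpeg"
    # WebP is a RIFF container: 'RIFF', 4-byte size, then 'WEBP'.
    if payload[:4] == b"RIFF" and payload[8:12] == b"WEBP":
        return "webp"
    # Anything else is assumed to be a Mapbox Vector Tile.
    return "mvt"


print(detect_tile_type(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16))
```

Because MVT is the fallback rather than a positively identified format, a corrupt raster tile would be classified as vector under this scheme.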

Package Comparison

| Feature | RasterX | GridX | VectorX | PMTiles | VizX |
| --- | --- | --- | --- | --- | --- |
| Primary Use | Raster processing | Discrete global grids | Vector encoding + legacy | Tile pyramid packaging | Notebook visualization |
| Product Gap | Full gap-filling | Specialized grids (BNG, quadbin) | Vector-tile encoding, legacy migration | Net-new | Supporting layer |
| GDAL Required | Yes | No | Yes (readers + MVT) | No | No |
| Output Format | Tile (struct) + arrays | Cell IDs (Long / String) + WKB | BINARY (MVT bytes), WKB | BINARY (PMTile blob) or file | Matplotlib figure / GeoDataFrame |
| Spark Surface | 65+ SQL functions | 30+ SQL functions | 6+ SQL functions + DataSources | 1 UDAF + 1 DataSource | Python only (no SQL) |

Choosing the Right Package

Use RasterX when: working with satellite imagery, DEMs, or aerial photography; performing terrain analysis, spectral indices, or per-pixel transforms; aggregating raster pixels to H3 or quadbin cells; bridging vector geometries to/from rasters; generating web-mercator XYZ tiles.

Use GridX when: working with British National Grid data; indexing global data into web-mercator-aligned quadbin cells; needing cell math (area, k-ring, polyfill, tessellation); building grid-aware aggregations or join keys.

Use VectorX when: encoding features as Mapbox Vector Tiles; generating per-tile MVT layers; building TIN surfaces / interpolating elevation from Z-valued points; reading vector formats (Shapefile, GeoJSON, GeoPackage, FileGDB); migrating from DBLabs Mosaic.

Use PMTiles when: publishing a tile pyramid (raster or vector) as a single static file; serving from S3/ABFS/GCS without a tile server; aggregating (z, x, y, bytes) rows into a deployable map.

Use VizX when: inspecting GeoBrix raster tiles or files in a notebook (single-band, RGB, coverage-depth, or mask-layer composites); turning Spark geometry / H3-cell / gridspec DataFrames into GeoPandas for .plot() / .explore() maps. VizX is a Python-only supporting layer for visualization, not a spatial-processing package.

Function Naming Convention

All GeoBrix SQL functions use the gbx_ prefix:

| Package | Prefix | Example |
| --- | --- | --- |
| RasterX | `gbx_rst_` | `gbx_rst_boundingbox` |
| GridX/BNG | `gbx_bng_` | `gbx_bng_cellarea` |
| GridX/Quadbin | `gbx_quadbin_` | `gbx_quadbin_pointascell` |
| VectorX | `gbx_st_` | `gbx_st_asmvt` |
| PMTiles | `gbx_pmtiles_` | `gbx_pmtiles_agg` |

The gbx_ prefix distinguishes GeoBrix functions from Databricks built-in st_* functions.
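The prefix convention makes it easy to tell which package a SQL function belongs to. As a small illustration, the sketch below maps a function name back to its package using only the prefixes from the table. The helper `package_for` is hypothetical and not part of GeoBrix.

```python
# Prefixes taken from the naming-convention table above.
PREFIX_TO_PACKAGE = {
    "gbx_rst_": "RasterX",
    "gbx_bng_": "GridX/BNG",
    "gbx_quadbin_": "GridX/Quadbin",
    "gbx_st_": "VectorX",
    "gbx_pmtiles_": "PMTiles",
}


def package_for(function_name):
    """Return the GeoBrix package for a SQL function name, or None."""
    name = function_name.lower()
    for prefix, package in PREFIX_TO_PACKAGE.items():
        if name.startswith(prefix):
            return package
    return None  # e.g. a Databricks built-in such as st_area


print(package_for("gbx_rst_boundingbox"))
print(package_for("st_area"))
```

A helper like this could be used, for example, to group the output of `SHOW FUNCTIONS LIKE 'gbx_*'` by package.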

Registration

Before using GeoBrix functions in Python or SQL, register them with the Spark session.

Register all packages

# Register RasterX, GridX (BNG), and VectorX
from databricks.labs.gbx.rasterx import functions as rx
from databricks.labs.gbx.gridx.bng import functions as bx
from databricks.labs.gbx.vectorx.jts.legacy import functions as vx

rx.register(spark)
bx.register(spark)
vx.register(spark)

Register selectively

GeoBrix raster functions come in two execution tiers; this example uses the heavyweight tier. See Choosing an Execution Tier.

# Only register RasterX
from databricks.labs.gbx.rasterx import functions as rx
rx.register(spark)

Scala

Register All Packages
import com.databricks.labs.gbx.rasterx.{functions => rx}
import com.databricks.labs.gbx.gridx.bng.{functions => bx}
import com.databricks.labs.gbx.vectorx.jts.legacy.{functions => vx}

// Register each package
rx.register(spark)
bx.register(spark)
vx.register(spark)

Example output
RasterX, GridX, and VectorX functions registered (gbx_rst_*, gbx_bng_*, gbx_st_*).

SQL

SQL functions are registered via Python or Scala. Once registered, they are available in any SQL context:

-- There is no separate SQL registration step:
-- functions become available once registered from Python or Scala

SHOW FUNCTIONS LIKE 'gbx_*';

Example output
+--------------------+
|function            |
+--------------------+
|gbx_rst_asformat    |
|gbx_rst_avg         |
|gbx_rst_bandmetadata|
+--------------------+

API Languages

GeoBrix provides the same functionality across three languages — SQL, Python, and Scala. See Language Bindings for the registration and import patterns each surface uses, and the one-line lightweight-vs-heavyweight raster swap.

Scalar values vs `lit(...)` wrapping

In 0.3.0, plain scalars are accepted across Python, Scala, and SQL bindings — no f.lit(...) wrapping required for non-string values.

Python — wrappers accept Column or scalar (bool/int/float/bytes); non-string scalars are auto-wrapped with f.lit(...). Strings still follow pyspark's column-reference convention (bare string ≈ f.col(name)); wrap in f.lit("...") to pass a string literal.

# ✅ Before 0.3.0 — required f.lit for every value
rx.rst_clip("tile", "geom", f.lit(True))
rx.rst_transform("tile", f.lit(4326))
bx.bng_pointascell("pt", f.lit(1))
bx.bng_pointascell("pt", f.lit("1km"))

# ✅ 0.3.0 — scalars accepted directly
rx.rst_clip("tile", "geom", True)
rx.rst_transform("tile", 4326)
bx.bng_pointascell("pt", 1)
bx.bng_pointascell("pt", f.lit("1km"))   # string literal — still wrap in f.lit

Scala — typed overloads added for Boolean / Int / Double / String value parameters. Column args (e.g. geometry, tile) still take Column.

// ✅ 0.3.0 — scalar overloads resolve without lit(...)
rst_clip(col("tile"), col("geom"), cutlineAllTouched = true)
rst_transform(col("tile"), 4326)
bng_pointascell(col("pt"), 1)
bng_pointascell(col("pt"), "1km")

SQL — values are already natively accepted by Spark SQL; no change needed:

SELECT gbx_rst_clip(tile, geom, true) FROM ...;
SELECT gbx_bng_pointascell(pt, 1) FROM ...;
SELECT gbx_bng_pointascell(pt, '1km') FROM ...;

When you still need f.lit(...) in Python:

String literals: rx.rst_fromfile(f.lit("/path/to.tif"), f.lit("GTiff")) — a bare string is treated as a column reference.
Nulls / explicit typing: e.g. f.lit(None).cast("double").
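The Python argument rule described above (bare string means column reference, non-string scalar means literal) can be modeled in a few lines. This sketch uses stand-in classes rather than real pyspark objects so that it runs anywhere. `ColumnRef`, `Literal`, `lit`, and `normalize_arg` are all hypothetical names and do not show how the GeoBrix wrappers are actually written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRef:
    """Stand-in for f.col(name)."""
    name: str


@dataclass(frozen=True)
class Literal:
    """Stand-in for f.lit(value)."""
    value: object


def lit(value):
    """Stand-in for pyspark's f.lit."""
    return Literal(value)


def normalize_arg(arg):
    """Apply the 0.3.0 convention to one wrapper argument."""
    if isinstance(arg, (ColumnRef, Literal)):
        return arg                # already a column expression
    if isinstance(arg, str):
        return ColumnRef(arg)     # bare string is a column name
    if isinstance(arg, (bool, int, float, bytes)):
        return Literal(arg)       # non-string scalar is auto-wrapped
    raise TypeError(f"unsupported argument type: {type(arg).__name__}")


print(normalize_arg("tile"))      # treated as a column reference
print(normalize_arg(4326))        # treated as a literal
print(normalize_arg(lit("1km")))  # explicit string literal passes through
```

The model shows why `"1km"` still needs explicit wrapping: without it, the string would be read as the name of a column.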
