
RasterX Function Reference

RasterX functions run in two execution tiers: lightweight (pure-Python pyrx) and heavyweight (rasterx). As of v0.4.0, every RasterX function is available in both tiers, and each function below carries both tier badges; per-function notes call out where the lightweight implementation differs. See Choosing an Execution Tier for the comparison and the lightweight install.

Complete reference for all RasterX functions with detailed descriptions, parameters, return values, and examples.

Overview

RasterX is GeoBrix's raster data processing package, providing comprehensive tools for working with raster datasets such as satellite imagery, elevation models, and other gridded spatial data. It refactors and improves on the Mosaic raster functions, and v0.4.0 extends it with terrain analysis, spectral indices, vector-raster bridging, web-mercator tile output, and quadbin grid aggregations. Because Databricks does not (yet) offer built-in raster processing, RasterX fills that gap on the Databricks platform.

Key Features

  • GDAL-Powered: Leverages GDAL for robust raster format support
  • Distributed Processing: Built on Spark for scalable raster operations
  • Multiple Format Support: GeoTIFF, COG, NetCDF, and other GDAL-supported formats
  • Metadata Extraction: Comprehensive raster metadata access
  • Raster Operations: Clipping, resampling, transformations, map algebra
  • Band Operations: Multi-band raster support, single-band extraction
  • Terrain Analysis: Slope, aspect, hillshade, TRI, TPI, roughness, color-relief
  • Spectral Indices: EVI, SAVI, NDWI, NBR, NDVI, plus a generic dispatcher
  • Vector-Raster Bridge: Rasterize geometries, polygonize value regions
  • Tile Publishing: Web-mercator XYZ tile generation (PNG / JPEG / WebP)
  • Grid Aggregations: H3 and CARTO quadbin v0 cell aggregations

Function Categories

RasterX exposes 87+ SQL functions (registered as gbx_rst_*; available in Python and Scala as rst_*), organized into the following categories (see rasterx/functions.scala):

[Figure: RasterX function categories: Constructors, Accessors, Aggregators, Generators, Operations, H3 Grid]

  • Accessor Functions: Read raster properties and metadata (bounds, dimensions, CRS, bands, pixel size, georeference, format, type, NoData, subdatasets, summary, etc.)
  • Aggregator Functions: Combine or merge rasters in group-by (combineavg_agg, derivedband_agg, dtmfromgeoms_agg, frombands_agg, merge_agg, rasterize_agg, gridfrompoints_agg)
  • Constructor Functions: Create or load rasters from paths, binary content, or bands
  • Generator Functions: Produce multiple tiles or bands (h3_tessellate, maketiles, retile, separatebands, tooverlappingtiles)
  • Grid Functions (H3): Aggregate raster values to H3 cells (rastertogrid avg/count/max/min/median)
  • Grid Functions (quadbin): Aggregate raster values to CARTO quadbin v0 cells (rastertogrid avg/count/max/min/median)
  • Operations: Transform and analyze rasters (clip, transform, merge, asformat, ndvi, filter, convolve, map algebra, coordinate conversion, isEmpty, tryOpen, initNoData, updateType, combineavg, derivedband)
  • Web-Mercator Tile Output: Reproject to EPSG:3857 and emit slippy-map XYZ tiles (to_webmercator, tilexyz, xyzpyramid)
  • Vector-Raster Bridge: Burn polygons into rasters and trace contiguous regions back to polygons (rasterize, polygonize)
  • Terrain Analysis: DEM-derived surfaces from gdal.DEMProcessing (slope, aspect, hillshade, TRI, TPI, roughness, color relief)
  • Spectral Indices: Multi-band satellite math (EVI, SAVI, NDWI, NBR, plus the generic rst_index dispatcher)

Tile payload

Every RasterX function that returns a tile produces one whose raster field is a self-contained, in-memory raster (GTiff by default). Such a tile is safe to serialize between Spark stages and executors, persist to Delta, hand off to rasterio / gdal, or write back out via the gdal writer. The bytes are never an XML reference to a per-executor /vsimem/ tempfile or to a path that only exists on the producing node.

Functions that internally build via an intermediate VRT — gbx_rst_merge, gbx_rst_merge_agg, gbx_rst_frombands, gbx_rst_combineavg, gbx_rst_combineavg_agg, gbx_rst_derivedband, gbx_rst_derivedband_agg — materialize the result to GTiff before returning, so downstream stages on different executors see real raster bytes. Inspect a tile's payload format from tile.metadata.driver; for any of the functions above, it will read GTiff (not VRT). See Beta Release Notes for the v0.3.0 correctness fix that introduced this invariant. See Tile structure for the full tile-struct schema.

Setup

Pick your execution tier and run this once. Both tiers alias the module as rx, so every example below is identical regardless of tier — only this setup differs. (See Choosing an Execution Tier for the comparison.)

Lightweight setup (pyrx)
from pyspark.sql import functions as f
from databricks.labs.gbx.pyrx import functions as rx

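# _make_geotiff_bytes is a helper that is not defined on this page; it is assumed to build a
# 4 x 3, 2-band float32 GTiff in memory (origin 10.0, 50.0; 0.5 px; EPSG:4326) and return its bytes.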
raster_bytes = _make_geotiff_bytes(width=4, height=3, count=2, epsg=4326)

df = spark.createDataFrame([(raster_bytes,)], ["raster"])
tile_df = df.select(rx.rst_fromcontent("raster", f.lit("GTiff")).alias("tile"))
tile_df.createOrReplaceTempView("rasters")
Example output
One-row DataFrame with a tile column (struct<cellid, raster, metadata>).
Temp view `rasters` available for SQL examples.

Usage Examples

Python/PySpark

These examples assume your tier is set up as in Setup above: the module imported as rx, the functions registered (for the SQL examples), and a raster DataFrame with a tile column loaded for your tier. The snippet below calls that DataFrame raster_df; the lightweight Setup snippet names it tile_df. The calls are identical in both tiers:

# Read raster properties off the `tile` column:
metadata_df = raster_df.select(
    rx.rst_width("tile").alias("width"),
    rx.rst_height("tile").alias("height"),
    rx.rst_numbands("tile").alias("bands"),
    rx.rst_srid("tile").alias("srid"),
)
metadata_df.show()

Scala

import com.databricks.labs.gbx.rasterx.{functions => rx}
import org.apache.spark.sql.functions._

// Register functions
rx.register(spark)

// Read raster files (sample data path; see Sample Data guide)
val rasterPath = "/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/sentinel2/nyc_sentinel2_red.tif"
val rasterDf = spark.read.format("gdal").load(rasterPath)

// Get metadata
val metadataDf = rasterDf.select(
  col("path"),
  rx.rst_width(col("tile")).alias("width"),
  rx.rst_height(col("tile")).alias("height"),
  rx.rst_numbands(col("tile")).alias("num_bands")
)

metadataDf.show()
Example output
+--------------------+-----+------+----------+
|path                |width|height|num_bands |
+--------------------+-----+------+----------+
|.../nyc_sentinel2...|10980|10980 |1         |
+--------------------+-----+------+----------+

SQL

-- Register functions first in Python/Scala notebook
-- Then use in SQL

-- Read raster data (sample data path; see Sample Data guide)
CREATE OR REPLACE TEMP VIEW rasters AS
SELECT * FROM gdal.`{SAMPLE_RASTER_PATH}`;

-- Extract metadata
SELECT
path,
gbx_rst_width(tile) as width,
gbx_rst_height(tile) as height,
gbx_rst_numbands(tile) as num_bands,
gbx_rst_srid(tile) as srid
FROM rasters;
Example output
+--------------------+-----+------+----------+----+
|path                |width|height|num_bands |srid|
+--------------------+-----+------+----------+----+
|.../nyc_sentinel2...|10980|10980 |1         |4326|
+--------------------+-----+------+----------+----+

SQL examples

Examples on this page use SQL, where RasterX functions are prefixed with gbx_ (e.g. gbx_rst_boundingbox, gbx_rst_width). For Python and Scala usage and more tips, see Language Bindings. In the lightweight tier, the registered gbx_rst_* SQL functions require every argument to be passed explicitly; optional defaults are honored only through the Python rx.* API.

Tier availability

As of v0.4.0, all RasterX functions run in both execution tiers: the lightweight pyrx (pure-Python) and heavyweight rasterx tiers share the same rst_* / gbx_rst_* names, and each function below carries a "Lightweight tier (pyrx)" note with its backing library and any behavioral differences. For the heavyweight VRT Python pixel-function configuration (used by gbx_rst_combineavg / gbx_rst_derivedband), see VRT Python pixel functions at the end of this page.

Accessor Functions

Functions to read raster properties and metadata (29 total).

rst_avg

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Per-band mean over valid (non-NoData) pixels.

Signature: rst_avg(tile: Column): Column — Per-band average pixel values.

SQL:

-- Get average values
SELECT
path,
gbx_rst_avg(tile) as band_averages,
gbx_rst_avg(tile)[0] as band1_avg
FROM rasters;

-- Filter by average threshold
SELECT * FROM rasters
WHERE gbx_rst_avg(tile)[0] > 50.0;
Example output
+----+-------------+---------+
|path|band_averages|band1_avg|
+----+-------------+---------+
|... |[0.42]       |0.42     |
+----+-------------+---------+
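
To make "per-band mean over valid (non-NoData) pixels" concrete, here is a minimal pure-Python sketch of that calculation on nested lists. The band_means helper is hypothetical and is not part of RasterX; treating NaN as invalid and returning None for a band with no valid pixels are assumptions of this sketch, not documented library behavior.

```python
import math


def band_means(bands, nodata):
    """Mean of each band over pixels that are neither NoData nor NaN."""
    means = []
    for band in bands:
        valid = [v for row in band for v in row
                 if v != nodata and not math.isnan(v)]
        # Assumption: a band with no valid pixels yields None.
        means.append(sum(valid) / len(valid) if valid else None)
    return means


# Two 2 x 2 bands; -9999.0 marks NoData. The second band is entirely NoData.
bands = [
    [[1.0, 2.0], [-9999.0, 3.0]],
    [[-9999.0, -9999.0], [-9999.0, -9999.0]],
]
result = band_means(bands, nodata=-9999.0)
```

Here the first band averages only its three valid pixels, so one NoData cell does not drag the mean toward -9999.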

rst_bandmetadata

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_bandmetadata(tile: Column, band: Column): Column — Band metadata map.

SQL:

SELECT gbx_rst_bandmetadata(tile, 1) as band1_metadata FROM rasters;
Example output
+--------------+
|band1_metadata|
+--------------+
|{...}         |
+--------------+

rst_boundingbox

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_boundingbox(tile: Column): Column — Bounding box geometry.

SQL:

SELECT path, gbx_rst_boundingbox(tile) as bbox FROM rasters;
Example output
+--------------------+-----------------+
|path                |bbox             |
+--------------------+-----------------+
|.../nyc_sentinel2...|POLYGON ((-74....|
+--------------------+-----------------+

rst_format

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_format(tile: Column): Column — GDAL format name.

SQL:

-- Identify formats
SELECT
gbx_rst_format(tile) as format,
COUNT(*) as count
FROM rasters
GROUP BY gbx_rst_format(tile);

-- Find non-GeoTIFF files
SELECT path, gbx_rst_format(tile) as format
FROM rasters
WHERE gbx_rst_format(tile) != 'GTiff';
Example output
+------+-----+
|format|count|
+------+-----+
|GTiff |10   |
+------+-----+

rst_georeference

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_georeference(tile: Column): Column — Georeference parameters as a map.

The result is a MapType with the following keys, corresponding to GDAL's 6-element geotransform:

Key        | Geotransform index | Meaning
-----------|--------------------|--------
upperLeftX | GT(0)              | X of the upper-left corner of the upper-left pixel
upperLeftY | GT(3)              | Y of the upper-left corner of the upper-left pixel
scaleX     | GT(1)              | Pixel width (west–east resolution)
scaleY     | GT(5)              | Pixel height (north–south resolution; often negative for north-up)
skewX      | GT(2)              | Row rotation (typically 0)
skewY      | GT(4)              | Column rotation (typically 0)

See the GDAL geotransform tutorial and raster data model for details.
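
As a rough illustration of how the six keys combine, the sketch below maps a pixel position to CRS coordinates with the standard GDAL affine formula. The pixel_to_world helper and the sample values are hypothetical (they are not part of RasterX); the dictionary simply mirrors the key names listed above.

```python
def pixel_to_world(gt, col, row):
    """Map a (col, row) pixel position to CRS coordinates.

    Follows GDAL's affine convention:
      x = GT(0) + col * GT(1) + row * GT(2)
      y = GT(3) + col * GT(4) + row * GT(5)
    """
    x = gt["upperLeftX"] + col * gt["scaleX"] + row * gt["skewX"]
    y = gt["upperLeftY"] + col * gt["skewY"] + row * gt["scaleY"]
    return (x, y)


# Hypothetical north-up raster: origin (10.0, 50.0), 0.5-unit pixels, 4 x 3 pixels.
georef = {
    "upperLeftX": 10.0, "upperLeftY": 50.0,
    "scaleX": 0.5, "scaleY": -0.5,
    "skewX": 0.0, "skewY": 0.0,
}

upper_left = pixel_to_world(georef, 0, 0)           # corner of the first pixel
lower_right = pixel_to_world(georef, 4, 3)          # far corner of the last pixel
first_center = pixel_to_world(georef, 0.5, 0.5)     # center of the first pixel
```

Because scaleY is negative for a north-up raster, increasing the row index moves south, which is why the lower-right corner has a smaller Y than the upper-left.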

SQL:

SELECT gbx_rst_georeference(tile) as georeference FROM rasters;
Example output
+------------+
|georeference|
+------------+
|[ ... ]     |
+------------+

rst_getnodata

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio. Returns the dataset NoData value repeated once per band.

Signature: rst_getnodata(tile: Column): Column — NoData values per band.

SQL:

SELECT
path,
gbx_rst_getnodata(tile) as nodata_values,
gbx_rst_getnodata(tile)[0] as band1_nodata
FROM rasters;
Example output
+----+-------------+------------+
|path|nodata_values|band1_nodata|
+----+-------------+------------+
|... |[-9999.0]    |-9999.0     |
+----+-------------+------------+

rst_getsubdataset

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio. Subdataset availability depends on rasterio's bundled GDAL driver set.

Signature: rst_getsubdataset(tile: Column, subsetName: Column): Column — Extract subdataset.

SQL:

SELECT
path,
gbx_rst_getsubdataset(tile, 'temperature') as temp_layer
FROM netcdf_files;
Example output
+----+----------------------------------------------+
|path|temp_layer                                    |
+----+----------------------------------------------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|
+----+----------------------------------------------+

rst_height

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_height(tile: Column): Column — Height in pixels.

SQL:

SELECT gbx_rst_height(tile) as height, gbx_rst_width(tile) as width FROM rasters;
Example output
+------+-----+
|height|width|
+------+-----+
|10980 |10980|
+------+-----+

rst_max

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Per-band maximum over valid (non-NoData) pixels.

Signature: rst_max(tile: Column): Column — Maximum pixel values per band.

SQL:

SELECT path, gbx_rst_max(tile) as max_per_band, gbx_rst_max(tile)[0] as band1_max FROM rasters;
Example output
+----+------------+---------+
|path|max_per_band|band1_max|
+----+------------+---------+
|... |[255.0]     |255.0    |
+----+------------+---------+

rst_median

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Per-band median over valid (non-NoData) pixels.

Signature: rst_median(tile: Column): Column — Median pixel values per band.

SQL:

SELECT
path,
gbx_rst_avg(tile)[0] as mean_value,
gbx_rst_median(tile)[0] as median_value,
-- |mean - median| is only a rough asymmetry indicator, not the statistical skewness coefficient
ABS(gbx_rst_avg(tile)[0] - gbx_rst_median(tile)[0]) as skewness
FROM rasters;
Example output
+----+----------+------------+--------+
|path|mean_value|median_value|skewness|
+----+----------+------------+--------+
|... |0.45      |0.42        |0.03    |
+----+----------+------------+--------+
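
The mean/median comparison in the SQL above can be checked on plain numbers. The sketch below uses only the Python standard library and made-up values; the helper names are hypothetical. It also shows Pearson's second skewness coefficient, 3 * (mean - median) / standard deviation, as one way to normalize the gap. That normalization is an addition for illustration, not something RasterX computes.

```python
from statistics import mean, median, pstdev


def mean_median_gap(values):
    """Absolute difference between mean and median."""
    return abs(mean(values) - median(values))


def pearson_second_skewness(values):
    """Pearson's second skewness coefficient: 3 * (mean - median) / std dev."""
    sd = pstdev(values)
    return 3 * (mean(values) - median(values)) / sd if sd else 0.0


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]       # mean 3.0, median 3.0
right_tailed = [1.0, 1.0, 2.0, 2.0, 14.0]   # mean 4.0, median 2.0

gap_symmetric = mean_median_gap(symmetric)
gap_right_tailed = mean_median_gap(right_tailed)
skew_right_tailed = pearson_second_skewness(right_tailed)
```

A single large value pulls the mean up while the median stays put, so a large gap on a raster band suggests a long tail (for example a few very bright pixels) rather than a uniform shift.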

rst_memsize

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio. Returns the serialized raster size in bytes.

Signature: rst_memsize(tile: Column): Column — In-memory size in bytes.

SQL:

SELECT path, gbx_rst_memsize(tile) as size_bytes FROM rasters;
Example output
+----+----------+
|path|size_bytes|
+----+----------+
|... |120560400 |
+----+----------+

rst_metadata

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_metadata(tile: Column): Column — Metadata map.

SQL:

SELECT gbx_rst_metadata(tile) as metadata FROM rasters;
Example output
+--------+
|metadata|
+--------+
|{...}   |
+--------+

rst_min

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Per-band minimum over valid (non-NoData) pixels.

Signature: rst_min(tile: Column): Column — Minimum pixel values per band.

SQL:

SELECT path, gbx_rst_min(tile) as min_per_band, gbx_rst_min(tile)[0] as band1_min FROM rasters;
Example output
+----+------------+---------+
|path|min_per_band|band1_min|
+----+------------+---------+
|... |[0.0]       |0.0      |
+----+------------+---------+

rst_numbands

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_numbands(tile: Column): Column — Number of bands.

SQL:

SELECT gbx_rst_numbands(tile) as bands FROM rasters;
Example output
+-----+
|bands|
+-----+
|1    |
+-----+

rst_pixelcount

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Per-band count of valid (non-NoData) pixels.

Signature: rst_pixelcount(tile: Column): Column — Total pixel count.

SQL:

SELECT gbx_rst_pixelcount(tile) as pixel_count FROM rasters;
Example output
+-----------+
|pixel_count|
+-----------+
|120560400  |
+-----------+

rst_pixelheight

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_pixelheight(tile: Column): Column — Pixel height in ground units.

SQL:

SELECT
path,
gbx_rst_pixelwidth(tile) as pixel_width,
gbx_rst_pixelheight(tile) as pixel_height,
gbx_rst_width(tile) * gbx_rst_pixelwidth(tile) as total_width_m
FROM rasters;
Example output
+----+-----------+------------+-------------+
|path|pixel_width|pixel_height|total_width_m|
+----+-----------+------------+-------------+
|... |30.0       |-30.0       |329400.0     |
+----+-----------+------------+-------------+

rst_pixelwidth

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_pixelwidth(tile: Column): Column — Pixel width in ground units.

SQL:

SELECT
path,
gbx_rst_pixelwidth(tile) as pixel_width,
gbx_rst_pixelheight(tile) as pixel_height,
gbx_rst_width(tile) * gbx_rst_pixelwidth(tile) as total_width_m
FROM rasters;
Example output
+----+-----------+------------+-------------+
|path|pixel_width|pixel_height|total_width_m|
+----+-----------+------------+-------------+
|... |30.0       |-30.0       |329400.0     |
+----+-----------+------------+-------------+

rst_rotation

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_rotation(tile: Column): Column — Rotation in radians.

SQL:

SELECT path, gbx_rst_rotation(tile) as rotation_rad FROM rasters;
Example output
+----+------------+
|path|rotation_rad|
+----+------------+
|... |0.0         |
+----+------------+

rst_scalex / rst_scaley

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_scalex(tile: Column): Column, rst_scaley(tile: Column): Column — Scale (pixel size) in X/Y.

SQL:

SELECT
path,
gbx_rst_scalex(tile) as scale_x,
gbx_rst_scaley(tile) as scale_y
FROM rasters;
Example output
+----+-------+-------+
|path|scale_x|scale_y|
+----+-------+-------+
|... |30.0   |-30.0  |
+----+-------+-------+

rst_skewx / rst_skewy

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_skewx(tile: Column): Column, rst_skewy(tile: Column): Column — Skew in X/Y.

SQL:

SELECT
path,
gbx_rst_skewx(tile) as skew_x,
gbx_rst_skewy(tile) as skew_y
FROM rasters;
Example output
+----+------+------+
|path|skew_x|skew_y|
+----+------+------+
|... |0.0   |0.0   |
+----+------+------+

rst_srid

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_srid(tile: Column): Column — Spatial reference ID (e.g. EPSG).

SQL:

SELECT gbx_rst_srid(tile) as srid FROM rasters;
Example output
+-----+
|srid |
+-----+
|32618|
+-----+

rst_subdatasets

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio. Empty for single-dataset rasters (e.g. a plain GeoTIFF).

Signature: rst_subdatasets(tile: Column): Column — List of subdataset names.

SQL:

SELECT path, gbx_rst_subdatasets(tile) as subdatasets FROM netcdf_rasters;
Example output
+----+-------------------+
|path|subdatasets        |
+----+-------------------+
|... |[temp, precip, ...]|
+----+-------------------+

rst_summary

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Returns a JSON summary (driver, size, CRS, geotransform, per-band statistics); the lightweight JSON shape differs from the heavyweight gdalinfo -json output.

Signature: rst_summary(tile: Column): Column — Statistical summary of values.

SQL:

SELECT path, gbx_rst_summary(tile) as summary FROM rasters;
Example output
+----+-------+
|path|summary|
+----+-------+
|... |{...}  |
+----+-------+

rst_type

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_type(tile: Column): Column — Data type per band.

SQL:

-- Get data types
SELECT
path,
gbx_rst_type(tile) as band_types,
gbx_rst_type(tile)[0] as band1_type
FROM rasters;

-- Group by data type
SELECT
gbx_rst_type(tile)[0] as data_type,
COUNT(*) as count
FROM rasters
GROUP BY gbx_rst_type(tile)[0];
Example output
+----+----------+----------+
|path|band_types|band1_type|
+----+----------+----------+
|... |[Byte]    |Byte      |
+----+----------+----------+

rst_upperleftx / rst_upperlefty

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_upperleftx(tile: Column): Column, rst_upperlefty(tile: Column): Column — Upper-left corner coordinates.

SQL:

SELECT
path,
gbx_rst_upperleftx(tile) as upper_left_x,
gbx_rst_upperlefty(tile) as upper_left_y
FROM rasters;
Example output
+----+------------+------------+
|path|upper_left_x|upper_left_y|
+----+------------+------------+
|... |500000.0    |200000.0    |
+----+------------+------------+

rst_width

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_width(tile: Column): Column — Width in pixels.

SQL:

SELECT gbx_rst_width(tile) as width FROM rasters;
Example output
+-----+
|width|
+-----+
|10980|
+-----+

Aggregator Functions

Combine or merge rasters in group-by (7 total).

rst_combineavg_agg

Lightweight | Heavyweight | Grouped-agg UDF
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Aggregate — groupBy(...).agg(rx.rst_combineavg_agg("tile")) returns the NoData-aware per-pixel mean as one tile per group; input tiles must share the same grid (shape/extent/CRS).

Lightweight SQL returns BINARY (not the tile struct)

Heavyweight gbx_rst_combineavg_agg returns a tile STRUCT<cellid, raster, metadata>; the lightweight SQL function returns BINARY (the raster bytes). A PySpark grouped-aggregate pandas_udf cannot return a StructType, so the lightweight SQL aggregate returns the raster payload as BINARY. The lightweight Python wrapper rx.rst_combineavg_agg(...) returns the full tile struct (it composes the aggregate with a tile-wrapping step), so only raw SQL differs. To rebuild the tile-struct equivalent in SQL, select the group key as cellid and wrap the BINARY with gbx_rst_fromcontent:

-- Lightweight SQL: rebuild the (cellid, raster) the heavyweight struct would carry
SELECT
group_key AS cellid,
gbx_rst_fromcontent(gbx_rst_combineavg_agg(tile), 'GTiff') AS tile
FROM tiles
GROUP BY group_key

Signature: rst_combineavg_agg(tile: Column): Column — Average tiles per group.

SQL:

-- Group by region and average
SELECT
region,
gbx_rst_combineavg_agg(tile) as regional_average
FROM rasters
GROUP BY region;
Example output
+------+----------------------------------------------+
|region|regional_average                              |
+------+----------------------------------------------+
|...   |{null, <raster bytes>, {driver -> GTiff, ...}}|
+------+----------------------------------------------+
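
The NoData-aware per-pixel mean described in the lightweight note can be sketched in pure Python on small grids. This is an illustration of the idea only: combine_avg is a hypothetical helper, the grids stand in for single-band tiles that already share the same shape, and writing NoData where no input has a valid value is an assumption of the sketch.

```python
def combine_avg(tiles, nodata):
    """Per-pixel mean across same-shaped single-band grids, skipping NoData."""
    rows, cols = len(tiles[0]), len(tiles[0][0])
    if any(len(t) != rows or any(len(r) != cols for r in t) for t in tiles):
        raise ValueError("all tiles must share the same grid shape")
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            valid = [t[r][c] for t in tiles if t[r][c] != nodata]
            # Assumption: a pixel with no valid input stays NoData.
            out_row.append(sum(valid) / len(valid) if valid else nodata)
        out.append(out_row)
    return out


a = [[1.0, -9999.0], [4.0, -9999.0]]
b = [[3.0, 5.0], [-9999.0, -9999.0]]
merged = combine_avg([a, b], nodata=-9999.0)
```

Each output pixel averages only the inputs that are valid at that location, so a pixel covered by one valid tile keeps that tile's value unchanged.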

rst_derivedband_agg

Lightweight | Heavyweight | Grouped-agg UDF
Lightweight tier (pyrx)

Powered by rasterio with GDAL VRT Python pixel functions. Aggregate — groupBy(...).agg(rx.rst_derivedband_agg("tile", pyfunc, funcName)) stacks the group's tiles as bands and applies your pixel function, returning one tile per group.

Lightweight SQL returns BINARY (not the tile struct)

Heavyweight gbx_rst_derivedband_agg returns a tile STRUCT<cellid, raster, metadata>; the lightweight SQL function returns BINARY (the raster bytes). A PySpark grouped-aggregate pandas_udf cannot return a StructType, so the lightweight SQL aggregate returns the raster payload as BINARY. The lightweight Python wrapper rx.rst_derivedband_agg(...) returns the full tile struct (it composes the aggregate with a tile-wrapping step), so only raw SQL differs. To rebuild the tile-struct equivalent in SQL, select the group key as cellid and wrap the BINARY with gbx_rst_fromcontent:

-- Lightweight SQL: rebuild the (cellid, raster) the heavyweight struct would carry
SELECT
group_key AS cellid,
gbx_rst_fromcontent(gbx_rst_derivedband_agg(tile, 'def f(a,b): return a+b', 'f'), 'GTiff') AS tile
FROM tiles
GROUP BY group_key

Signature: rst_derivedband_agg(tile: Column, pyfunc: String, funcName: String): Column — Apply Python UDF to tiles per group.

SQL:

SELECT region, gbx_rst_derivedband_agg(tile, 'def f(a): return a', 'f') as result FROM rasters GROUP BY region;
Example output
+------+----------------------------------------------+
|region|result                                        |
+------+----------------------------------------------+
|...   |{null, <raster bytes>, {driver -> GTiff, ...}}|
+------+----------------------------------------------+

rst_dtmfromgeoms_agg

Lightweight | Heavyweight | Grouped-agg UDF
Lightweight tier (pyrx)

Powered by rasterio + SciPy (scipy.spatial.Delaunay). Aggregate — builds one TIN DTM tile per group from the group's Z-valued points via barycentric interpolation over an unconstrained Delaunay triangulation; breaklines, merge_tolerance, and snap_tolerance are accepted but not enforced (the heavyweight tier builds a constrained TIN).

Lightweight SQL returns BINARY (not the tile struct)

Heavyweight gbx_rst_dtmfromgeoms_agg returns a tile STRUCT<cellid, raster, metadata>; the lightweight SQL function returns BINARY (the raster bytes). A PySpark grouped-aggregate pandas_udf cannot return a StructType, so the lightweight SQL aggregate returns the raster payload as BINARY. The lightweight Python wrapper rx.rst_dtmfromgeoms_agg(...) returns the full tile struct (it composes the aggregate with a tile-wrapping step), so only raw SQL differs. To rebuild the tile-struct equivalent in SQL, select the group key as cellid and wrap the BINARY with gbx_rst_fromcontent:

-- Lightweight SQL: rebuild the (cellid, raster) the heavyweight struct would carry
SELECT
group_key AS cellid,
gbx_rst_fromcontent(
gbx_rst_dtmfromgeoms_agg(point, null, 0.0, 0.0, 0,0,10,10, 8,8, 32633),
'GTiff'
) AS tile
FROM observations
GROUP BY group_key

Streaming aggregator that accepts one Z-valued point WKB per row and produces a TIN/Delaunay DTM raster tile per group; breaklines are supplied as a per-group constant array to enforce hard terrain edges.

Signature: rst_dtmfromgeoms_agg(point: Column, breaklines: Column, mergeTolerance: Column, snapTolerance: Column, xmin: Column, ymin: Column, xmax: Column, ymax: Column, width: Column, height: Column, srid: Column): Column

Parameters: point — WKB point geometry with Z coordinate (one per row); breaklines — constant WKB array of breakline geometries per group (pass null or empty array if unused); remaining parameters match rst_dtmfromgeoms

SQL:

-- Stream survey points per region into one TIN DTM tile. Breaklines are a
-- per-group constant array; for 10 m cells over a 1000 m extent use 100 px.
SELECT region_id,
gbx_rst_dtmfromgeoms_agg(
point_wkb, breaklines_wkb_array,
0.0, 0.01,
bbox_xmin, bbox_ymin, bbox_xmax, bbox_ymax,
100, 100, 32633
) AS dtm
FROM survey_points
GROUP BY region_id;
Example output
+---------+----------------------------------------------+
|region_id|dtm                                           |
+---------+----------------------------------------------+
|R-01     |{null, <raster bytes>, {driver -> GTiff, ...}}|
+---------+----------------------------------------------+
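
The lightweight note mentions barycentric interpolation over a Delaunay triangulation. The sketch below shows only the per-triangle step, in pure Python: given one triangle with Z values at its vertices, it estimates Z at a point inside it. The barycentric_z helper and the sample triangle are hypothetical; building the triangulation itself (scipy.spatial.Delaunay in the lightweight tier) is out of scope here.

```python
def barycentric_z(p, a, b, c):
    """Interpolate Z at 2D point p inside triangle (a, b, c).

    p is (x, y); a, b, c are (x, y, z) vertices. Returns None when p lies
    outside the triangle.
    """
    x, y = p
    x1, y1, z1 = a
    x2, y2, z2 = b
    x3, y3, z3 = c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None  # outside the triangle
    return w1 * z1 + w2 * z2 + w3 * z3


# Hypothetical triangle lying on the plane z = 10 + x + 2y.
a, b, c = (0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)
z_inside = barycentric_z((2.0, 2.0), a, b, c)
z_vertex = barycentric_z((0.0, 0.0), a, b, c)
z_outside = barycentric_z((10.0, 10.0), a, b, c)
```

Inside a triangle the result is linear in x and y, so the interpolated surface is continuous across triangle edges but has creases along them; that is why enforced breaklines matter when they are available.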

rst_frombands_agg

Lightweight | Heavyweight | Grouped-agg UDF
Lightweight tier (pyrx)

Powered by rasterio. Aggregate — groupBy(...).agg(rx.rst_frombands_agg("tile", "band_index")) stacks the group's tiles into one multi-band tile ordered by ascending band_index.

Lightweight SQL returns BINARY (not the tile struct)

Heavyweight gbx_rst_frombands_agg returns a tile STRUCT<cellid, raster, metadata>; the lightweight SQL function returns BINARY (the raster bytes). A PySpark grouped-aggregate pandas_udf cannot return a StructType, so the lightweight SQL aggregate returns the raster payload as BINARY. The lightweight Python wrapper rx.rst_frombands_agg(...) returns the full tile struct (it composes the aggregate with a tile-wrapping step), so only raw SQL differs. To rebuild the tile-struct equivalent in SQL, select the group key as cellid and wrap the BINARY with gbx_rst_fromcontent:

-- Lightweight SQL: rebuild the (cellid, raster) the heavyweight struct would carry
SELECT
group_key AS cellid,
gbx_rst_fromcontent(gbx_rst_frombands_agg(tile, band_index), 'GTiff') AS tile
FROM bands
GROUP BY group_key

Streaming aggregator that collects ordered per-band tiles (one row per band) into a single multi-band raster tile per group; use when bands arrive as separate rows rather than a pre-built array.

Signature: rst_frombands_agg(tile: Column, bandIndex: Column): Column

Parameters: tile — Single-band raster tile; bandIndex — 1-based band position within the output raster

SQL:

-- Collect per-band tiles in acquisition order into one multi-band raster per scene.
SELECT scene_id,
gbx_rst_frombands_agg(tile, band_index) AS multi_band
FROM band_tiles
GROUP BY scene_id;
Example output
+--------+----------------------------------------------+
|scene_id|multi_band                                    |
+--------+----------------------------------------------+
|S2A_001 |{null, <raster bytes>, {driver -> GTiff, ...}}|
+--------+----------------------------------------------+

rst_merge_agg

Lightweight | Heavyweight | Grouped-agg UDF
Lightweight tier (pyrx)

Powered by rasterio (rasterio.merge). Aggregate — groupBy(...).agg(rx.rst_merge_agg("tile")) merges the group's tiles into one mosaic tile (output spans the union extent).

Lightweight SQL returns BINARY (not the tile struct)

Heavyweight gbx_rst_merge_agg returns a tile STRUCT<cellid, raster, metadata>; the lightweight SQL function returns BINARY (the raster bytes). A PySpark grouped-aggregate pandas_udf cannot return a StructType, so the lightweight SQL aggregate returns the raster payload as BINARY. The lightweight Python wrapper rx.rst_merge_agg(...) returns the full tile struct (it composes the aggregate with a tile-wrapping step), so only raw SQL differs. To rebuild the tile-struct equivalent in SQL, select the group key as cellid and wrap the BINARY with gbx_rst_fromcontent:

-- Lightweight SQL: rebuild the (cellid, raster) the heavyweight struct would carry
SELECT
group_key AS cellid,
gbx_rst_fromcontent(gbx_rst_merge_agg(tile), 'GTiff') AS tile
FROM tiles
GROUP BY group_key

Signature: rst_merge_agg(tile: Column): Column — Merge tiles per group.

SQL:

SELECT
scene_id,
gbx_rst_merge_agg(tile) as merged_scene
FROM satellite_tiles
GROUP BY scene_id;
Example output
+--------+----------------------------------------------+
|scene_id|merged_scene                                  |
+--------+----------------------------------------------+
|S2A_001 |{null, <raster bytes>, {driver -> GTiff, ...}}|
+--------+----------------------------------------------+

rst_rasterize_agg

Lightweight | Heavyweight | Grouped-agg UDF
Lightweight tier (pyrx)

Powered by rasterio (rasterio.features). Aggregate — burns the group's (geom, value) rows into one tile over the given extent/size/SRID (last-wins on overlap).

Lightweight SQL returns BINARY (not the tile struct)

Heavyweight gbx_rst_rasterize_agg returns a tile STRUCT<cellid, raster, metadata>; the lightweight SQL function returns BINARY (the raster bytes). A PySpark grouped-aggregate pandas_udf cannot return a StructType, so the lightweight SQL aggregate returns the raster payload as BINARY. The lightweight Python wrapper rx.rst_rasterize_agg(...) returns the full tile struct (it composes the aggregate with a tile-wrapping step), so only raw SQL differs. To rebuild the tile-struct equivalent in SQL, select the group key as cellid and wrap the BINARY with gbx_rst_fromcontent:

-- Lightweight SQL: rebuild the (cellid, raster) the heavyweight struct would carry
SELECT
group_key AS cellid,
gbx_rst_fromcontent(
gbx_rst_rasterize_agg(geom, value, 0,0,10,10, 8,8, 32633),
'GTiff'
) AS tile
FROM features
GROUP BY group_key

Streaming aggregator that burns geometry/value pairs (one row per feature) into a single rasterized tile per group; use when features arrive as individual rows rather than as a pre-built collection.

Signature: rst_rasterize_agg(geom: Column, value: Column, xmin: Column, ymin: Column, xmax: Column, ymax: Column, width: Column, height: Column, srid: Column): Column

Parameters: geom — WKB geometry to burn; value — numeric burn value; xmin/ymin/xmax/ymax — output extent (in the target CRS); width/height — output raster dimensions in pixels; srid — EPSG code for the output CRS

SQL:

-- Aggregate per-feature burn values into one rasterized tile per region.
SELECT region_id,
gbx_rst_rasterize_agg(
geom_wkb, burn_value,
bbox_xmin, bbox_ymin, bbox_xmax, bbox_ymax,
256, 256, 4326
) AS tile
FROM features
GROUP BY region_id;
Example output
+---------+----------------------------------------------+
|region_id|tile                                          |
+---------+----------------------------------------------+
|R-01     |{null, <raster bytes>, {driver -> GTiff, ...}}|
+---------+----------------------------------------------+
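
To illustrate how the extent, the pixel dimensions, and the "last-wins on overlap" rule interact, here is a simplified pure-Python sketch. It burns axis-aligned boxes rather than arbitrary geometries, and it treats a pixel as covered when its center falls inside the box; both are simplifying assumptions of the sketch, and burn_boxes is a hypothetical helper, not a RasterX function.

```python
def burn_boxes(boxes, xmin, ymin, xmax, ymax, width, height, fill=0.0):
    """Burn (bxmin, bymin, bxmax, bymax, value) boxes into a height x width grid.

    Boxes are burned in input order, so the last one wins where they overlap.
    """
    px_w = (xmax - xmin) / width
    px_h = (ymax - ymin) / height
    grid = [[fill] * width for _ in range(height)]
    for bxmin, bymin, bxmax, bymax, value in boxes:
        for row in range(height):
            cy = ymax - (row + 0.5) * px_h  # row 0 is the top (north) edge
            for col in range(width):
                cx = xmin + (col + 0.5) * px_w
                if bxmin <= cx <= bxmax and bymin <= cy <= bymax:
                    grid[row][col] = value
    return grid


# Two overlapping boxes on a 4 x 4 grid covering the extent (0, 0) to (4, 4).
grid = burn_boxes(
    [(0, 0, 2, 2, 1.0), (1, 1, 3, 3, 2.0)],
    xmin=0, ymin=0, xmax=4, ymax=4, width=4, height=4,
)
```

The pixel shared by both boxes ends up with 2.0 because the second box is burned later; reversing the input order would leave 1.0 there instead.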

rst_gridfrompoints_agg

Lightweight | Heavyweight | Grouped-agg UDF
Lightweight tier (pyrx)

Powered by rasterio + SciPy (cKDTree IDW). Aggregate — groupBy(...).agg(rx.rst_gridfrompoints_agg(...)) inverse-distance-interpolates the group's points into one Float64 grid tile (NoData −9999).

Lightweight SQL returns BINARY (not the tile struct)

Heavyweight gbx_rst_gridfrompoints_agg returns a tile STRUCT<cellid, raster, metadata>; the lightweight SQL function returns only the raster bytes as BINARY, because a PySpark grouped-aggregate pandas_udf cannot return a StructType. The lightweight Python wrapper rx.rst_gridfrompoints_agg(...) composes the aggregate with a tile-wrapping step and returns the full tile struct, so only raw SQL differs. To rebuild the tile-struct equivalent in SQL, select the group key as cellid and wrap the BINARY with gbx_rst_fromcontent:

-- Lightweight SQL: rebuild the (cellid, raster) the heavyweight struct would carry
SELECT
group_key AS cellid,
gbx_rst_fromcontent(
gbx_rst_gridfrompoints_agg(point, value, 0,0,10,10, 8,8, 32633, 2.0, 12),
'GTiff'
) AS tile
FROM observations
GROUP BY group_key

Streaming IDW-interpolation aggregator that accepts one point geometry and one scalar value per row and produces a Float64 GeoTIFF tile per group; use when observations arrive one per row rather than as pre-built arrays.

Signature: rst_gridfrompoints_agg(point: Column, value: Column, xmin: Column, ymin: Column, xmax: Column, ymax: Column, widthPx: Column, heightPx: Column, srid: Column, power: Column, maxPts: Column): Column

Parameters: point — WKB point geometry (one per row); value — scalar observation for the point; xmin/ymin/xmax/ymax — output extent in CRS units (constant per group); widthPx/heightPx — output dimensions in pixels; srid — EPSG code; power — IDW distance-decay exponent (2.0 is standard); maxPts — maximum nearest neighbours considered per output pixel

SQL:

-- Aggregate per-station observations into one IDW tile per region.
SELECT region_id,
gbx_rst_gridfrompoints_agg(
station_wkb, observation,
bbox_xmin, bbox_ymin, bbox_xmax, bbox_ymax,
256, 256, 32633,
2.0, 12
) AS idw
FROM observations
GROUP BY region_id;
Example output
+---------+----------------------------------------------+
|region_id|idw |
+---------+----------------------------------------------+
|R-01 |{null, <raster bytes>, {driver -> GTiff, ...}}|
+---------+----------------------------------------------+

Constructor Functions

Create or load rasters from a path, binary content, point or geometry arrays, or bands (5 total).

rst_fromfile

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio. Opens the raster at path and re-encodes it as a GeoTIFF tile; the driver arg is a format hint (rasterio auto-detects on open). A missing/unreadable path returns null.

Load a raster from a file path.

Signature: rst_fromfile(path: Column, driver: Column): Column

Parameters: path — File path; driver — GDAL driver name (e.g. GTiff)

Returns: Raster tile struct (cellid, raster bytes, metadata)

SQL:

-- Load from path
SELECT
gbx_rst_fromfile('/data/raster.tif', 'GTiff') as tile;

-- Load multiple and get properties
SELECT
path,
gbx_rst_width(gbx_rst_fromfile(path, 'GTiff')) as width,
gbx_rst_height(gbx_rst_fromfile(path, 'GTiff')) as height
FROM raster_paths;
Example output
+----------------------------------------------+
|tile |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

+----+-----+------+
|path|width|height|
+----+-----+------+
|... |10980|10980 |
+----+-----+------+

rst_fromcontent

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio. The driver defaults to GTiff when unspecified and outputs are re-encoded as GeoTIFF; readable input formats are limited to the GDAL build bundled with rasterio.

Create a raster from binary content.

Signature: rst_fromcontent(content: Column, driver: Column): Column

Parameters: content — Binary column; driver — GDAL driver name

Returns: Raster tile struct (cellid, raster bytes, metadata)

SQL:

-- Load from binary table
SELECT
path,
gbx_rst_fromcontent(content, 'GTiff') as tile
FROM binary_raster_table;
Example output
+----+----------------------------------------------+
|path|tile |
+----+----------------------------------------------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|
+----+----------------------------------------------+

rst_dtmfromgeoms

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio + SciPy (scipy.spatial.Delaunay). Barycentric interpolation over an unconstrained Delaunay TIN; cells outside the convex hull are NoData. breaklines, merge_tolerance, and snap_tolerance are accepted for signature parity but not enforced — the heavyweight tier builds a constrained TIN that honors them.

Create a DTM raster tile via TIN/Delaunay interpolation from an array of Z-valued point WKB geometries, with an optional array of breakline WKB geometries to preserve sharp terrain transitions.

Signature: rst_dtmfromgeoms(points: Column, breaklines: Column, mergeTolerance: Column, snapTolerance: Column, xmin: Column, ymin: Column, xmax: Column, ymax: Column, width: Column, height: Column, srid: Column): Column

Parameters: points — Array of WKB point geometries with Z coordinates; breaklines — Array of WKB line/polygon geometries enforcing hard edges (pass null or empty array if unused); mergeTolerance/snapTolerance — Delaunay triangulation tolerances (vertex-merge distance and snapping distance; small values such as 0.0 and 0.01 are typical); xmin/ymin/xmax/ymax — output extent in CRS units; width/height — output raster dimensions in pixels (for N-metre cells set width = round((xmax-xmin)/N)); srid — EPSG code for the output CRS. An optional trailing noData argument overrides the default fill for cells outside the triangulated hull.

SQL:

-- TIN interpolation from arrays of Z-valued point WKB and breakline WKB.
-- Output is a 100 x 100 Float64 GTiff over the extent. For N-metre cells set
-- width_px = round((xmax-xmin)/N): here a 1000 m extent at 10 m cells -> 100 px.
SELECT gbx_rst_dtmfromgeoms(
points_wkb_array, breaklines_wkb_array,
0.0, 0.01,
0.0, 0.0, 1000.0, 1000.0,
100, 100, 32633
) AS dtm
FROM survey_points;
Example output
+----------------------------------------------+
|dtm |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+
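The width/height guidance above is plain arithmetic: divide each side of the extent by the desired cell size and round. A minimal sketch in Python follows; grid_dims is a hypothetical helper for illustration and is not part of RasterX.

```python
def grid_dims(xmin, ymin, xmax, ymax, cell_size):
    """Return (width_px, height_px) for square cells of cell_size CRS units.

    Hypothetical helper: it only mirrors width = round((xmax - xmin) / N).
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    width_px = max(1, round((xmax - xmin) / cell_size))
    height_px = max(1, round((ymax - ymin) / cell_size))
    return width_px, height_px


# The 1000 m x 1000 m extent from the SQL example at 10 m cells.
dims = grid_dims(0.0, 0.0, 1000.0, 1000.0, 10.0)
print(dims)
```

The result can be passed as the width and height arguments of gbx_rst_dtmfromgeoms.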

rst_frombands

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio. Stacks an ARRAY of single-band tiles into one multi-band tile in array order (element 0 → band 1), preserving georeference/CRS/dtype/NoData from the first.

Create a raster from an array of band tiles.

Signature: rst_frombands(bands: Column): Column

SQL:

SELECT
gbx_rst_frombands(array(band1, band2, band3)) as multi_band
FROM separated_bands;
Example output
+----------------------------------------------+
|multi_band |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_gridfrompoints

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio + SciPy (cKDTree IDW). Inverse-distance interpolation (power, max_pts) to a single-band Float64 grid; NoData −9999. Matches the heavyweight invdist defaults (power=2.0, max_pts=12).

IDW-interpolate an array of Z-valued point geometries to a Float64 GeoTIFF tile covering an explicit bounding box and pixel grid. Supply the points and their scalar values as arrays in a single row; use rst_gridfrompoints_agg when points arrive one per row.

Signature: rst_gridfrompoints(points: Column, values: Column, xmin: Column, ymin: Column, xmax: Column, ymax: Column, widthPx: Column, heightPx: Column, srid: Column, power: Column, maxPts: Column): Column

Parameters: points — ARRAY<BINARY> of WKB point geometries; values — ARRAY<DOUBLE> of scalar observations, one per point; xmin/ymin/xmax/ymax — output extent in CRS units; widthPx/heightPx — output dimensions in pixels; srid — EPSG code; power — IDW distance-decay exponent (2.0 is the standard); maxPts — maximum nearest neighbours considered per output pixel

SQL:

-- IDW (power=2, max_points=12) from arrays of point WKB and values.
-- Output is a 256 x 256 Float64 GTiff covering the requested extent.
SELECT gbx_rst_gridfrompoints(
points_wkb_array, values_array,
0.0, 0.0, 1000.0, 1000.0,
256, 256, 32633,
2.0, 12
) AS idw
FROM point_clouds;
Example output
+----------------------------------------------+
|idw |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+
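To make the roles of power and maxPts concrete, here is a minimal sketch of inverse-distance weighting for a single output pixel. It assumes a brute-force nearest-neighbour search in place of the cKDTree the lightweight tier uses, and idw_pixel is a hypothetical name, not a RasterX function.

```python
import math


def idw_pixel(px, py, points, values, power=2.0, max_pts=12, nodata=-9999.0):
    """Inverse-distance-weighted estimate at (px, py) from the max_pts nearest points."""
    if not points:
        return nodata
    # Distance from the pixel centre to every observation, nearest first.
    nearest = sorted(
        (math.hypot(x - px, y - py), v) for (x, y), v in zip(points, values)
    )[:max_pts]
    # An observation exactly on the pixel centre wins outright (avoids division by zero).
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / (d ** power) for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


# Two stations equally far from the pixel contribute equally.
estimate = idw_pixel(1.0, 0.0, [(0.0, 0.0), (2.0, 0.0)], [10.0, 20.0])
print(estimate)
```

A larger power makes nearby observations dominate; a smaller max_pts limits each pixel to its closest neighbours.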

Generator Functions

Produce multiple tiles or bands (5 total).

rst_h3_tessellate

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio + h3. Clips the raster to each overlapping H3 cell at resolution, returning one row per cell via streaming UDTF (matches the heavyweight generator behavior).

Signature: rst_h3_tessellate(tile: Column, resolution: Column, mode: Column = "covering"): Column — Tessellate raster to H3 cells. mode is "covering" (default — every overlapping cell, clipped) or "centroid" (pixel-centroid single-assignment partition). See H3 Raster Tessellation for the full mode guide.

SQL:

-- covering mode (default): every overlapping H3 cell, clipped to its hexagon
SELECT t.*
FROM rasters,
LATERAL gbx_rst_h3_tessellate(tile, 7, 'covering') t;

-- centroid mode: pixel-centroid single-assignment partition (no double-count)
SELECT t.*
FROM rasters,
LATERAL gbx_rst_h3_tessellate(tile, 7, 'centroid') t;

-- backward-compatible two-argument form (covering)
SELECT t.*
FROM rasters,
LATERAL gbx_rst_h3_tessellate(tile, 7) t;
Example output
+------+------------------+--------------+
|source|cellid |raster |
+------+------------------+--------------+
|... |599686042433355775|<raster bytes>|
+------+------------------+--------------+

rst_maketiles

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio. Streams one tile row per subdivided region via streaming UDTF. It derives a square tile size from the MB budget and always partitions; it does not honor the heavyweight size_in_mb = -1 (single tile) or 0 (64 MB) sentinels or the power-of-four split, so tile counts and dimensions differ.

Signature: rst_maketiles(tile: Column, tileWidth: Column, tileHeight: Column): Column — Subdivide into smaller tiles.

SQL:

-- Subdivide and explode tiles
SELECT
path,
tile_subtile as tile
FROM rasters
LATERAL VIEW explode(gbx_rst_maketiles(tile, 512, 512)) AS tile_subtile;

-- Count tiles per raster
SELECT
path,
SIZE(gbx_rst_maketiles(tile, 512, 512)) as num_tiles
FROM rasters;
Example output
+----+----------------------------------------------+
|path|tile |
+----+----------------------------------------------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|
+----+----------------------------------------------+

+----+---------+
|path|num_tiles|
+----+---------+
|... |42 |
+----+---------+

rst_retile

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio. Streams one tile row per retiled region via streaming UDTF (matches the heavyweight generator behavior).

Signature: rst_retile(tile: Column, tileWidth: Column, tileHeight: Column): Column — Retile to uniform dimensions.

SQL:

SELECT
path,
tile
FROM rasters
LATERAL VIEW explode(gbx_rst_retile(tile, 256, 256)) AS tile;
Example output
+----+----------------------------------------------+
|path|tile |
+----+----------------------------------------------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|
+----+----------------------------------------------+
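Retiling to uniform dimensions can be pictured as laying a grid of pixel windows over the raster. The sketch below is only an illustration: it assumes edge tiles are truncated rather than padded (the reference does not say which RasterX does), and tile_windows is a hypothetical helper.

```python
def tile_windows(raster_width, raster_height, tile_width, tile_height):
    """Yield (col_off, row_off, width, height) windows in row-major order."""
    for row_off in range(0, raster_height, tile_height):
        for col_off in range(0, raster_width, tile_width):
            yield (
                col_off,
                row_off,
                min(tile_width, raster_width - col_off),    # truncate at the right edge
                min(tile_height, raster_height - row_off),  # truncate at the bottom edge
            )


# A 600 x 300 raster cut into 256 x 256 windows gives a 3 x 2 grid.
windows = list(tile_windows(600, 300, 256, 256))
print(len(windows), windows[-1])
```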

rst_separatebands

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio. Streams one band-tile row per band via streaming UDTF — O(1) worker memory regardless of band count (matches the heavyweight generator behavior).

Signature: rst_separatebands(tile: Column): Column — Split multi-band into array of bands.

SQL:

SELECT
path,
bands[0] as red_band,
bands[1] as green_band,
bands[2] as blue_band
FROM (
SELECT path, gbx_rst_separatebands(tile) as bands
FROM rgb_rasters
);
Example output
+----+----------------------------------------------+----------------------------------------------+----------------------------------------------+
|path|red_band |green_band |blue_band |
+----+----------------------------------------------+----------------------------------------------+----------------------------------------------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|{null, <raster bytes>, {driver -> GTiff, ...}}|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----+----------------------------------------------+----------------------------------------------+----------------------------------------------+

rst_tooverlappingtiles

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio. Streams one tile row per overlapping region via streaming UDTF (matches the heavyweight generator behavior).

Signature: rst_tooverlappingtiles(tile: Column, tileWidth: Column, tileHeight: Column, overlap: Column): Column — Create overlapping tiles.

SQL:

SELECT
path,
tile
FROM rasters
LATERAL VIEW explode(gbx_rst_tooverlappingtiles(tile, 256, 256, 10)) AS tile;
Example output
+----+----------------------------------------------+
|path|tile |
+----+----------------------------------------------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|
+----+----------------------------------------------+

Grid Functions (H3)

Aggregate raster values to H3 grid cells (5 total).

rst_h3_rastertogridavg

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio + h3. Returns an ARRAY (one element per band) of ARRAY<struct(cellID, measure)>. The raster is interpreted as EPSG:4326 lon/lat — reproject upstream with rst_transform if your source CRS differs.

Signature: rst_h3_rastertogridavg(tile: Column, resolution: Column): Column

SQL:

-- Aggregate raster to H3 grid
SELECT
path,
gbx_rst_h3_rastertogridavg(tile, 6) as h3_grid
FROM rasters;

-- Get cells from first band
SELECT
path,
cell.cellID as h3_cell,
cell.measure as avg_value
FROM rasters
LATERAL VIEW explode(gbx_rst_h3_rastertogridavg(tile, 6)[0]) AS cell;
Example output
+----+------------------------------+
|path|h3_grid |
+----+------------------------------+
|... |[[{599686042433355775, 0.42}]]|
+----+------------------------------+

+----+------------------+---------+
|path|h3_cell |avg_value|
+----+------------------+---------+
|... |599686042433355775|0.42 |
+----+------------------+---------+

rst_h3_rastertogridcount

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio + h3. Returns an ARRAY (one element per band) of ARRAY<struct(cellID, measure)>, where measure is the per-cell pixel count (integer). The raster is interpreted as EPSG:4326 lon/lat — reproject upstream with rst_transform if your source CRS differs.

Signature: rst_h3_rastertogridcount(tile: Column, resolution: Column): Column — Pixel count per H3 cell.

SQL:

SELECT
gbx_rst_h3_rastertogridcount(tile, 5) as pixel_counts
FROM rasters;
Example output
+------------------------------+
|pixel_counts |
+------------------------------+
|[[{599686042433355775, 1024}]]|
+------------------------------+

rst_h3_rastertogridmax

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio + h3. Returns an ARRAY (one element per band) of ARRAY<struct(cellID, measure)>. The raster is interpreted as EPSG:4326 lon/lat — reproject upstream with rst_transform if your source CRS differs.

Signature: rst_h3_rastertogridmax(tile: Column, resolution: Column): Column — Max value per H3 cell.

SQL:

SELECT
cell.cellID as h3_cell,
cell.measure as max_value
FROM rasters
LATERAL VIEW explode(gbx_rst_h3_rastertogridmax(tile, 7)[0]) AS cell;
Example output
+------------------+---------+
|h3_cell |max_value|
+------------------+---------+
|599686042433355775|255.0 |
+------------------+---------+

rst_h3_rastertogridmin

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio + h3. Returns an ARRAY (one element per band) of ARRAY<struct(cellID, measure)>. The raster is interpreted as EPSG:4326 lon/lat — reproject upstream with rst_transform if your source CRS differs.

Signature: rst_h3_rastertogridmin(tile: Column, resolution: Column): Column — Min value per H3 cell.

SQL:

SELECT
cell.cellID as h3_cell,
cell.measure as min_value
FROM rasters
LATERAL VIEW explode(gbx_rst_h3_rastertogridmin(tile, 7)[0]) AS cell;
Example output
+------------------+---------+
|h3_cell |min_value|
+------------------+---------+
|599686042433355775|0.0 |
+------------------+---------+

rst_h3_rastertogridmedian

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio + h3. Returns an ARRAY (one element per band) of ARRAY<struct(cellID, measure)>. The raster is interpreted as EPSG:4326 lon/lat — reproject upstream with rst_transform if your source CRS differs.

Signature: rst_h3_rastertogridmedian(tile: Column, resolution: Column): Column — Median value per H3 cell.

SQL:

SELECT
cell.cellID as h3_cell,
cell.measure as median_value
FROM rasters
LATERAL VIEW explode(gbx_rst_h3_rastertogridmedian(tile, 7)[0]) AS cell;
Example output
+------------------+------------+
|h3_cell |median_value|
+------------------+------------+
|599686042433355775|128.0 |
+------------------+------------+
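The five H3 grid functions differ only in the reducer applied to the pixels that fall in each cell. A minimal sketch of that grouping step follows, assuming pixels have already been assigned to cells and NoData pixels have already been dropped. The cell IDs are placeholder integers and aggregate_by_cell is a hypothetical helper, not a RasterX function.

```python
from collections import defaultdict
from statistics import mean, median


def aggregate_by_cell(samples):
    """Group (cell_id, value) samples and compute the five per-cell measures."""
    by_cell = defaultdict(list)
    for cell_id, value in samples:
        by_cell[cell_id].append(value)
    return {
        cell_id: {
            "avg": mean(vals),
            "count": len(vals),
            "max": max(vals),
            "min": min(vals),
            "median": median(vals),
        }
        for cell_id, vals in by_cell.items()
    }


# Three pixels land in placeholder cell 1, one pixel in placeholder cell 2.
stats = aggregate_by_cell([(1, 0.2), (1, 0.6), (2, 5.0), (1, 0.4)])
print(stats[1]["count"], stats[1]["median"])
```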

Grid Functions (quadbin)

Aggregate raster values to CARTO quadbin v0 grid cells. Each function returns an array (one entry per band) of struct<cellID: BIGINT, measure: DOUBLE> rows; explode the array element for the band you want to get one row per cell. Resolution is the quadbin zoom (0..26).
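For intuition about how zoom relates to cell size, the sketch below assumes each quadbin cell at zoom z corresponds to one web-mercator XYZ tile, as in CARTO's quadbin scheme. It computes only the tile (x, y) that contains a lon/lat position; it does not produce the 64-bit quadbin ID, and lonlat_to_tile is a hypothetical helper.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the web-mercator (x, y) tile containing lon/lat at the given zoom."""
    n = 2 ** zoom
    lat = max(min(lat, 85.05112878), -85.05112878)  # clamp to the web-mercator limit
    lat_rad = math.radians(lat)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Keep indices inside the grid for lon = 180 or lat at the southern limit.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


# Each extra zoom level splits a tile into four, so the tile index doubles per axis.
tile = lonlat_to_tile(-122.4194, 37.7749, 10)
print(tile)
```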

rst_quadbin_rastertogridavg

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio + quadbin. Returns an ARRAY (one element per band) of ARRAY<struct(cellID, measure)>. The raster is interpreted as EPSG:4326 lon/lat — reproject upstream with rst_transform if your source CRS differs.

Signature: rst_quadbin_rastertogridavg(tile: Column, resolution: Column): Column — Mean pixel value per quadbin cell.

SQL:

-- Aggregate raster to quadbin grid
SELECT
path,
gbx_rst_quadbin_rastertogridavg(tile, 6) as quadbin_grid
FROM rasters;

-- Get cells from the first band
SELECT
path,
cell.cellID as quadbin_cell,
cell.measure as avg_value
FROM rasters
LATERAL VIEW explode(gbx_rst_quadbin_rastertogridavg(tile, 6)[0]) AS cell;
Example output
+----+-------------------------------+
|path|quadbin_grid |
+----+-------------------------------+
|... |[[{5188146770730811391, 0.42}]]|
+----+-------------------------------+

+----+------------+---------+
|path|quadbin_cell|avg_value|
+----+------------+---------+
|... |5188146... |0.42 |
+----+------------+---------+

rst_quadbin_rastertogridcount

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio + quadbin. Returns an ARRAY (one element per band) of ARRAY<struct(cellID, measure)>, where measure is the per-cell pixel count (integer). The raster is interpreted as EPSG:4326 lon/lat — reproject upstream with rst_transform if your source CRS differs.

Signature: rst_quadbin_rastertogridcount(tile: Column, resolution: Column): Column — Pixel count per quadbin cell.

SQL:

SELECT
gbx_rst_quadbin_rastertogridcount(tile, 5) as pixel_counts
FROM rasters;
Example output
+-------------------------------+
|pixel_counts |
+-------------------------------+
|[[{5188146770730811391, 1024}]]|
+-------------------------------+

rst_quadbin_rastertogridmax

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio + quadbin. Returns an ARRAY (one element per band) of ARRAY<struct(cellID, measure)>. The raster is interpreted as EPSG:4326 lon/lat — reproject upstream with rst_transform if your source CRS differs.

Signature: rst_quadbin_rastertogridmax(tile: Column, resolution: Column): Column — Max pixel value per quadbin cell.

SQL:

SELECT
cell.cellID as quadbin_cell,
cell.measure as max_value
FROM rasters
LATERAL VIEW explode(gbx_rst_quadbin_rastertogridmax(tile, 7)[0]) AS cell;
Example output
+------------+---------+
|quadbin_cell|max_value|
+------------+---------+
|5188146... |255.0 |
+------------+---------+

rst_quadbin_rastertogridmin

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio + quadbin. Returns an ARRAY (one element per band) of ARRAY<struct(cellID, measure)>. The raster is interpreted as EPSG:4326 lon/lat — reproject upstream with rst_transform if your source CRS differs.

Signature: rst_quadbin_rastertogridmin(tile: Column, resolution: Column): Column — Min pixel value per quadbin cell.

SQL:

SELECT
cell.cellID as quadbin_cell,
cell.measure as min_value
FROM rasters
LATERAL VIEW explode(gbx_rst_quadbin_rastertogridmin(tile, 7)[0]) AS cell;
Example output
+------------+---------+
|quadbin_cell|min_value|
+------------+---------+
|5188146... |0.0 |
+------------+---------+

rst_quadbin_rastertogridmedian

Lightweight | Heavyweight | Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio + quadbin. Returns an ARRAY (one element per band) of ARRAY<struct(cellID, measure)>. The raster is interpreted as EPSG:4326 lon/lat — reproject upstream with rst_transform if your source CRS differs.

Signature: rst_quadbin_rastertogridmedian(tile: Column, resolution: Column): Column — Median pixel value per quadbin cell.

SQL:

SELECT
cell.cellID as quadbin_cell,
cell.measure as median_value
FROM rasters
LATERAL VIEW explode(gbx_rst_quadbin_rastertogridmedian(tile, 7)[0]) AS cell;
Example output
+------------+------------+
|quadbin_cell|median_value|
+------------+------------+
|5188146... |128.0 |
+------------+------------+

Operations

Transform and analyze rasters (20 total).

rst_asformat

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio. Output formats are limited to rasterio's bundled-GDAL writable driver set; the tile is re-encoded in the requested format.

Signature: rst_asformat(tile: Column, newFormat: Column): Column — Convert to another format.

SQL:

-- Convert NetCDF to GeoTIFF
SELECT
path,
gbx_rst_asformat(tile, 'GTiff') as geotiff_tile
FROM netcdf_rasters;

-- Convert to PNG
SELECT
path,
gbx_rst_asformat(tile, 'PNG') as png_tile
FROM visualization_tiles;
Example output
+----+----------------------------------------------+
|path|geotiff_tile |
+----+----------------------------------------------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|
+----+----------------------------------------------+

rst_clip

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio (rasterio.mask). The clip geometry is assumed to be in the raster's CRS; the heavyweight tier has additional SRID-inheritance fallbacks.

Signature: rst_clip(tile: Column, clip: Column, cutlineAllTouched: Column): Column — Clip by geometry. The clip argument must be WKT (string), EWKT (SRID-prefixed string), WKB (binary), or EWKB (SRID-embedded binary); do not use st_geomfromtext() or other DBR native geometry.

CRS handling:

  • EWKT (SRID=4326;POLYGON(...)) or EWKB (SRID encoded in the byte header) — the SRID is read, and if it differs from the raster's CRS the cutline is reprojected before clipping. Use this form whenever the geometry and raster may be in different CRSs.
  • Plain WKT / WKB (no SRID) — the geometry is assumed to already be in the raster's CRS. If that assumption is wrong (for example, lon/lat polygons against a UTM raster), the cutline will land outside the raster and you'll get an empty or blank output. Either switch to EWKT/EWKB, or reproject the geometry to the raster's CRS first.
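The CRS rules above reduce to one check: does the cutline carry an SRID, and does it differ from the raster's? A minimal sketch for the string (WKT/EWKT) case follows. split_ewkt and needs_reprojection are hypothetical helpers that illustrate the decision only, not RasterX's own parsing.

```python
import re

# EWKT prefixes the WKT body with "SRID=<code>;".
_EWKT = re.compile(r"^\s*SRID=(\d+)\s*;\s*(.+)$", re.IGNORECASE | re.DOTALL)


def split_ewkt(text):
    """Return (srid or None, wkt) for an EWKT or plain WKT string."""
    match = _EWKT.match(text)
    if match:
        return int(match.group(1)), match.group(2).strip()
    return None, text.strip()


def needs_reprojection(clip_text, raster_srid):
    """True only when the cutline carries an SRID that differs from the raster's."""
    srid, _ = split_ewkt(clip_text)
    return srid is not None and srid != raster_srid


# A lon/lat EWKT polygon against a UTM 33N raster must be reprojected first.
print(needs_reprojection("SRID=4326;POLYGON((0 0, 1 0, 1 1, 0 0))", 32633))
```

Plain WKT returns False here because, without an SRID, the geometry is taken to be in the raster's CRS already.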

SQL:

-- Clip with WKT geometry
SELECT
path,
gbx_rst_clip(
tile,
'POLYGON((-122 37, -122 38, -121 38, -121 37, -122 37))',
true
) as clipped
FROM rasters;
Example output
+----+----------------------------------------------+
|path|clipped |
+----+----------------------------------------------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|
+----+----------------------------------------------+

rst_combineavg

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Takes an ARRAY<tile> and returns the NoData-aware per-pixel mean; input tiles must share the same grid (shape/extent/CRS). cellid is preserved when all inputs share one, else −1.

Signature: rst_combineavg(tiles: Column): Column — Average multiple tiles (e.g. temporal composite).

SQL:

-- Average rasters for temporal composite
WITH loaded_tiles AS (
SELECT
date_trunc('week', date) as week,
gbx_rst_fromfile(path, 'GTiff') as tile
FROM daily_rasters
WHERE date >= '2024-01-01'
)
SELECT
week,
gbx_rst_combineavg(collect_list(tile)) as weekly_composite
FROM loaded_tiles
GROUP BY week;
Example output
+-------------------+----------------------------------------------+
|week |weekly_composite |
+-------------------+----------------------------------------------+
|2024-01-01 00:00:00|{null, <raster bytes>, {driver -> GTiff, ...}}|
+-------------------+----------------------------------------------+
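The NoData-aware mean can be sketched in plain Python over nested lists. This is an illustration only: it assumes every tile has the same shape and uses -9999.0 as the NoData marker, and combine_avg is a hypothetical stand-in for the NumPy-based implementation.

```python
def combine_avg(tiles, nodata=-9999.0):
    """Per-pixel mean of same-shaped 2-D grids, skipping NoData samples."""
    rows, cols = len(tiles[0]), len(tiles[0][0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            valid = [t[r][c] for t in tiles if t[r][c] != nodata]
            # A pixel with no valid sample in any tile stays NoData.
            out_row.append(sum(valid) / len(valid) if valid else nodata)
        out.append(out_row)
    return out


day1 = [[1.0, -9999.0], [3.0, -9999.0]]
day2 = [[3.0, 4.0], [5.0, -9999.0]]
composite = combine_avg([day1, day2])
print(composite)
```

Skipping NoData samples means a cloud-masked pixel on one day does not drag the composite toward the sentinel value.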

rst_convolve

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by SciPy (scipy.ndimage). Output is Float64 and border pixels are filled by edge-replication; the heavyweight tier preserves the input dtype and leaves border pixels unchanged.

Signature: rst_convolve(tile: Column, kernel: Column): Column — Apply convolution kernel.

SQL:

-- Apply 3x3 kernel (e.g. blur); kernel format is driver-specific
SELECT path, gbx_rst_convolve(tile, kernel) as filtered FROM rasters_with_kernels;
Example output
+----+----------------------------------------------+
|path|filtered |
+----+----------------------------------------------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|
+----+----------------------------------------------+
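The border behaviour described for the lightweight tier (edge-replication) can be illustrated with a small pure-Python sketch. It applies the kernel without flipping it, which is identical to convolution for symmetric kernels such as a box blur. convolve_edge is a hypothetical helper and not the scipy.ndimage call the tier uses.

```python
def convolve_edge(grid, kernel):
    """Apply an odd-sized kernel to a 2-D grid, replicating edge pixels."""
    rows, cols = len(grid), len(grid[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_off, c_off = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    # Clamp neighbour indices so out-of-bounds reads reuse the edge pixel.
                    rr = min(max(r + i - r_off, 0), rows - 1)
                    cc = min(max(c + j - c_off, 0), cols - 1)
                    acc += grid[rr][cc] * kernel[i][j]
            out[r][c] = acc
    return out


# 3x3 box blur: border pixels are averaged against copies of themselves.
box = [[1.0 / 9.0] * 3 for _ in range(3)]
blurred = convolve_edge([[5.0, 5.0], [5.0, 5.0]], box)
print(blurred)
```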

rst_derivedband

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio with GDAL VRT Python pixel functions. A pixel function authored for one tier runs unchanged in the other.

Signature: rst_derivedband(tile: Column, pyfunc: String, funcName: String): Column — Apply Python UDF to derive band.

SQL:

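-- Apply a GDAL VRT Python pixel function (source passed in pyfunc, entry point named in funcName)
SELECT path, gbx_rst_derivedband(tile,
'def my_func(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize, raster_ysize, buf_radius, gt, **kwargs):
    out_ar[:] = in_ar[0] * 2', 'my_func') as derived FROM rasters;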
Example output
+----+----------------------------------------------+
|path|derived |
+----+----------------------------------------------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|
+----+----------------------------------------------+

rst_filter

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by SciPy (scipy.ndimage). The averaging filter is named 'mean' (not 'avg') and 'mode' is unavailable; the averaging output is Float32 and near-edge values may differ slightly from the heavyweight tier.

Signature: rst_filter(tile: Column, kernelSize: Column, operation: Column): Column — Spatial filter (e.g. median, avg).

SQL:

-- Median filter (3x3 window)
SELECT
path,
gbx_rst_filter(tile, 3, 'median') as denoised
FROM noisy_rasters;

-- Average smoothing (5x5 window); the lightweight tier names this operation 'mean', not 'avg'
SELECT
path,
gbx_rst_filter(tile, 5, 'avg') as smoothed
FROM rasters;
Example output
+----+----------------------------------------------+
|path|denoised |
+----+----------------------------------------------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|
+----+----------------------------------------------+

rst_initnodata

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio. When no NoData is set it assigns -9999.0; the heavyweight tier assigns a data-type-appropriate sentinel per band, so the NoData value can differ for integer or byte rasters.

Signature: rst_initnodata(tile: Column): Column — Initialize NoData values.

SQL:

SELECT gbx_rst_initnodata(tile) as tile FROM rasters;
Example output
+----------------------------------------------+
|tile |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_isempty

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_isempty(tile: Column): Column — Check if raster is empty.

SQL:

-- Filter out empty rasters
SELECT * FROM rasters
WHERE NOT gbx_rst_isempty(tile);

-- Count empty vs valid
SELECT
COUNT(*) as total,
SUM(CASE WHEN gbx_rst_isempty(tile) THEN 1 ELSE 0 END) as empty_count,
SUM(CASE WHEN NOT gbx_rst_isempty(tile) THEN 1 ELSE 0 END) as valid_count
FROM rasters;
Example output
+-----+-----------+-----------+
|total|empty_count|valid_count|
+-----+-----------+-----------+
|100 |0 |100 |
+-----+-----------+-----------+

rst_mapalgebra

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by NumExpr. Bands map to A, B, C, … and the expression is evaluated with NumExpr (no gdal_calc NumPy builtins); single-band Float32 output.

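Signature: rst_mapalgebra(tiles: Column, expression: Column): Column — Map algebra expression, passed as a JSON spec whose calc field holds the formula (e.g. A-B) and whose A_index, B_index, … fields map each letter to an index in tiles.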

SQL:

-- Calculate difference between two rasters
SELECT
gbx_rst_mapalgebra(
tiles,
'{"calc": "A-B", "A_index": 0, "B_index": 1}'
) as difference
FROM raster_arrays;
Example output
+----------------------------------------------+
|difference |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_merge

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio (rasterio.merge). Takes an ARRAY<tile> (in one row) and mosaics them into a single tile spanning the union extent (first-tile-wins on overlap, in array order).

Signature: rst_merge(tiles: Column): Column — Merge tiles into mosaic.

SQL:

-- Merge rasters from a table
WITH loaded_tiles AS (
SELECT
id,
gbx_rst_fromfile(path, 'GTiff') as tile
FROM raster_paths
)
SELECT gbx_rst_merge(collect_list(tile)) as merged_mosaic
FROM loaded_tiles;
Example output
+----------------------------------------------+
|merged_mosaic |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_ndvi

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Valid pixels match the heavyweight tier; pixels with a zero denominator are set to NoData (-9999) rather than left as non-finite values.

Signature: rst_ndvi(tile: Column, redBand: Column, nirBand: Column): Column — NDVI from band indices.

SQL:

-- Calculate NDVI for Sentinel-2 imagery
SELECT
path,
date,
gbx_rst_ndvi(tile, 4, 8) as ndvi_tile,
gbx_rst_avg(gbx_rst_ndvi(tile, 4, 8))[0] as mean_ndvi
FROM sentinel2_images;

-- Monthly vegetation trends
SELECT
date_trunc('month', date) as month,
AVG(gbx_rst_avg(gbx_rst_ndvi(tile, 4, 8))[0]) as avg_monthly_ndvi
FROM sentinel2_images
GROUP BY date_trunc('month', date)
ORDER BY month;
Example output
+----+----------+----------------------------------------------+---------+
|path|date |ndvi_tile |mean_ndvi|
+----+----------+----------------------------------------------+---------+
|... |2024-01-15|{null, <raster bytes>, {driver -> GTiff, ...}}|0.42 |
+----+----------+----------------------------------------------+---------+
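NDVI is (NIR - Red) / (NIR + Red) per pixel. The sketch below shows the zero-denominator handling described for the lightweight tier, using plain nested lists in place of NumPy arrays. The ndvi helper is hypothetical and works on already-extracted band grids rather than on tiles.

```python
def ndvi(red, nir, nodata=-9999.0):
    """Per-pixel (NIR - Red) / (NIR + Red) over two same-shaped 2-D grids."""
    out = []
    for red_row, nir_row in zip(red, nir):
        out_row = []
        for r, n in zip(red_row, nir_row):
            denom = n + r
            # Zero denominator: emit NoData instead of a non-finite value.
            out_row.append((n - r) / denom if denom != 0 else nodata)
        out.append(out_row)
    return out


# First pixel is vegetated; second has no signal in either band.
result = ndvi(red=[[1.0, 0.0]], nir=[[3.0, 0.0]])
print(result)
```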

rst_rastertoworldcoord

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_rastertoworldcoord(tile: Column, pixelX: Column, pixelY: Column): Column — Pixel to world coordinates as a struct with .x and .y fields.

SQL:

SELECT
path,
gbx_rst_rastertoworldcoord(tile, 100, 200) as coords,
gbx_rst_rastertoworldcoord(tile, 100, 200).x as longitude,
gbx_rst_rastertoworldcoord(tile, 100, 200).y as latitude
FROM rasters;
Example output
+----+-------------+---------+--------+
|path|coords |longitude|latitude|
+----+-------------+---------+--------+
|... |{-74.0, 40.5}|-74.0 |40.5 |
+----+-------------+---------+--------+
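Pixel-to-world conversion is an affine transform. The sketch below applies a GDAL-style six-element geotransform to a (column, row) index. It returns the top-left corner of the pixel, which is an assumption here, since the reference does not say whether RasterX reports corners or centres. pixel_to_world and the sample geotransform are hypothetical.

```python
def pixel_to_world(geotransform, pixel_x, pixel_y):
    """Apply a GDAL-style 6-element geotransform to a (column, row) pixel index."""
    origin_x, pixel_w, row_rot, origin_y, col_rot, pixel_h = geotransform
    world_x = origin_x + pixel_x * pixel_w + pixel_y * row_rot
    world_y = origin_y + pixel_x * col_rot + pixel_y * pixel_h
    return world_x, world_y


# Hypothetical north-up raster: origin (-75.0, 42.5), 0.01 degree pixels.
gt = (-75.0, 0.01, 0.0, 42.5, 0.0, -0.01)
x, y = pixel_to_world(gt, 100, 200)
print(x, y)
```

The pixel height is negative for north-up rasters because row indices increase downward while the Y coordinate decreases.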

rst_rastertoworldcoordx / rst_rastertoworldcoordy

Lightweight | Heavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_rastertoworldcoordx(tile: Column, pixelX: Column, pixelY: Column): Column, rst_rastertoworldcoordy(tile: Column, pixelX: Column, pixelY: Column): Column — World X / Y coordinate of a pixel.

SQL:

SELECT
path,
gbx_rst_rastertoworldcoordx(tile, 100, 200) as longitude,
gbx_rst_rastertoworldcoordy(tile, 100, 200) as latitude
FROM rasters;
Example output
+----+---------+--------+
|path|longitude|latitude|
+----+---------+--------+
|... |-74.0 |40.5 |
+----+---------+--------+

rst_transform

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio (rasterio.warp). Reprojection uses the GDAL build bundled with rasterio, whose projection database and driver set may be narrower than the heavyweight tier.

Signature: rst_transform(tile: Column, targetSrid: Column): Column — Reproject to target CRS. targetSrid must be a positive EPSG code; 0 or an unknown code is rejected with a clear error.

SQL:

-- Reproject to WGS84
SELECT
path,
gbx_rst_transform(tile, 4326) as wgs84_tile,
gbx_rst_srid(gbx_rst_transform(tile, 4326)) as new_srid
FROM rasters;

-- Reproject and clip
SELECT
path,
gbx_rst_clip(gbx_rst_transform(tile, 4326), boundary, true) as result
FROM rasters;
Example output
+----+----------------------------------------------+--------+
|path|wgs84_tile |new_srid|
+----+----------------------------------------------+--------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|4326 |
+----+----------------------------------------------+--------+

rst_tryopen

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_tryopen(tile: Column): Column — Validate raster can be opened.

SQL:

-- Filter valid rasters
SELECT * FROM rasters
WHERE gbx_rst_tryopen(tile) = true;

-- Identify corrupt rasters
SELECT path
FROM rasters
WHERE gbx_rst_tryopen(tile) = false;

-- Validation summary
SELECT
COUNT(*) as total,
SUM(CASE WHEN gbx_rst_tryopen(tile) THEN 1 ELSE 0 END) as valid,
SUM(CASE WHEN NOT gbx_rst_tryopen(tile) THEN 1 ELSE 0 END) as invalid
FROM rasters;
Example output
+-----+-----+-------+
|total|valid|invalid|
+-----+-----+-------+
|100 |98 |2 |
+-----+-----+-------+

rst_updatetype

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio. Output is re-encoded as GeoTIFF; a NoData value that is not representable in the target type is dropped.

Signature: rst_updatetype(tile: Column, newType: Column): Column — Convert raster data type.

SQL:

SELECT gbx_rst_updatetype(tile, 'Float32') as float_tile FROM rasters;
Example output
+----------------------------------------------+
|float_tile |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_resample

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio (rasterio.warp).

Resample a raster tile by a multiplicative factor via gdal.Warp -r, scaling pixel dimensions up or down relative to the source.

Signature: rst_resample(tile: Column, factor: Column, algorithm: Column): Column

Parameters: factor — multiplicative scale factor applied to both width and height (e.g. 2.0 doubles the pixel grid); algorithm — gdalwarp resampling method name (e.g. bilinear, near, cubic, cubicspline, lanczos, average)

SQL:

-- Upsample 2x with bilinear interpolation. Output dims = source dims * 2.
SELECT gbx_rst_resample(tile, 2.0, 'bilinear') AS upsampled FROM rasters;
Example output
+----------------------------------------------+
|upsampled |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_resample_to_res

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio (rasterio.warp).

Resample a raster tile to an explicit ground resolution in CRS units via gdal.Warp -tr.

Signature: rst_resample_to_res(tile: Column, xRes: Column, yRes: Column, algorithm: Column): Column

Parameters: xRes — target pixel width in CRS units (e.g. metres for a metric projection); yRes — target pixel height in CRS units; algorithm — gdalwarp resampling method name (e.g. average, bilinear, near)

SQL:

-- Downsample to a 100 m grid (metric CRS). 'average' weights cells by area.
SELECT gbx_rst_resample_to_res(tile, 100.0, 100.0, 'average') AS coarse
FROM rasters;
Example output
+----------------------------------------------+
|coarse |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_resample_to_size

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio (rasterio.warp).

Resample a raster tile to an explicit pixel grid size via gdal.Warp -ts.

Signature: rst_resample_to_size(tile: Column, widthPx: Column, heightPx: Column, algorithm: Column): Column

Parameters: widthPx — target output width in pixels; heightPx — target output height in pixels; algorithm — gdalwarp resampling method name (e.g. near for categorical rasters, bilinear for continuous)

SQL:

-- Force a 512 x 512 tile, nearest-neighbour for categorical rasters.
SELECT gbx_rst_resample_to_size(tile, 512, 512, 'near') AS sized FROM rasters;
Example output
+----------------------------------------------+
|sized |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_worldtorastercoord

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_worldtorastercoord(tile: Column, worldX: Column, worldY: Column): Column — World to pixel coordinates as a struct with .x and .y fields.

rst_worldtorastercoord — full struct (pixel with .x and .y):

-- Find pixel coordinates for a specific location
SELECT
path,
gbx_rst_worldtorastercoord(tile, -122.4194, 37.7749) as pixel,
gbx_rst_worldtorastercoord(tile, -122.4194, 37.7749).x as col,
gbx_rst_worldtorastercoord(tile, -122.4194, 37.7749).y as row
FROM rasters;
Example output
+----+-----+---+---+
|path|pixel|col|row|
+----+-----+---+---+
|... |... |100|200|
+----+-----+---+---+

rst_worldtorastercoord — multiple points (e.g. from a locations table):

-- Sample raster at multiple points
WITH locations AS (
SELECT -122.4194 as lon, 37.7749 as lat UNION ALL
SELECT -122.4183, 37.7745
)
SELECT
l.lat,
l.lon,
gbx_rst_worldtorastercoord(r.tile, l.lon, l.lat) as pixel
FROM rasters r, locations l;
Example output
+-------+---------+-----+
|lat |lon |pixel|
+-------+---------+-----+
|37.7749|-122.4194|... |
|37.7745|-122.4183|... |
+-------+---------+-----+
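
For a north-up raster, world-to-pixel conversion is the inverse of the geotransform. The sketch below shows that arithmetic under stated assumptions: the geotransform is hypothetical, rotation terms are zero, and the result is floored to the containing pixel. It is not the RasterX implementation.

```python
import math

def world_to_pixel(gt, wx, wy):
    """Invert a north-up geotransform (no rotation terms) to a pixel index."""
    if gt[2] != 0 or gt[4] != 0:
        raise ValueError("rotated geotransforms need a full affine inverse")
    col = math.floor((wx - gt[0]) / gt[1])
    row = math.floor((wy - gt[3]) / gt[5])
    return col, row

# Hypothetical raster: origin (-123.0, 38.0), 0.001-degree pixels.
gt = (-123.0, 0.001, 0.0, 38.0, 0.0, -0.001)
col, row = world_to_pixel(gt, -122.4194, 37.7749)
print(col, row)
```

A negative or out-of-range index means the point falls outside the tile, which is worth filtering when cross-joining a locations table against many rasters.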

rst_worldtorastercoordx / rst_worldtorastercoordy

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_worldtorastercoordx(tile: Column, worldX: Column, worldY: Column): Column, rst_worldtorastercoordy(tile: Column, worldX: Column, worldY: Column): Column — Pixel column / row for a world coordinate.

rst_worldtorastercoordx — pixel column only:

SELECT
gbx_rst_worldtorastercoordx(tile, -122.4194, 37.7749) as pixel_col
FROM rasters;
Example output
+---------+
|pixel_col|
+---------+
|100 |
+---------+

rst_worldtorastercoordy — pixel row only:

SELECT
gbx_rst_worldtorastercoordy(tile, -122.4194, 37.7749) as pixel_row
FROM rasters;
Example output
+---------+
|pixel_row|
+---------+
|200 |
+---------+

Web-Mercator Tile Output

Reproject rasters to EPSG:3857 (Web Mercator) and emit slippy-map XYZ tiles. Pair with gbx_pmtiles_agg or the PMTiles writer to publish a raster pyramid as a single .pmtiles archive.

rst_to_webmercator

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio (rasterio.warp). Uses rasterio's bundled GDAL build, whose projection and driver coverage may be narrower than the heavyweight tier.

Signature: rst_to_webmercator(tile: Column): Column — Reproject a raster to EPSG:3857 (Web Mercator) using bilinear resampling by default. The returned tile carries srid = 3857.

SQL:

-- Reproject to web mercator before slippy-map tiling (default bilinear resampling).
SELECT
path,
gbx_rst_to_webmercator(tile) as web_tile,
gbx_rst_srid(gbx_rst_to_webmercator(tile)) as new_srid
FROM rasters;
Example output
+----+----------------------------------------------+--------+
|path|web_tile |new_srid|
+----+----------------------------------------------+--------+
|... |{null, <raster bytes>, {driver -> GTiff, ...}}|3857 |
+----+----------------------------------------------+--------+
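
To see what coordinates in an EPSG:3857 tile mean, the sketch below applies the spherical Web Mercator forward projection to a single lon/lat point. It shows only the point formula. The function name is hypothetical, and the raster warp itself (resampling, extent computation) is not modelled here.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857

def lonlat_to_webmercator(lon_deg, lat_deg):
    """Project a WGS84 lon/lat pair to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

x_edge, y_equator = lonlat_to_webmercator(180.0, 0.0)
print(x_edge, y_equator)
```

The projection is undefined at the poles, which is why Web Mercator tiles stop near 85.05 degrees of latitude.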

rst_tilexyz

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rio-tiler + morecantile. Out-of-extent tiles return a transparent PNG (never null); available output formats depend on rasterio's bundled GDAL build.

Signature: rst_tilexyz(tile: Column, z: Column, x: Column, y: Column, format: Column, tileSize: Column, resampling: Column): Column — Render a single web-mercator XYZ tile from a raster as encoded image bytes (e.g. PNG, JPEG, WebP) at the given tile coordinates and pixel size.

SQL:

-- Render tile (z=10, x=512, y=512) as 256x256 PNG bytes.
SELECT
path,
gbx_rst_tilexyz(tile, 10, 512, 512, 'PNG', 256, 'bilinear') as tile_png
FROM rasters;
Example output
+----+--------+
|path|tile_png|
+----+--------+
|... |[BINARY]|
+----+--------+
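
The z, x, y arguments follow the slippy-map XYZ scheme. The sketch below uses the standard tiling formula to find which tile contains a lon/lat point, which helps when picking coordinates to request. The helper name is hypothetical and the code is independent of rio-tiler and morecantile.

```python
import math

def lonlat_to_tile(lon_deg, lat_deg, z):
    """Return the (x, y) XYZ tile index containing a lon/lat point at zoom z."""
    n = 2 ** z
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp so lon = 180 or the latitude limits stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

# The tile requested in the SQL example (z=10, x=512, y=512) has its
# top-left corner at lon 0, lat 0.
print(lonlat_to_tile(0.0, 0.0, 10))
```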

rst_xyzpyramid

LightweightHeavyweight Streaming UDTF
Lightweight tier (pyrx)

Powered by rio-tiler + morecantile. Streams one XYZ tile row per intersecting tile via a streaming UDTF. Bounded by max_z <= 20 and at most 1,000,000 candidate tiles.

Signature: rst_xyzpyramid(tile: Column, minZoom: Column, maxZoom: Column): Column — Generator: explode a raster into one row per intersecting (z, x, y) tile across a zoom range, producing PNG bytes per tile. Use LATERAL VIEW to materialize the rows; the output struct exposes z, x, y, and bytes.

SQL:

-- Explode a raster into per-tile rows across zoom levels 4..6 (PNG, 256px).
SELECT
path,
t.tile.z as z,
t.tile.x as x,
t.tile.y as y,
t.tile.bytes as png_bytes
FROM rasters
LATERAL VIEW gbx_rst_xyzpyramid(tile, 4, 6) AS t;
Example output
+----+-+-+-+---------+
|path|z|x|y|png_bytes|
+----+-+-+-+---------+
|... |4|5|6|[BINARY] |
+----+-+-+-+---------+
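
Because the pyramid is capped at 1,000,000 candidate tiles, it can help to estimate the fan-out before choosing a zoom range. The sketch below counts the XYZ tiles that a lon/lat bounding box touches across a zoom range. The helper names are hypothetical, and the way pyrx itself counts candidates is an assumption.

```python
import math

MAX_CANDIDATE_TILES = 1_000_000  # limit quoted in the lightweight-tier note

def tile_index(lon_deg, lat_deg, z):
    """Standard slippy-map tile index for a lon/lat point."""
    n = 2 ** z
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat_deg))) / math.pi) / 2.0 * n)
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

def count_candidate_tiles(west, south, east, north, min_z, max_z):
    """Count tiles touched by a bounding box over zoom levels min_z..max_z."""
    total = 0
    for z in range(min_z, max_z + 1):
        x0, y0 = tile_index(west, north, z)  # top-left tile
        x1, y1 = tile_index(east, south, z)  # bottom-right tile
        total += (x1 - x0 + 1) * (y1 - y0 + 1)
    return total

world = count_candidate_tiles(-180.0, -85.0511, 180.0, 85.0511, 0, 2)
print(world, world <= MAX_CANDIDATE_TILES)
```

The tile count roughly quadruples per zoom level, so each extra level of maxZoom is the main driver of output size.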

Vector↔raster bridge

Move data between the raster (tile) and vector (geom) worlds.

rst_rasterize

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio (rasterio.features).

Signature: rst_rasterize(geom: Column, burnValue: Column, xMin: Column, yMin: Column, xMax: Column, yMax: Column, width: Column, height: Column, srid: Column): Column — Burn a polygon (WKB) into a fresh GeoTIFF tile at the given extent and pixel dimensions. Pixels inside the polygon carry burnValue; pixels outside are NoData.

SQL:

-- WKB hex below is POLYGON((0 0, 10 0, 10 10, 0 10, 0 0)). The output `tile`
-- is a GTiff-backed raster at the given extent and resolution; pixels inside
-- the polygon carry the burn value (42.0), pixels outside are NoData.
SELECT gbx_rst_rasterize(
unhex('010300000001000000050000000000000000000000000000000000000000000000000024400000000000000000000000000000244000000000000024400000000000000000000000000000244000000000000000000000000000000000'),
42.0, 0.0, 0.0, 10.0, 10.0, 100, 100, 4326
) AS tile;
Example output
+----------------------------------------------+
|tile |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+
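
The geom argument is a WKB blob. As an illustration, the sketch below packs the polygon named in the SQL comment into little-endian WKB with the standard library and derives the pixel size implied by the extent and dimensions in the example. The helper name is hypothetical. In practice the WKB would come from a geometry column or a function such as ST_AsBinary.

```python
import struct

def polygon_to_wkb(ring):
    """Encode a single-ring polygon as little-endian WKB bytes."""
    # Byte order flag (1 = little-endian), geometry type 3 (Polygon), 1 ring.
    header = struct.pack("<BII", 1, 3, 1)
    body = struct.pack("<I", len(ring))  # number of points in the ring
    for x, y in ring:
        body += struct.pack("<dd", float(x), float(y))
    return header + body

ring = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
wkb_hex = polygon_to_wkb(ring).hex()

# Extent 0..10 burned into a 100 x 100 grid gives 0.1-unit pixels.
pixel_size = ((10.0 - 0.0) / 100, (10.0 - 0.0) / 100)
print(wkb_hex[:26], pixel_size)
```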

rst_polygonize

LightweightHeavyweight Streaming UDTF
Lightweight tier (pyrx)

Powered by rasterio (rasterio.features). In pyrx, gbx_rst_polygonize is a streaming Python UDTF — invoke it as a SQL LATERAL table function to stream polygon rows without buffering (avoids OOM on rasters with unbounded polygon fan-out):

SELECT t.geom_wkb, t.value
FROM <table>, LATERAL gbx_rst_polygonize(tile, band, connectedness) t

Signature (heavyweight): rst_polygonize(tile: Column, band: Column, connectedness: Column): Column — Trace contiguous-value regions of a tile into an array of features. Each feature carries the source pixel value as the value field.

SQL:

-- Round-trip: rasterize a polygon then immediately polygonize it. The output
-- array contains one feature per contiguous value region; each feature carries
-- the burn value as the `value` field.
SELECT gbx_rst_polygonize(
gbx_rst_rasterize(
unhex('010300000001000000050000000000000000000000000000000000000000000000000024400000000000000000000000000000244000000000000024400000000000000000000000000000244000000000000000000000000000000000'),
42.0, 0.0, 0.0, 10.0, 10.0, 100, 100, 4326
)
) AS features;
Example output
+------------------+
|features |
+------------------+
|[{[BINARY], 42.0}]|
+------------------+

Terrain Analysis

Thin wrappers around gdal.DEMProcessing for digital elevation model (DEM) derivatives. Each function takes a single-band DEM tile and returns a derived tile of the same footprint.

rst_slope

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by NumPy — a reimplementation of GDAL gdaldem. Results are close but not bit-identical: edge pixels are filled by replicating the border (gdaldem leaves them NoData) and NoData cells are not excluded from the 3×3 window.

Signature: rst_slope(tile: Column, unit: Column, scale: Column): Column — Compute slope per pixel. unit is 'degrees' or 'percent'; scale is the elevation/horizontal unit ratio (use 111120 for unprojected lon/lat in degrees).

SQL:

-- Slope in degrees per pixel. Use unit='percent' for rise/run. For an unprojected
-- geographic CRS (lon/lat in degrees) with elevations in metres, pass scale 111120.
SELECT gbx_rst_slope(tile, 'degrees', 1.0) AS slope FROM rasters;
Example output
+----------------------------------------------+
|slope |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_aspect

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by NumPy — a reimplementation of GDAL gdaldem; results are close but not bit-identical to the heavyweight tier (edge pixels are filled and NoData is not excluded from the 3×3 window).

Signature: rst_aspect(tile: Column, trigonometric: Column, zeroForFlat: Column): Column — Compass direction of steepest descent in degrees (0=N, 90=E, 180=S, 270=W). Flat areas return -9999 unless zeroForFlat = true. Set trigonometric = true for mathematical convention (0=E, counter-clockwise).

SQL:

-- Aspect in compass degrees (0=N, 90=E, 180=S, 270=W). Flat areas get -9999
-- unless zeroForFlat = true.
SELECT gbx_rst_aspect(tile, false, false) AS aspect FROM rasters;
Example output
+----------------------------------------------+
|aspect |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_hillshade

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by NumPy — a reimplementation of GDAL gdaldem; results are close but not bit-identical to the heavyweight tier (edge pixels are filled and NoData is not excluded from the 3×3 window).

Signature: rst_hillshade(tile: Column, azimuth: Column, altitude: Column, zFactor: Column): Column — 8-bit (0..255) shaded relief image. Common values: NW sun azimuth 315.0, altitude 45.0, zFactor = 1.0.

SQL:

-- 8-bit (0..255) hillshade: NW sun, 45-deg altitude, default z-factor.
SELECT gbx_rst_hillshade(tile, 315.0, 45.0, 1.0) AS hillshade FROM rasters;
Example output
+----------------------------------------------+
|hillshade |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_tri

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by NumPy — a reimplementation of GDAL gdaldem; results are close but not bit-identical to the heavyweight tier (edge pixels are filled and NoData is not excluded from the 3×3 window).

Signature: rst_tri(tile: Column): Column — Terrain Ruggedness Index — mean absolute difference between a pixel and its 8 neighbours. Useful for landscape-ecology habitat scoring.

SQL:

-- TRI: mean absolute neighbour difference; useful for landscape ecology.
SELECT gbx_rst_tri(tile) AS tri FROM rasters;
Example output
+----------------------------------------------+
|tri |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_tpi

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by NumPy — a reimplementation of GDAL gdaldem; results are close but not bit-identical to the heavyweight tier (edge pixels are filled and NoData is not excluded from the 3×3 window).

Signature: rst_tpi(tile: Column): Column — Topographic Position Index — pixel value minus the mean of its 8 neighbours. Positive values are ridges, negative values are valleys.

SQL:

-- TPI: difference from neighbour-mean; +ve = ridge, -ve = valley.
SELECT gbx_rst_tpi(tile) AS tpi FROM rasters;
Example output
+----------------------------------------------+
|tpi |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_roughness

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by NumPy — a reimplementation of GDAL gdaldem; results are close but not bit-identical to the heavyweight tier (edge pixels are filled and NoData is not excluded from the 3×3 window).

Signature: rst_roughness(tile: Column): Column — Largest absolute difference between a pixel and any of its 8 neighbours in a 3×3 window.

SQL:

-- Roughness: max absolute neighbour difference in a 3x3 window.
SELECT gbx_rst_roughness(tile) AS roughness FROM rasters;
Example output
+----------------------------------------------+
|roughness |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_color_relief

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by NumPy — a reimplementation of GDAL gdaldem color-relief; the gdaldem default color keyword is not supported and boundary interpolation may differ slightly from the heavyweight tier.

Signature: rst_color_relief(tile: Column, colorTablePath: Column): Column — Apply a gdaldem color table (elevation R G B [A] per line) to produce an RGBA visualization tile. Special values nv, default, 0%, and 100% are honored.

SQL:

-- Map elevation values to RGBA colors via a gdaldem color table.
SELECT gbx_rst_color_relief(tile, '{SAMPLE_DATA_BASE}/colortables/elevation.clr') AS rgba
FROM rasters;
Example output
+----------------------------------------------+
|rgba |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

Spectral Indices

Multi-band satellite math built on gbx_rst_mapalgebra. Band arguments are 1-based GDAL band indices; the output is always a single-band Float32 GeoTIFF tile. gbx_rst_ndvi is documented under Operations.

rst_evi

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Valid pixels match the heavyweight tier; pixels with a zero denominator are set to NoData (-9999) rather than left as non-finite values.

Signature: rst_evi(tile: Column, redBand: Column, nirBand: Column, blueBand: Column): Column — Enhanced Vegetation Index. Formula: G * (NIR - Red) / (NIR + C1*Red - C2*Blue + L) with MODIS canonical coefficients G=2.5, L=1.0, C1=6.0, C2=7.5.

SQL:

-- EVI = G * (NIR - Red) / (NIR + C1*Red - C2*Blue + L). Defaults follow the
-- MODIS canonical coefficients: L=1.0, C1=6.0, C2=7.5, G=2.5.
SELECT gbx_rst_evi(tile, 1, 2, 3) AS evi FROM rasters;
Example output
+----------------------------------------------+
|evi |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+
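
The EVI formula can be checked by hand on a single pixel. The sketch below uses the coefficients listed in the signature and applies the lightweight tier's zero-denominator rule. The function name and the reflectance values are illustrative assumptions.

```python
def evi(nir, red, blue, g=2.5, c1=6.0, c2=7.5, l=1.0, nodata=-9999.0):
    """Enhanced Vegetation Index for one pixel of surface reflectance."""
    denom = nir + c1 * red - c2 * blue + l
    if denom == 0:
        return nodata  # avoid a non-finite result
    return g * (nir - red) / denom

# Hypothetical vegetated pixel: denominator = 0.5 + 0.6 - 0.375 + 1.0 = 1.725.
value = evi(nir=0.5, red=0.1, blue=0.05)
print(value)
```

The coefficients assume reflectance scaled to 0..1. Integer digital numbers would need rescaling first because the constant L is not scale-invariant.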

rst_savi

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Valid pixels match the heavyweight tier; pixels with a zero denominator are set to NoData (-9999) rather than left as non-finite values.

Signature: rst_savi(tile: Column, redBand: Column, nirBand: Column, l: Column): Column — Soil-Adjusted Vegetation Index. Formula: (NIR - Red) / (NIR + Red + L) * (1 + L). L = 0.5 (the canonical default) is a balanced soil/vegetation tradeoff; L = 0 reduces SAVI to NDVI.

SQL:

-- SAVI = (NIR - Red) / (NIR + Red + L) * (1 + L). L=0.5 (default) is a
-- balanced soil-vegetation tradeoff; L=0 reduces to NDVI.
SELECT gbx_rst_savi(tile, 1, 2, 0.5) AS savi FROM rasters;
Example output
+----------------------------------------------+
|savi |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+
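
The claim that L = 0 reduces SAVI to NDVI is easy to verify numerically. The sketch below evaluates the formula for one pixel. The function name and the sample reflectances are illustrative assumptions.

```python
def savi(nir, red, l=0.5, nodata=-9999.0):
    """Soil-Adjusted Vegetation Index for one pixel."""
    denom = nir + red + l
    if denom == 0:
        return nodata  # zero denominator maps to NoData
    return (nir - red) / denom * (1.0 + l)

nir, red = 0.5, 0.1
ndvi = (nir - red) / (nir + red)
print(savi(nir, red), savi(nir, red, l=0.0), ndvi)
```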

rst_ndwi

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Valid pixels match the heavyweight tier; pixels with a zero denominator are set to NoData (-9999) rather than left as non-finite values.

Signature: rst_ndwi(tile: Column, greenBand: Column, nirBand: Column): Column — Normalized Difference Water Index (McFeeters 1996). Formula: (Green - NIR) / (Green + NIR). Positive values typically indicate open water.

SQL:

-- NDWI (McFeeters 1996) = (Green - NIR) / (Green + NIR). Positive values
-- typically indicate open water.
SELECT gbx_rst_ndwi(tile, 1, 2) AS ndwi FROM rasters;
Example output
+----------------------------------------------+
|ndwi |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_nbr

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Valid pixels match the heavyweight tier; pixels with a zero denominator are set to NoData (-9999) rather than left as non-finite values.

Signature: rst_nbr(tile: Column, nirBand: Column, swirBand: Column): Column — Normalized Burn Ratio. Formula: (NIR - SWIR) / (NIR + SWIR). The pre-/post-fire difference (dNBR) is the canonical burn-severity index.

SQL:

-- NBR = (NIR - SWIR) / (NIR + SWIR). Difference of pre-fire and post-fire
-- NBR (dNBR) is the canonical burn-severity index.
SELECT gbx_rst_nbr(tile, 2, 3) AS nbr FROM rasters;
Example output
+----------------------------------------------+
|nbr |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_index

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumExpr. Generic named-index dispatcher over a band_map; single-band Float32. Zero-denominator pixels are set to NoData (-9999).

Signature: rst_index(tile: Column, indexName: Column, bandMap: Column): Column — Generic dispatcher that picks a built-in formula by name and wires bands via a MAP<STRING, INT> (e.g. map('red', 1, 'nir', 2)). Built-in names: ndvi, gndvi, msavi, ndvi_re, ndmi, ndsi.

SQL:

-- Generic dispatcher - pick a built-in formula by name and wire bands by a
-- MAP<STRING, INT>. Built-ins: ndvi, gndvi, msavi, ndvi_re, ndmi, ndsi.
SELECT gbx_rst_index(tile, 'ndvi', map('red', 1, 'nir', 2)) AS ndvi
FROM rasters;
Example output
+----------------------------------------------+
|ndvi |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

Pixel ops + extraction

Per-pixel transformations and band-level extraction.

rst_band

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio.

Signature: rst_band(tile: Column, bandIndex: Column): Column — Extract a single band from a multi-band raster as a new single-band tile (gdal.Translate -b N). 1-based band index.

-- Pull band 1 (1-based) as a fresh single-band tile.
SELECT gbx_rst_band(tile, 1) AS b1 FROM rasters;
Example output
+----------------------------------------------+
|b1 |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_buildoverviews

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio. Builds internal GeoTIFF overviews.

Signature: rst_buildoverviews(tile: Column, levels: Column, [resampling: Column = lit("average")]): Column — Add pyramid overview levels to a tile via ds.BuildOverviews. levels is an ARRAY<INT> (e.g. array(2, 4, 8, 16)); resampling is one of nearest, average, gauss, cubic, cubicspline, lanczos, bilinear, mode.

-- Add 2x / 4x overviews to the tile via the 'average' resampling.
SELECT gbx_rst_buildoverviews(tile, array(2, 4), 'average') AS withovr
FROM rasters;
Example output
+----------------------------------------------+
|withovr |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_fillnodata

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio (rasterio.fill).

Signature: rst_fillnodata(tile: Column, [maxSearchDist: Column = lit(100), smoothingIter: Column = lit(0)]): Column — Fill NoData pixels via gdal.FillNodata using inverse-distance interpolation from neighbors within maxSearchDist pixels. smoothingIter applies an optional post-fill 3×3 smoothing pass.

-- Fill NoData holes searching up to 100 pixels in each direction.
SELECT gbx_rst_fillnodata(tile, 100.0, 0) AS filled FROM rasters;
Example output
+----------------------------------------------+
|filled |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_histogram

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio + NumPy. Per-band bucket counts via numpy.histogram; map keys are band_<i> (1-based).

Signature: rst_histogram(tile: Column, [bands: Column = null, nBuckets: Column = lit(256), min: Column = null, max: Column = null, includeNodata: Column = lit(false)]): Column — Compute per-band histograms via band.GetHistogram. Returns MAP<STRING, ARRAY<LONG>> keyed by "band_<n>" with bucket counts. If bands is null, all bands are processed; if min / max are null, GDAL auto-detects the range.

-- 16 equal-width buckets over [0, 1000]; one entry per band keyed band_<i>.
SELECT gbx_rst_histogram(tile, 16, cast(0 as double), cast(1000 as double), false) AS hist
FROM rasters;
Example output
+------------------------------------+
|hist |
+------------------------------------+
|{band_1 -> [120, 340, 510, 88, ...]}|
+------------------------------------+
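
To make the bucket semantics concrete, the sketch below counts values into equal-width buckets over a fixed range, mirroring the numpy.histogram convention where the last bucket includes its right edge. The helper name is hypothetical, and NoData filtering is omitted.

```python
def equal_width_histogram(values, n_buckets, lo, hi):
    """Count values into n_buckets equal-width buckets over [lo, hi]."""
    width = (hi - lo) / n_buckets
    counts = [0] * n_buckets
    for v in values:
        if v < lo or v > hi:
            continue  # out-of-range values are not counted
        # The last bucket is closed on the right, as in numpy.histogram.
        idx = min(int((v - lo) / width), n_buckets - 1)
        counts[idx] += 1
    return counts

# 16 buckets over [0, 1000] means each bucket spans 62.5 units.
counts = equal_width_histogram([0.0, 62.4, 62.5, 999.0, 1000.0, 1500.0], 16, 0.0, 1000.0)
print(counts)
```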

rst_sample

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio. Point geometries only; the point is assumed to be in the raster's CRS.

Signature: rst_sample(tile: Column, geom: Column): Column — Sample the raster at the geometry's location(s). For a POINT, returns ARRAY<DOUBLE> of one value per band at the nearest pixel. Geometry is interpreted in EPSG:4326 lon/lat unless its EWKB carries a different SRID.

-- Sample at a known lon/lat (point must be in the raster's CRS).
SELECT gbx_rst_sample(tile, 'POINT(-0.13 51.5)') AS values FROM rasters;
Example output
+-------------------+
|values |
+-------------------+
|[12.5, 88.0, 240.0]|
+-------------------+

rst_setsrid

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rasterio. Stamps the CRS without reprojecting.

Signature: rst_setsrid(tile: Column, srid: Column): Column — Stamp an EPSG code onto a raster that lacks (or has a wrong) spatial reference. Does NOT reproject — only sets ds.SetProjection(...). Use rst_transform when you need an actual reprojection.

-- Tag the tile as EPSG:4326 without warping pixels.
-- Use rst_transform if you actually need a reprojection.
SELECT gbx_rst_setsrid(tile, 4326) AS tagged FROM rasters;
Example output
+----------------------------------------------+
|tagged |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_threshold

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by NumPy. This tier keeps each passing pixel's original value and sets failing pixels to NoData; the heavyweight tier instead returns a 0/1 binary mask (Float32). Choose the tier that matches the output you need.

Signature: rst_threshold(tile: Column, op: Column, value: Column): Column — Binarize the raster: pixels matching op value get 1, others get 0. op is one of >, >=, <, <=, ==, !=. Output is a Byte raster (0/1) sized to the input extent. Implemented as a gbx_rst_mapalgebra template.

-- Mark all pixels above 100 m as 1, others as 0.
SELECT gbx_rst_threshold(tile, '>', 100.0) AS mask FROM rasters;
Example output
+----------------------------------------------+
|mask |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

Analysis

Higher-level analytical transforms wrapping single GDAL primitives — COG layout publishing, proximity surfaces, contour extraction, and viewshed analysis.

rst_cog_convert

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by rio-cogeo. Re-encodes the tile as a Cloud Optimized GeoTIFF (validated with cog_validate); compression maps to a rio-cogeo profile (deflate, lzw, zstd, lerc, jpeg, webp, none). The tile's metadata.driver is GTiff (a COG is a valid GeoTIFF).

Signature: rst_cog_convert(tile: Column, [compression: Column = lit("DEFLATE"), blocksize: Column = lit(512), overviewResampling: Column = lit("AVERAGE")]): Column — Re-layout a raster tile as a Cloud Optimized GeoTIFF via gdal.Translate -of COG. compression is one of NONE, DEFLATE, LZW, ZSTD, LERC, JPEG, WEBP. blocksize is the internal tile size in pixels (square). overviewResampling is the algorithm for the auto-generated overview pyramid. Output is a GTiff-on-disk variant suitable for HTTP range serving.

-- Convert to COG with DEFLATE compression, 512-pixel blocks, AVERAGE overviews.
SELECT gbx_rst_cog_convert(tile, 'DEFLATE', 512, 'AVERAGE') AS cog
FROM rasters;
Example output
+----------------------------------------------+
|cog |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_proximity

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by SciPy (scipy.ndimage.distance_transform_edt). Distance to the nearest source pixel (target_values, or any non-zero pixel by default), in GEO (CRS units) or PIXEL units; pixels beyond max_distance are set to NoData (-1.0). Single-band Float32.

Signature: rst_proximity(tile: Column, [targetValues: Column = null, distUnits: Column = lit("GEO"), maxDistance: Column = null]): Column — Compute a Float32 raster where each pixel holds the distance to the nearest source pixel via gdal.ComputeProximity. targetValues is a comma-separated list of source-pixel values (e.g. "1,2,3"); null means any non-NoData pixel is a target. distUnits is "GEO" (CRS ground units, default) or "PIXEL". maxDistance caps the output; pixels beyond it get the NoData sentinel -1.0.

-- Distance in pixels to any non-NoData pixel; cap distances at 100 pixels.
SELECT gbx_rst_proximity(tile, '', 'PIXEL', cast(100.0 as double)) AS dist
FROM rasters;
Example output
+----------------------------------------------+
|dist |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+

rst_contour

LightweightHeavyweight
Lightweight tier (pyrx)

Powered by scikit-image (measure.find_contours). Returns ARRAY<struct(geom_wkb, value)> of contour LineStrings (in the raster CRS) at each fixed level, or at base + k*interval across the data range when levels is empty; NoData is masked before tracing. The marching-squares line geometry differs slightly from the heavyweight GDAL contours.

Signature: rst_contour(tile: Column, levels: Column, [interval: Column = lit(0.0), base: Column = lit(0.0), attrField: Column = lit("elev")]): Column — Generate contour LineString features via gdal.ContourGenerateEx. Pass a non-empty levels ARRAY<DOUBLE> for fixed contour values, or pass array() and set interval (>0) for equal-step contours at base + n*interval. Returns ARRAY<struct(geom_wkb BINARY, value DOUBLE)> — one entry per contour line in the source raster's CRS.

-- Equal-interval contours every 10 m. To use fixed contour values instead, pass a non-empty array of levels.
SELECT gbx_rst_contour(tile, array(), 10.0, 0.0, 'elev') AS contours
FROM rasters;
Example output
+--------------------------------------+
|contours |
+--------------------------------------+
|[{[BINARY], 100.0}, {[BINARY], 200.0}]|
+--------------------------------------+
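
As a rough illustration of how equal-step levels of the form base + k*interval could be enumerated across a data range, consider the sketch below. The helper `contour_levels` is hypothetical and is not part of RasterX, scikit-image, or GDAL; treating the range end-points as inclusive is an assumption of the sketch.

```python
import math

def contour_levels(data_min, data_max, interval, base=0.0, levels=()):
    """Return fixed levels if given, else base + k*interval within the range."""
    if levels:
        # Fixed levels take priority over interval/base.
        return list(levels)
    if interval <= 0:
        raise ValueError("interval must be > 0 when no fixed levels are given")
    # Smallest and largest integer k with data_min <= base + k*interval <= data_max.
    k_start = math.ceil((data_min - base) / interval)
    k_end = math.floor((data_max - base) / interval)
    return [base + k * interval for k in range(k_start, k_end + 1)]

stepped = contour_levels(95.0, 132.0, interval=10.0)
fixed = contour_levels(95.0, 132.0, interval=0.0, levels=[100.0, 125.0])
```

For a raster whose values span 95 to 132, the first call yields levels 100, 110, 120, and 130, while the second ignores the interval and returns the fixed levels unchanged.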

rst_viewshed

Lightweight tier (pyrx)

Powered by xarray-spatial (xrspatial.viewshed). Returns a Byte tile (255 visible / 0 invisible) from a DEM tile and a POINT observer. The CPU line-of-sight scan differs from the heavyweight GDAL sweep (no earth-curvature correction), but the binary visible/invisible classification matches. Pulls in numba, the heaviest pyrx dependency.

Signature: rst_viewshed(tile: Column, observerGeom: Column, observerHeight: Column, [targetHeight: Column = lit(1.6), maxDistance: Column = null]): Column — Compute a binary viewshed Byte raster (255 = visible, 0 = invisible / out-of-range) from a DEM tile and an observer POINT via gdal.ViewshedGenerate. observerGeom is WKB / WKT POINT in the raster's CRS; non-POINT geometries raise an error at execution time. Heights are above the DEM at each pixel. maxDistance clips the search radius; null = unlimited.

-- Visibility from observer at (-73.5, 40.5), eye 100 m, target 1.6 m, cap 5000 m.
SELECT gbx_rst_viewshed(tile, 'POINT(-73.5 40.5)', 100.0, 1.6, 5000.0) AS vs
FROM rasters;
Example output
+----------------------------------------------+
|vs |
+----------------------------------------------+
|{null, <raster bytes>, {driver -> GTiff, ...}}|
+----------------------------------------------+
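
To make the roles of observerHeight, targetHeight, and maxDistance concrete, here is a simplified line-of-sight check along a single 1-D elevation profile. It is a teaching sketch only and is neither the xrspatial nor the GDAL algorithm: the function `visible_along_profile` is hypothetical, the observer is assumed to stand on the first sample, and earth curvature is ignored.

```python
def visible_along_profile(elev, observer_height, target_height=1.6,
                          max_distance=None, spacing=1.0):
    """Classify each profile sample as 255 (visible) or 0 (invisible / out of range)."""
    eye = elev[0] + observer_height  # eye level above the DEM at the observer
    out = [255]                      # the observer's own cell counts as visible
    for i in range(1, len(elev)):
        if max_distance is not None and i * spacing > max_distance:
            out.append(0)            # beyond the search radius
            continue
        target = elev[i] + target_height  # target top above the DEM at sample i
        # The straight sight line from eye to target must clear every
        # terrain sample that lies between the two.
        clear = all(
            elev[j] <= eye + (target - eye) * (j / i)
            for j in range(1, i)
        )
        out.append(255 if clear else 0)
    return out

profile = [100.0, 100.0, 120.0, 100.0, 100.0]
unlimited = visible_along_profile(profile, observer_height=2.0)
capped = visible_along_profile(profile, observer_height=2.0, max_distance=1.0)
```

In this profile the 120 m ridge is itself visible but hides the two samples behind it; with the radius capped at one sample spacing, everything past the first neighbor is classified as 0.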

Heavyweight-tier specifics

Every RasterX function runs in both execution tiers, so the per-function reference above applies regardless of tier. This section documents one heavyweight-tier execution detail: the GDAL VRT Python pixel-function configuration used by gbx_rst_combineavg / gbx_rst_derivedband (and their _agg forms).

VRT Python pixel functions

gbx_rst_combineavg, gbx_rst_combineavg_agg, gbx_rst_derivedband, and gbx_rst_derivedband_agg evaluate a Python expression on each pixel via GDAL's VRT Python pixel-function API. That API is gated behind the GDAL config option GDAL_VRT_ENABLE_PYTHON, which GeoBrix sets to NO at executor startup (see Security - Restrict GDAL drivers). When you call one of the four functions above, GeoBrix flips the option to YES for the duration of that call only — via the internal GDALManager.withVrtPython bracket — and restores NO immediately on return. You don't need to set anything on the cluster or in your notebook to use the built-in functions.

When you need to enable it yourself

If you're invoking the GDAL Python bindings (from osgeo import gdal) directly — outside the built-in RasterX functions — and you read a VRT that declares a <PixelFunctionLanguage>Python</...> band, you'll get an empty/null read unless you enable the option in the same process. Pick one of:

Python (programmatic, scoped to your read). Recommended in all cases. This mirrors what GeoBrix does internally and works both for driver-side pyspark.sql calls and inside mapPartitions / mapInPandas UDFs that load a VRT with a Python pixel function via osgeo.gdal. It also survives interleaving with GeoBrix built-in calls: each GeoBrix call resets the option to NO on exit, so set it again before every read:

from osgeo import gdal

gdal.SetConfigOption("GDAL_VRT_ENABLE_PYTHON", "YES")
try:
    ds = gdal.Open("/path/to/your/vrt-with-pixel-function.vrt")
    arr = ds.GetRasterBand(1).ReadAsArray()
    ds = None
finally:
    gdal.SetConfigOption("GDAL_VRT_ENABLE_PYTHON", "NO")

Cluster env var (Python worker processes only). Setting spark.executorEnv.GDAL_VRT_ENABLE_PYTHON YES on the cluster works for Python UDF workers, which run in a separate process from the JVM and initialize GDAL from environment variables. It does not help JVM-side reads: GeoBrix calls gdal.SetConfigOption("GDAL_VRT_ENABLE_PYTHON", "NO") at executor JVM startup, and SetConfigOption takes precedence over the env var. Prefer the programmatic form above unless you have a strong reason to enable the option globally.

Scala / JVM code. If you're writing custom Spark expressions that consume Python-pixel VRTs, wrap the read/translate in the same helper GeoBrix uses internally — it refcounts the option so concurrent tasks on the same executor JVM compose safely:

import com.databricks.labs.gbx.rasterx.gdal.GDALManager

val result = GDALManager.withVrtPython {
  val ds = org.gdal.gdal.gdal.Open(vrtPath)
  // ... GDAL reads / translates here see the Python pixel function ...
  ds
}
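
The refcounting idea mentioned above can be illustrated with a small Python sketch. It is not the GDALManager implementation: the class `RefCountedOption` is hypothetical, and a plain dict stands in for the process-wide GDAL configuration. The point is only that the option turns on when the first caller enters and turns off when the last caller leaves, so nested or concurrent users do not switch it off underneath each other.

```python
import threading
from contextlib import contextmanager

class RefCountedOption:
    """Keep a shared option enabled while at least one caller is inside."""

    def __init__(self, store, key, on="YES", off="NO"):
        self._store = store   # stand-in for a process-wide config store
        self._key = key
        self._on = on
        self._off = off
        self._count = 0
        self._lock = threading.Lock()
        store[key] = off      # start disabled

    @contextmanager
    def enabled(self):
        with self._lock:
            self._count += 1
            if self._count == 1:
                self._store[self._key] = self._on   # first user enables
        try:
            yield
        finally:
            with self._lock:
                self._count -= 1
                if self._count == 0:
                    self._store[self._key] = self._off  # last user disables

config = {}
vrt_python = RefCountedOption(config, "GDAL_VRT_ENABLE_PYTHON")

with vrt_python.enabled():
    with vrt_python.enabled():
        inner = config["GDAL_VRT_ENABLE_PYTHON"]
    after_inner = config["GDAL_VRT_ENABLE_PYTHON"]  # outer user still active
after_outer = config["GDAL_VRT_ENABLE_PYTHON"]
```

A simple set-then-reset bracket would disable the option as soon as the inner block exits; counting active users avoids that.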

Trusted-modules variant

GDAL also accepts GDAL_VRT_ENABLE_PYTHON=TRUSTED_MODULES plus a GDAL_VRT_PYTHON_TRUSTED_MODULES allowlist if you want pixel-function code restricted to specific Python module prefixes. GeoBrix uses the plain YES form because the pixel-function source is constructed in-process from trusted (geobrix-generated) strings, never from user-supplied VRT XML on disk. If your custom code path reads VRTs whose <PixelFunctionCode> originates from less-trusted sources, switch to the TRUSTED_MODULES form and allowlist only what you intend to load.


Next Steps