RasterX

Full API reference

For the complete list of RasterX functions with parameters and examples, see the RasterX Function Reference.

RasterX is GeoBrix's raster data processing package, providing comprehensive tools for working with raster datasets such as satellite imagery, elevation models, and other gridded spatial data.

Overview

RasterX is a refactored and improved version of the Mosaic raster functions. Because the Databricks platform does not (yet) include built-in raster processing, RasterX is intended to fill that gap for raster operations on Databricks.

Key Features

  • GDAL-Powered: Leverages GDAL for robust raster format support
  • Distributed Processing: Built on Spark for scalable raster operations
  • Multiple Format Support: GeoTIFF, NetCDF, and other GDAL-supported formats
  • Metadata Extraction: Comprehensive raster metadata access
  • Raster Operations: Clipping, resampling, transformations
  • Band Operations: Multi-band raster support

Function Categories

Accessors

Functions to access raster properties and metadata:

  • gbx_rst_boundingbox - Bounding box of the raster
  • gbx_rst_width - Raster width in pixels
  • gbx_rst_height - Raster height in pixels
  • gbx_rst_numbands - Number of bands
  • gbx_rst_metadata - Raster metadata map
  • gbx_rst_srid - Spatial reference identifier
  • gbx_rst_georeference - Georeference parameters
  • gbx_rst_pixelwidth, gbx_rst_pixelheight - Pixel size
  • gbx_rst_upperleftx, gbx_rst_upperlefty - Upper-left corner
  • gbx_rst_scalex, gbx_rst_scaley, gbx_rst_rotation, gbx_rst_skewx, gbx_rst_skewy - Geotransform components
  • gbx_rst_format - Raster format (e.g. GTiff)
  • gbx_rst_getnodata - NoData value
  • gbx_rst_bandmetadata - Band metadata
  • gbx_rst_avg, gbx_rst_min, gbx_rst_max, gbx_rst_median - Pixel statistics
  • gbx_rst_pixelcount - Number of pixels
  • gbx_rst_memsize - Approximate memory size
  • gbx_rst_type - Raster data type
  • gbx_rst_summary - Summary statistics
  • gbx_rst_subdatasets - Subdataset names (e.g. NetCDF/GRIB)
  • gbx_rst_getsubdataset - Open a subdataset by name
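
The upper-left corner, scale, and skew accessors listed above correspond to the six coefficients of a GDAL-style affine geotransform. The plain-Python sketch below assumes RasterX follows GDAL's coefficient order (this page does not state it) and shows how those six values locate a pixel corner and, together with width and height, yield a bounding box. The helper names `pixel_to_world` and `bounding_box` and the sample numbers are illustrative only; they are not RasterX functions.

```python
from typing import Tuple

# Assumed GDAL-style order:
# (upper_left_x, scale_x, skew_x, upper_left_y, skew_y, scale_y)
GeoTransform = Tuple[float, float, float, float, float, float]


def pixel_to_world(gt: GeoTransform, col: float, row: float) -> Tuple[float, float]:
    """Map a pixel (col, row) corner to world (x, y) with the affine geotransform."""
    ulx, scale_x, skew_x, uly, skew_y, scale_y = gt
    x = ulx + col * scale_x + row * skew_x
    y = uly + col * skew_y + row * scale_y
    return x, y


def bounding_box(gt: GeoTransform, width: int, height: int) -> Tuple[float, float, float, float]:
    """Return (xmin, ymin, xmax, ymax) from the four raster corners."""
    corners = [pixel_to_world(gt, c, r) for c in (0, width) for r in (0, height)]
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical north-up raster: 10-unit pixels, origin at (500000, 4600000).
gt = (500000.0, 10.0, 0.0, 4600000.0, 0.0, -10.0)
print(pixel_to_world(gt, 100, 200))  # (501000.0, 4598000.0)
print(bounding_box(gt, 1000, 1000))  # (500000.0, 4590000.0, 510000.0, 4600000.0)
```

For a north-up raster the skew terms are zero and `scale_y` is negative, which is why the upper-left corner carries the largest y value.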

Constructors

  • gbx_rst_fromfile - Load raster from file path
  • gbx_rst_fromcontent - Create raster from binary content
  • gbx_rst_frombands - Build raster from band expressions

Transformations and operations

  • gbx_rst_clip - Clip raster by geometry
  • gbx_rst_transform - Reproject to a target CRS
  • gbx_rst_merge - Merge multiple rasters
  • gbx_rst_combineavg - Average multiple rasters (same extent)
  • gbx_rst_asformat - Write to a different format (e.g. COG)
  • gbx_rst_convolve - Convolution filter
  • gbx_rst_filter - Custom filter expression
  • gbx_rst_mapalgebra - Map algebra expression
  • gbx_rst_derivedband - Derive band via Python UDF
  • gbx_rst_ndvi - NDVI from red/NIR bands
  • gbx_rst_dtmfromgeoms - Rasterize geometries to DTM
  • gbx_rst_initnodata - Initialize NoData
  • gbx_rst_updatetype - Change raster data type
  • gbx_rst_isempty - Test if raster is empty
  • gbx_rst_tryopen - Open raster or return NULL on failure
  • gbx_rst_rastertoworldcoord, gbx_rst_rastertoworldcoordx, gbx_rst_rastertoworldcoordy - Pixel to world coordinates
  • gbx_rst_worldtorastercoord, gbx_rst_worldtorastercoordx, gbx_rst_worldtorastercoordy - World to pixel coordinates
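
As a reference point for `gbx_rst_ndvi`, the standard NDVI formula is (NIR - Red) / (NIR + Red), evaluated per pixel. The sketch below works on plain Python lists rather than RasterX tiles, and the helper name `ndvi` is hypothetical. How RasterX treats NoData or a zero denominator is not described on this page, so returning `None` in that case is an assumption made for this example.

```python
from typing import List, Optional, Sequence


def ndvi(red: Sequence[float], nir: Sequence[float]) -> List[Optional[float]]:
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red) for two equally sized bands."""
    if len(red) != len(nir):
        raise ValueError("red and nir bands must have the same number of pixels")
    result: List[Optional[float]] = []
    for r, n in zip(red, nir):
        total = n + r
        # Assumption for this sketch: undefined NDVI (zero sum) becomes None.
        result.append(None if total == 0 else (n - r) / total)
    return result


red_band = [50, 100, 0]
nir_band = [150, 100, 0]
print(ndvi(red_band, nir_band))  # [0.5, 0.0, None]
```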

Generators

  • gbx_rst_separatebands - Explode multi-band raster into rows per band
  • gbx_rst_retile - Retile rasters to a given tile size
  • gbx_rst_maketiles - Build tiles from grid spec
  • gbx_rst_tooverlappingtiles - Overlapping tile grid
  • gbx_rst_h3_tessellate - Tessellate raster into H3 cells
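
To make the tiling generators concrete, the sketch below computes the pixel windows that a fixed tile size (optionally with overlap) would produce over a raster. It is a plain-Python illustration of the general technique, not the RasterX implementation; the helper name `tile_windows`, its parameters, and the edge handling (trailing tiles are clipped to the raster) are assumptions.

```python
from typing import List, Tuple

# (col_offset, row_offset, width, height) in pixels
Window = Tuple[int, int, int, int]


def tile_windows(raster_width: int, raster_height: int,
                 tile_width: int, tile_height: int,
                 overlap: int = 0) -> List[Window]:
    """List tile windows in row-major order; edge tiles are clipped to the raster."""
    if overlap < 0 or overlap >= tile_width or overlap >= tile_height:
        raise ValueError("overlap must be non-negative and smaller than the tile size")
    step_x = tile_width - overlap
    step_y = tile_height - overlap
    windows: List[Window] = []
    for row_off in range(0, raster_height, step_y):
        for col_off in range(0, raster_width, step_x):
            width = min(tile_width, raster_width - col_off)
            height = min(tile_height, raster_height - row_off)
            windows.append((col_off, row_off, width, height))
    return windows


# A 10980 x 10980 raster cut into 512 x 512 tiles gives 22 x 22 windows.
print(len(tile_windows(10980, 10980, 512, 512)))  # 484
```

Each window becomes one output row in a generator-style function, which is what lets Spark distribute the tiles across the cluster.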

H3 grid aggregation

  • gbx_rst_h3_rastertogridavg - Average raster values per H3 cell
  • gbx_rst_h3_rastertogridcount - Pixel count per H3 cell
  • gbx_rst_h3_rastertogridmax, gbx_rst_h3_rastertogridmin, gbx_rst_h3_rastertogridmedian - Max/min/median per H3 cell

Aggregations

  • gbx_rst_combineavg_agg - Average multiple rasters (aggregate)
  • gbx_rst_merge_agg - Merge rasters with aggregation
  • gbx_rst_derivedband_agg - Derived band aggregate
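
Conceptually, an averaging aggregate such as `gbx_rst_combineavg_agg` reduces a group of same-extent rasters to one raster whose pixels are the mean of the inputs. The sketch below shows that reduction on nested Python lists. The helper name `combine_avg` is hypothetical, and skipping NoData pixels is an assumption of this example; this page does not say how RasterX handles them.

```python
from typing import List, Optional, Sequence

Grid = Sequence[Sequence[float]]


def combine_avg(rasters: Sequence[Grid], nodata: Optional[float] = None) -> List[List[Optional[float]]]:
    """Pixel-wise mean of equally shaped grids, ignoring NoData values."""
    if not rasters:
        raise ValueError("at least one raster is required")
    rows, cols = len(rasters[0]), len(rasters[0][0])
    for grid in rasters:
        if len(grid) != rows or any(len(row) != cols for row in grid):
            raise ValueError("all rasters must share the same shape")
    result: List[List[Optional[float]]] = []
    for i in range(rows):
        out_row: List[Optional[float]] = []
        for j in range(cols):
            # Assumption for this sketch: NoData pixels do not contribute to the mean.
            values = [grid[i][j] for grid in rasters if grid[i][j] != nodata]
            out_row.append(sum(values) / len(values) if values else nodata)
        result.append(out_row)
    return result


a = [[1, 2], [3, -9999]]
b = [[3, 4], [5, -9999]]
print(combine_avg([a, b], nodata=-9999))  # [[2.0, 3.0], [4.0, -9999]]
```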

Usage Examples

Python/PySpark

from databricks.labs.gbx.rasterx import functions as rx

# Sample data path (see Sample Data guide; use your Volume path if different)
raster_path = "/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/sentinel2/nyc_sentinel2_red.tif"

# Register the RasterX functions with the active Spark session
rx.register(spark)

# Read the raster file with the GDAL reader
raster_df = spark.read.format("gdal").load(raster_path)

# Extract basic metadata from each tile
metadata_df = raster_df.select(
    "source",
    rx.rst_width("tile").alias("width"),
    rx.rst_height("tile").alias("height"),
    rx.rst_numbands("tile").alias("bands"),
    rx.rst_srid("tile").alias("srid"),
)
metadata_df.show()

Example output

+--------------------+-----+------+-----+----+
|source              |width|height|bands|srid|
+--------------------+-----+------+-----+----+
|.../nyc_sentinel2...|10980|10980 |1    |4326|
+--------------------+-----+------+-----+----+

Scala

import com.databricks.labs.gbx.rasterx.{functions => rx}
import org.apache.spark.sql.functions._

// Register functions
rx.register(spark)

// Read raster files (sample data path; see Sample Data guide)
val rasterPath = "/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/sentinel2/nyc_sentinel2_red.tif"
val rasterDf = spark.read.format("gdal").load(rasterPath)

// Get metadata
val metadataDf = rasterDf.select(
  col("path"),
  rx.rst_width(col("tile")).alias("width"),
  rx.rst_height(col("tile")).alias("height"),
  rx.rst_numbands(col("tile")).alias("num_bands")
)

metadataDf.show()

Example output

+--------------------+-----+------+----------+
|path                |width|height|num_bands |
+--------------------+-----+------+----------+
|.../nyc_sentinel2...|10980|10980 |1         |
+--------------------+-----+------+----------+

SQL

-- Register the functions first from a Python or Scala notebook,
-- then call them from SQL.

-- Read raster data (sample data path; see Sample Data guide)
CREATE OR REPLACE TEMP VIEW rasters AS
SELECT * FROM gdal.`/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/sentinel2/nyc_sentinel2_red.tif`;

-- Extract metadata
SELECT
  path,
  gbx_rst_width(tile) AS width,
  gbx_rst_height(tile) AS height,
  gbx_rst_numbands(tile) AS num_bands,
  gbx_rst_srid(tile) AS srid
FROM rasters;

Example output

+--------------------+-----+------+----------+----+
|path                |width|height|num_bands |srid|
+--------------------+-----+------+----------+----+
|.../nyc_sentinel2...|10980|10980 |1         |4326|
+--------------------+-----+------+----------+----+

Next Steps