
GridX Function Reference

GridX is fully lightweight as of v0.4.0: the quadbin (gbx_quadbin_*), BNG (gbx_bng_*), and custom-grid (gbx_custom_*) functions all run in both tiers (Lightweight and Heavyweight). Each section below is marked with its tier. See Choosing an Execution Tier for the lightweight vs heavyweight comparison.

Complete reference for all GridX discrete-global-grid functions — CARTO Quadbin v0 and British National Grid (BNG).

Overview

GridX is GeoBrix's discrete-global-grid indexing package. As of v0.4.0 it ships two grid systems: CARTO quadbin v0 for web-mercator-aligned global indexing and British National Grid (BNG) for Great Britain workloads.

  • Quadbin (CARTO v0) — a global zoom-indexed tile addressing scheme aligned with web-mercator slippy maps, compatible with CARTO's CDB_QuadKey IDs. Cell (z, x, y) coordinates align with the same XYZ tile grid that PMTiles / MVT readers consume — natural for slippy-map heatmaps and global analytics.
  • BNG (British National Grid) — the Ordnance Survey National Grid (OSGB36) used in Great Britain for spatial indexing and location-based services. Specialized for UK-based spatial data.
Registration and import paths
  • Quadbin: databricks.labs.gbx.gridx.quadbin (Python) / com.databricks.labs.gbx.gridx.quadbin (Scala)
  • BNG: databricks.labs.gbx.gridx.bng (Python) / com.databricks.labs.gbx.gridx.bng (Scala)

Use RegisterBatch with functions=gridx.quadbin or functions=gridx.bng to register just one subpackage, or functions=all for everything.

Key Features

  • Grid Cell Operations: Create, manipulate, and query BNG grid cells
  • Area Calculations: Calculate areas of grid cells at different precisions (returns square kilometres)
  • Coordinate Conversion: Convert between grid references and coordinates
  • Spatial Indexing: Use BNG or quadbin for efficient spatial indexing
  • Multi-Resolution Support: Work with different grid resolutions (BNG: 1–6 integer indices; Quadbin: zoom 0..26)
  • K-Ring / K-Loop Neighbourhoods: Filled rings and hollow rings for both grid systems
  • Polyfill and Tessellation: Cover geometries with cells; tessellation returns per-cell clipped chip geometries

Usage Examples

Python/PySpark

from databricks.labs.gbx.gridx.bng import functions as bx

bx.register(spark)

# London (TQ area) in BNG: easting 530000, northing 180000
# Resolution: BNG resolution index (3 = 1 km) or string ('1km'). gbx_bng_cellarea returns km².
bng_cells = spark.sql(
"""
SELECT
gbx_bng_pointascell('POINT(530000 180000)', '1km') as bng_cell,
gbx_bng_cellarea(gbx_bng_pointascell('POINT(530000 180000)', '1km')) as cell_area_km2
"""
)
bng_cells.show()
Example output
+----------+--------------+
|bng_cell |cell_area_km2 |
+----------+--------------+
|TQ3080 |1.0 |
+----------+--------------+

Scala

import com.databricks.labs.gbx.gridx.bng.{functions => bx}
import org.apache.spark.sql.functions._
import spark.implicits._ // needed for toDF outside notebooks

// Register functions
bx.register(spark)

// Calculate cell area (gbx_bng_cellarea returns square kilometres)
val areaDf = spark.sql("SELECT gbx_bng_cellarea('TQ3080') as area_km2")
areaDf.show()

// Create BNG cells from points. Pass the point as WKT in EPSG:27700
// eastings/northings; GeoBrix does not accept st_point.
val pointsDf = Seq(
(530000.0, 180000.0)
).toDF("easting", "northing")

val bngCells = pointsDf.select(
col("easting"),
col("northing"),
expr("gbx_bng_pointascell(concat('POINT(', cast(easting as string), ' ', cast(northing as string), ')'), '1km')").alias("bng_cell")
)

bngCells.show()
Example output
+--------+
|area_km2|
+--------+
|1.0     |
+--------+

+--------+--------+--------+
|easting |northing|bng_cell|
+--------+--------+--------+
|530000.0|180000.0|TQ3080  |
+--------+--------+--------+

SQL

-- Register functions first in Python/Scala notebook

-- Point in BNG coordinates (eastings, northings, EPSG:27700). Resolution: BNG resolution string ('1km') or index (3).
-- gbx_bng_cellarea returns square kilometres.
SELECT
gbx_bng_pointascell('POINT(530000 180000)', '1km') as bng_cell_1km,
gbx_bng_cellarea(gbx_bng_pointascell('POINT(530000 180000)', '1km')) as area_km2;
Example output
+------------+----------+
|bng_cell_1km|area_km2 |
+------------+----------+
|TQ3080 |1.0 |
+------------+----------+
SQL function prefixes

In SQL, GridX functions are prefixed with gbx_ (e.g. gbx_bng_aswkb, gbx_quadbin_pointascell). For SQL, Python, and Scala usage patterns, see Language Bindings.

Common setup

Run this once before the examples below. It registers GridX (BNG) so you can use bng_* in Python and gbx_bng_* in SQL.

from databricks.labs.gbx.gridx.bng import functions as bx
bx.register(spark)
Example output
GridX (BNG) registered. You can now use bng_* functions in Python and gbx_bng_* in SQL.
Map any GridX cell set with plot_static

vizx.plot_static renders a column of cell ids straight from a Spark DataFrame as a static, GitHub-renderable map over a basemap — and it's the one helper that covers every GridX grid: pass grid_system="quadbin", "bng", or "custom" (as well as "h3"). Each grid's native CRS is handled for you (e.g. BNG's EPSG:27700 is reprojected for the basemap), and "custom" takes the same grid spec via grid_conf=. It's the quickest way to eyeball the cells the functions below produce. See plot_static for the full parameter list.


Quadbin (CARTO v0)

Lightweight | Heavyweight

Quadbin runs in both tiers. The lightweight tier is the pygx quadbin package, powered by the quadbin library and shapely. Its quadbin_distance and quadbin_polyfill cell math mirrors the heavyweight Quadbin.scala, and geometry outputs are EWKB with SRID 4326. The gbx_quadbin_* SQL names are identical in both tiers, so the lightweight tier is a drop-in swap.

GeoBrix v0.4.0 adds a gridx/quadbin subpackage implementing the CARTO quadbin v0 64-bit packed (z, x, y) tile encoding used by Snowflake, dbt, Felt, and CARTO. Coordinates are EPSG:4326 lon/lat on the user-facing API; cells are encoded as web-mercator XYZ tiles internally. Resolutions range from 0 (whole world) to 26 (sub-metre).

Registration

Quadbin functions are under gridx.quadbin — independent of gridx.bng. Call functions.register(spark) once per session to install the gbx_quadbin_* SQL functions.

CARTO Quadbin v0 cells encode (z, x, y) web-mercator tile coordinates as a single BIGINT. Cell IDs are interoperable with CARTO's CDB_QuadKey and align with the slippy-map tile grid used by gbx_rst_xyzpyramid and gbx_st_asmvt_pyramid. Exact cell-set parity between the lightweight and heavyweight tiers is enforced by the cross-tier parity suite.

Resolution (zoom) range:

  • gbx_quadbin_pointascell, gbx_quadbin_resolution, gbx_quadbin_kring, gbx_quadbin_distance accept zoom 0..26.
  • gbx_quadbin_polyfill and gbx_quadbin_tessellate accept zoom 0..20.

quadbin_pointascell

Lightweight | Heavyweight

Convert a lon/lat coordinate (EPSG:4326) to the quadbin cell containing it at the given zoom.

Lightweight tier (pygx)

Powered by the quadbin package (numpy-vectorized encoder). Encodes WGS84 lon/lat to a quadbin cell at the given zoom; bit-identical to the heavyweight Quadbin.scala (incl. the antimeridian/pole tile clamp).

Signature: quadbin_pointascell(lon: Column, lat: Column, zoom: Column): Column

Returns:

  • BIGINT quadbin cell ID

SQL:

SELECT gbx_quadbin_pointascell(-122.4194, 37.7749, 10) as sf_cell;
Example output
+-------------------+
|sf_cell |
+-------------------+
|5233961839712272383|
+-------------------+
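
The encoding behind this function can be sketched in plain Python. This is an illustration under stated assumptions, not the pygx or Quadbin.scala source: the helper names (lonlat_to_tile, tile_to_quadbin) are hypothetical, the latitude limit is the usual web-mercator cutoff, and the edge clamp is only a guess at the antimeridian/pole clamp mentioned above. The bit layout follows the quadbin v0 packing: header, mode, zoom, interleaved x/y bits, then trailing ones.

```python
import math

MAX_LAT = 85.051128779806604  # usual web-mercator latitude limit (assumed clamp)


def lonlat_to_tile(lon, lat, zoom):
    """Map WGS84 lon/lat to the XYZ tile (x, y) that contains it."""
    n = 1 << zoom
    lat = max(-MAX_LAT, min(MAX_LAT, lat))
    x = math.floor((lon + 180.0) / 360.0 * n)
    y = math.floor((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so lon=180 or a pole still lands on a valid tile (assumption).
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_to_quadbin(x, y, zoom):
    """Pack (zoom, x, y) into a 64-bit quadbin cell id."""
    morton = 0
    for i in range(zoom):
        morton |= ((x >> i) & 1) << (2 * i)      # x bits on even positions
        morton |= ((y >> i) & 1) << (2 * i + 1)  # y bits on odd positions
    header = 0x4000000000000000 | (1 << 59)      # header plus cell mode
    footer = 0xFFFFFFFFFFFFF >> (2 * zoom)       # unused low bits set to 1
    return header | (zoom << 52) | (morton << (52 - 2 * zoom)) | footer


x, y = lonlat_to_tile(-122.4194, 37.7749, 10)
cell = tile_to_quadbin(x, y, 10)
print((x, y), cell)  # (163, 395) 5233961839712272383
```

The printed cell matches the sf_cell value in the example output above, and tile (163, 395) at zoom 10 is the same XYZ tile a slippy-map client would request for San Francisco.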

quadbin_aswkb

Lightweight | Heavyweight

Return the quadbin cell footprint as EWKB (SRID=4326) — the four-corner polygon of the tile in lon/lat.

Lightweight tier (pygx)

Powered by the quadbin package + shapely. Cell boundary polygon as EWKB (SRID 4326).

Signature: quadbin_aswkb(cellId: Column): Column

Returns:

  • Binary EWKB polygon (SRID-tagged 4326)

SQL:

SELECT gbx_quadbin_aswkb(gbx_quadbin_pointascell(0.0, 0.0, 8)) as wkb;

quadbin_centroid

Lightweight | Heavyweight

Return the quadbin cell centroid as an EWKB POINT (SRID=4326).

Lightweight tier (pygx)

Powered by the quadbin package + shapely. Cell centroid (bbox-corner mean, matching Quadbin.scala) as EWKB point (SRID 4326).

Signature: quadbin_centroid(cellId: Column): Column

Returns:

  • Binary EWKB point (SRID-tagged 4326)

SQL:

SELECT gbx_quadbin_centroid(gbx_quadbin_pointascell(0.0, 0.0, 8)) as centroid;

quadbin_resolution

Lightweight | Heavyweight

Return the resolution (zoom) of a quadbin cell.

Lightweight tier (pygx)

Powered by the quadbin package (numpy-vectorized). Extracts the zoom level from a cell id.

Signature: quadbin_resolution(cellId: Column): Column

Returns:

  • INT zoom level (0..26)

SQL:

SELECT gbx_quadbin_resolution(gbx_quadbin_pointascell(0.0, 0.0, 12)) as z;
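
As a rough illustration of why this lookup is cheap: in the quadbin v0 layout the zoom is a 5-bit field stored in bits 52..56 of the cell id, so it can be read with a shift and a mask. The function name below is hypothetical and the snippet is a sketch, not the library code.

```python
def quadbin_zoom(cell):
    """Read the 5-bit zoom field stored in bits 52..56 of a quadbin cell id."""
    return (cell >> 52) & 0x1F


print(quadbin_zoom(5233961839712272383))  # 10
print(quadbin_zoom(0x480FFFFFFFFFFFFF))   # 0 (the single whole-world cell)
```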

quadbin_polyfill

Lightweight | Heavyweight

Polyfill a geometry's bounding box with all quadbin cells at the given zoom.

Lightweight tier (pygx)

Powered by the quadbin package + shapely. Enumerates the cells covering the geometry's bounding box at the resolution, mirroring Quadbin.scala's envelope semantics.

Signature: quadbin_polyfill(geom: Column, zoom: Column): Column

Returns:

  • ARRAY<BIGINT> of cell IDs covering the bbox

SQL:

SELECT gbx_quadbin_polyfill(
st_geomfromtext('POLYGON((-1 -1, 1 -1, 1 1, -1 1, -1 -1))'), 5
) as cells;
Example output
+--------------------------+
|cells |
+--------------------------+
|[5215660717881425919, ...]|
+--------------------------+
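
The envelope semantics can be pictured as a loop over tile coordinates. The sketch below is an assumption-laden illustration rather than the pygx code: bbox_tiles and lonlat_to_tile are hypothetical helpers, it returns (x, y) tile pairs instead of packed cell ids, and how the real implementation treats a bounding-box edge that falls exactly on a tile boundary is not specified here.

```python
import math

MAX_LAT = 85.051128779806604  # usual web-mercator latitude limit (assumed clamp)


def lonlat_to_tile(lon, lat, zoom):
    """Map WGS84 lon/lat to the XYZ tile (x, y) that contains it."""
    n = 1 << zoom
    lat = max(-MAX_LAT, min(MAX_LAT, lat))
    x = math.floor((lon + 180.0) / 360.0 * n)
    y = math.floor((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def bbox_tiles(min_lon, min_lat, max_lon, max_lat, zoom):
    """Enumerate every tile touched by a lon/lat bounding box."""
    # Tile y grows southwards, so the south-west corner has the largest y.
    x0, y1 = lonlat_to_tile(min_lon, min_lat, zoom)
    x1, y0 = lonlat_to_tile(max_lon, max_lat, zoom)
    return [(x, y) for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)]


tiles = bbox_tiles(-1, -1, 1, 1, 5)
print(tiles)  # [(15, 15), (16, 15), (15, 16), (16, 16)]
```

Because the enumeration works on the envelope, a long thin diagonal geometry yields many more cells than it actually touches; quadbin_tessellate is the function that clips per cell.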

quadbin_kring

Lightweight | Heavyweight

Return all cells within Chebyshev distance k of a quadbin cell (inclusive of the center cell).

Lightweight tier (pygx)

Powered by the quadbin package. All cells within Chebyshev distance k (inclusive).

Signature: quadbin_kring(cellId: Column, k: Column): Column

Returns:

  • ARRAY<BIGINT> of cell IDs (length (2k+1)^2)

SQL:

SELECT gbx_quadbin_kring(gbx_quadbin_pointascell(0.0, 0.0, 10), 1) as ring;
Example output
+-------------------------------------+
|ring |
+-------------------------------------+
|[5227553336189779967, ..., (9 cells)]|
+-------------------------------------+
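
A Chebyshev k-ring is a filled (2k+1) x (2k+1) square of tiles. The sketch below works on (x, y) tile coordinates with a hypothetical tile_kring helper. It assumes neighbours that fall off the edge of the grid are dropped, which is a guess; the real functions may wrap or clamp instead.

```python
def tile_kring(x, y, zoom, k):
    """All tiles within Chebyshev distance k of (x, y), centre included."""
    n = 1 << zoom
    ring = []
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            nx, ny = x + dx, y + dy
            if 0 <= nx < n and 0 <= ny < n:  # drop off-grid neighbours (assumption)
                ring.append((nx, ny))
    return ring


print(len(tile_kring(512, 512, 10, 1)))  # 9
print(len(tile_kring(512, 512, 10, 2)))  # 25
```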

quadbin_tessellate

Lightweight | Heavyweight

Tessellate a geometry into quadbin cells. Like quadbin_polyfill but returns the per-cell geometry chip alongside the cell ID, suitable for chip-based join patterns.

Lightweight tier (pygx)

Powered by the quadbin package + shapely. Bbox polyfill then per-cell intersection with the input geometry; one STRUCT<cell, geom> per chip (EWKB, SRID 4326).

Signature: quadbin_tessellate(geom: Column, zoom: Column): Column

Returns:

  • ARRAY<STRUCT<cell: BIGINT, geom: BINARY>>

SQL:

SELECT gbx_quadbin_tessellate(
st_geomfromtext('POLYGON((-1 -1, 1 -1, 1 1, -1 1, -1 -1))'), 5
) as chips;

quadbin_cellunion

Lightweight | Heavyweight

Union an ARRAY<BIGINT> of quadbin cells into a single MultiPolygon EWKB.

Lightweight tier (pygx)

Powered by shapely (union_all of the cell polygons). Dissolves an ARRAY<LONG> of cells into one EWKB MultiPolygon (SRID 4326).

Signature: quadbin_cellunion(cellIds: Column): Column

Returns:

  • Binary EWKB multipolygon (SRID-tagged 4326)

SQL:

SELECT gbx_quadbin_cellunion(
gbx_quadbin_kring(gbx_quadbin_pointascell(0.0, 0.0, 8), 1)
) as union_geom;

quadbin_cellunion_agg

Lightweight | Heavyweight | Grouped-agg UDF

Aggregate-level union: dissolve a column of quadbin cell IDs (one result per group) into a single MultiPolygon EWKB. Use this instead of gbx_quadbin_cellunion when your cell IDs are spread across rows rather than already collected into an array.

Lightweight tier (pygx)

Powered by shapely. Grouped aggregate — groupBy(...).agg(gx.quadbin_cellunion_agg("cell")) dissolves a group's cell ids into one EWKB MultiPolygon (SRID 4326).

Signature: quadbin_cellunion_agg(cell: Column): Column

Returns:

  • BINARY EWKB multipolygon (SRID-tagged 4326) representing the dissolved coverage

SQL:

SELECT region, gbx_quadbin_cellunion_agg(cell) AS coverage
FROM grid_cells
GROUP BY region;
Example output
+------+--------+
|region|coverage|
+------+--------+
|... |[BINARY]|
+------+--------+

quadbin_distance

Lightweight | Heavyweight

Chebyshev (king-move) distance between two quadbin cells at the same resolution.

Lightweight tier (pygx)

Powered by the quadbin package. Chebyshev distance on tile coordinates; mirrors Quadbin.scala (errors if the two cells differ in resolution).

Signature: quadbin_distance(cellA: Column, cellB: Column): Column

Returns:

  • INT cell-step distance

SQL:

SELECT gbx_quadbin_distance(
gbx_quadbin_pointascell(0.0, 0.0, 10),
gbx_quadbin_pointascell(0.0001, 0.0, 10)
) as d;
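
The distance rule can be sketched by unpacking each cell back to tile coordinates and taking the larger of the x and y offsets. This is an illustrative sketch with hypothetical helper names, assuming the quadbin v0 bit layout (zoom in bits 52..56, interleaved x/y bits above the trailing ones); it is not the library code.

```python
def quadbin_to_tile(cell):
    """Unpack a quadbin cell id into (x, y, zoom)."""
    zoom = (cell >> 52) & 0x1F
    morton = (cell & 0xFFFFFFFFFFFFF) >> (52 - 2 * zoom)
    x = y = 0
    for i in range(zoom):
        x |= ((morton >> (2 * i)) & 1) << i      # even bits hold x
        y |= ((morton >> (2 * i + 1)) & 1) << i  # odd bits hold y
    return x, y, zoom


def quadbin_distance(cell_a, cell_b):
    """Chebyshev (king-move) distance in tile steps; same zoom required."""
    ax, ay, az = quadbin_to_tile(cell_a)
    bx, by, bz = quadbin_to_tile(cell_b)
    if az != bz:
        raise ValueError("cells must share a resolution")
    return max(abs(ax - bx), abs(ay - by))


sf = 5233961839712272383      # tile (163, 395) at zoom 10
origin = 0x48AC0000FFFFFFFF   # tile (512, 512) at zoom 10, containing lon/lat (0, 0)
print(quadbin_to_tile(sf))           # (163, 395, 10)
print(quadbin_distance(sf, origin))  # 349
```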

British National Grid (BNG)

Lightweight | Heavyweight

BNG runs in both tiers. The lightweight tier is the pygx BNG package, powered by a pure-Python port of BNG.scala with shapely for geometry. The gbx_bng_* SQL names are identical in both tiers, so the lightweight tier is a drop-in swap. Cell IDs are STRING references; geometry outputs are plain WKB in EPSG:27700 coordinates with no SRID (unlike quadbin's EWKB SRID 4326). Cell-id math (bng_eastnorthasbng, bng_pointascell, bng_distance, bng_kring, bng_polyfill, …) is exact-parity with the heavyweight tier; exact cell-set parity is enforced by the cross-tier parity suite.

The British National Grid is the national coordinate system for Great Britain, based on the Ordnance Survey National Grid (OSGB36). It divides Great Britain into grid squares identified by letter-based prefixes and numeric coordinates.

BNG Structure

  • Grid Squares: 100km x 100km squares identified by two letters (e.g., "TQ" for London, "NT" for Edinburgh)
  • Eastings and Northings: Numeric coordinates within each grid square (EPSG:27700)
  • Resolution Indices: Integer indices 1..6 (1=100km, 2=10km, 3=1km, 4=100m, 5=10m, 6=1m); negative indices select quadrant sub-cells. String keys (e.g. "1km", "100m") are also accepted via BNG.resolutionMap.

BNG Grid Reference Format

BNG references follow the format: [Letters][Eastings][Northings]

Examples:

  • TQ 38 80 - 1km precision (Tower of London area)
  • TQ 380 800 - 100m precision
  • TQ 3800 8000 - 10m precision
  • SU 12 34 - Different grid square

Precision Levels

Precision | Grid Size     | Example      | Use Case
100000m   | 100km x 100km | TQ           | Regional analysis
10000m    | 10km x 10km   | TQ38         | District-level
1000m     | 1km x 1km     | TQ3080       | Local area analysis
100m      | 100m x 100m   | TQ308808     | Neighborhood level
10m       | 10m x 10m     | TQ30808080   | Building level
1m        | 1m x 1m       | TQ3080080800 | Precise location

Major Grid Squares

Major 100km grid squares in Great Britain:

  • TQ — London area
  • SU — South Hampshire
  • NT — Edinburgh area
  • SD — Lake District
  • ST — Bristol area

Conversion Functions

Functions to convert BNG cell IDs to standard geometry formats.

bng_aswkb

Lightweight | Heavyweight

Convert a BNG cell ID to Well-Known Binary (WKB) format.

Lightweight tier (pygx)

Powered by a pure-Python port of BNG.scala + shapely. The cell footprint polygon as plain WKB in EPSG:27700 (no SRID).

Signature: bng_aswkb(cellId: Column): Column

Parameters:

  • cellId - BNG cell reference string

Returns:

  • Binary WKB geometry representation

SQL:

SELECT gbx_bng_aswkb('TQ3080') as wkb_geom;
Example output
+--------+
|wkb_geom|
+--------+
|[BINARY]|
+--------+

bng_aswkt

Lightweight | Heavyweight

Convert a BNG cell ID to Well-Known Text (WKT) format.

Lightweight tier (pygx)

Powered by a pure-Python port of BNG.scala + shapely. The cell footprint as WKT in EPSG:27700 coordinates.

Signature: bng_aswkt(cellId: Column): Column

Parameters:

  • cellId - BNG cell reference string

Returns:

  • String WKT geometry representation

SQL:

SELECT gbx_bng_aswkt('TQ3080') as wkt_geom;
Example output
+---------------+
|wkt_geom |
+---------------+
|POLYGON ((...))|
+---------------+

Core Functions

Fundamental operations on BNG cells.

bng_cellarea

Lightweight | Heavyweight

Calculate the area of a BNG grid cell.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala. Returns the cell area in square kilometres, matching the heavyweight tier.

Signature: bng_cellarea(cellId: Column): Column

Parameters:

  • cellId - BNG cell reference

Returns:

  • Double representing the cell area in square kilometres

SQL:

SELECT
'TQ3080' as cell,
gbx_bng_cellarea('TQ3080') as area_km2;
Example output
+------+--------+
|cell |area_km2|
+------+--------+
|TQ3080|1.0 |
+------+--------+
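
For standard references the area follows directly from the number of digits: each extra digit pair divides the cell edge by ten. The sketch below uses a hypothetical helper and assumes plain letter-plus-digit references with no quadrant suffix; it illustrates the arithmetic and is not the BNG.scala port.

```python
def bng_cell_area_km2(cell_id):
    """Area in square kilometres implied by a plain BNG reference."""
    digits = cell_id.replace(" ", "")[2:]
    if len(digits) % 2 or len(digits) > 10 or (digits and not digits.isdigit()):
        raise ValueError(f"unsupported BNG reference: {cell_id!r}")
    # 0 digit pairs -> 100 km edge, 1 -> 10 km, 2 -> 1 km, ... 5 -> 1 m
    size_m = 100_000 // 10 ** (len(digits) // 2)
    return size_m * size_m / 1_000_000


print(bng_cell_area_km2("TQ3080"))    # 1.0
print(bng_cell_area_km2("TQ"))        # 10000.0
print(bng_cell_area_km2("TQ308808"))  # 0.01
```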

bng_centroid

Lightweight | Heavyweight

Get the centroid (center point) of a BNG cell.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely. Cell-center POINT as WKB in EPSG:27700 (no SRID).

Signature: bng_centroid(cellId: Column): Column

Parameters:

  • cellId - BNG cell reference

Returns:

  • Binary WKB point at the cell center (EPSG:27700, no SRID)

SQL:

SELECT gbx_bng_centroid('TQ3080') as centroid;
Example output
+--------+
|centroid|
+--------+
|[BINARY]|
+--------+

bng_distance

Lightweight | Heavyweight

Calculate distance between two BNG cells.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala — grid-step distance between two cells, exact-parity with the heavyweight tier.

Signature: bng_distance(cell1: Column, cell2: Column): Column

Parameters:

  • cell1 - First BNG cell reference
  • cell2 - Second BNG cell reference

Returns:

  • Double representing distance in meters

SQL:

SELECT 
gbx_bng_distance('TQ3080', 'TQ3081') as distance_m
Example output
+-----------+
|distance_m |
+-----------+
|1000.0 |
+-----------+

bng_euclideandistance

Lightweight | Heavyweight

Calculate Euclidean distance between two BNG cells.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala — straight-line distance between two cell centroids, exact-parity with the heavyweight tier.

Signature: bng_euclideandistance(cell1: Column, cell2: Column): Column

Parameters:

  • cell1 - First BNG cell reference
  • cell2 - Second BNG cell reference

Returns:

  • Double representing Euclidean distance in meters

SQL:

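-- TQ3181 is the diagonal neighbour of TQ3080, so the centroids are 1000 m apart on each axis
SELECT
gbx_bng_euclideandistance('TQ3080', 'TQ3181') as euclidean_distance_m
Example output
+--------------------+
|euclidean_distance_m|
+--------------------+
|1414.21             |
+--------------------+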
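
The centroid-to-centroid calculation can be sketched as follows. The helper names are hypothetical, the letter decoding is the standard Ordnance Survey scheme (a 5 x 5 alphabet that skips I), and quadrant suffixes are not handled, so treat it as an illustration rather than the BNG.scala port.

```python
import math


def bng_cell_origin(cell_id):
    """South-west corner (easting, northing) and edge length in metres."""
    ref = cell_id.replace(" ", "").upper()
    l1, l2 = ord(ref[0]) - ord("A"), ord(ref[1]) - ord("A")
    # The grid alphabet skips 'I', so letters after it shift down by one.
    if l1 > 7:
        l1 -= 1
    if l2 > 7:
        l2 -= 1
    e100k = ((l1 - 2) % 5) * 5 + l2 % 5
    n100k = 19 - (l1 // 5) * 5 - l2 // 5
    digits = ref[2:]
    half = len(digits) // 2
    size = 10 ** (5 - half)
    easting = e100k * 100_000 + int(digits[:half] or 0) * size
    northing = n100k * 100_000 + int(digits[half:] or 0) * size
    return easting, northing, size


def bng_euclidean_distance(cell_a, cell_b):
    """Straight-line distance in metres between two cell centroids."""
    ea, na, sa = bng_cell_origin(cell_a)
    eb, nb, sb = bng_cell_origin(cell_b)
    return math.hypot((ea + sa / 2) - (eb + sb / 2), (na + sa / 2) - (nb + sb / 2))


print(bng_cell_origin("TQ3080"))                             # (530000, 180000, 1000)
print(bng_euclidean_distance("TQ3080", "TQ3081"))            # 1000.0
print(round(bng_euclidean_distance("TQ3080", "TQ3181"), 2))  # 1414.21
```

Two cells that share an edge are one cell width apart; the 1414.21 m figure only appears for diagonal neighbours at 1 km resolution.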

Cell Operations

Operations combining multiple cells.

bng_cellintersection

Lightweight | Heavyweight

Get the intersection of two BNG cells.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely. Intersects two cell chips, returning the dissolved chip.

Signature: bng_cellintersection(cell1: Column, cell2: Column): Column

Parameters:

  • cell1 - First BNG cell reference
  • cell2 - Second BNG cell reference

Returns:

  • BNG cell ID representing the intersection

SQL:

SELECT 
gbx_bng_cellintersection('TQ3080', 'TQ3081') as intersection_cell
Example output
+------------------+
|intersection_cell |
+------------------+
|TQ3080 |
+------------------+

bng_cellunion

Lightweight | Heavyweight

Get the union of two BNG cells.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely. Unions two cell chips, returning the dissolved chip.

Signature: bng_cellunion(cell1: Column, cell2: Column): Column

Parameters:

  • cell1 - First BNG cell reference
  • cell2 - Second BNG cell reference

Returns:

  • BNG cell ID representing the union

SQL:

SELECT 
gbx_bng_cellunion('TQ3080', 'TQ3081') as union_cell
Example output
+----------+
|union_cell|
+----------+
|TQ3079    |
+----------+

Coordinate Conversion Functions

Convert coordinates or geometries to BNG cells.

bng_eastnorthasbng

Lightweight | Heavyweight

Convert easting/northing coordinates to a BNG cell reference.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala (numpy-vectorized encoder). Eastings/northings must be EPSG:27700 (e.g. 530000, 180000 for London). Bit-identical cell IDs to the heavyweight tier.

Signature: bng_eastnorthasbng(easting: Column, northing: Column, resolution: Column): Column

Parameters:

  • easting - Easting coordinate value
  • northing - Northing coordinate value
  • resolution - BNG resolution: integer index (1–6 or negative for quadrant) or string (e.g. '1km', '100m'). See resolutionMap in BNG.

Returns:

  • String BNG cell reference

SQL:

-- Convert OS Grid Reference coordinates (easting, northing); resolution '1km' or integer 3
SELECT gbx_bng_eastnorthasbng(530000, 180000, '1km') as bng_cell;
Example output
+--------+
|bng_cell|
+--------+
|TQ3080 |
+--------+
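
The conversion can be sketched with the standard Ordnance Survey letter scheme. The function and the RESOLUTION_INDEX table below are hypothetical stand-ins (the table only mirrors the index meanings listed under BNG Structure), negative quadrant indices are not handled, and the output is the compact no-space form used in the examples on this page.

```python
RESOLUTION_INDEX = {"100km": 1, "10km": 2, "1km": 3, "100m": 4, "10m": 5, "1m": 6}


def eastnorth_to_bng(easting, northing, resolution):
    """Build a BNG reference from EPSG:27700 eastings/northings."""
    index = RESOLUTION_INDEX.get(resolution, resolution)
    if index not in range(1, 7):
        raise ValueError(f"unsupported resolution: {resolution!r}")
    e, n = int(easting), int(northing)
    e100k, n100k = e // 100_000, n // 100_000
    # Two letters identify the 100 km square; the alphabet skips 'I'.
    l1 = (19 - n100k) - (19 - n100k) % 5 + (e100k + 10) // 5
    l2 = (19 - n100k) * 5 % 25 + e100k % 5
    if l1 > 7:
        l1 += 1
    if l2 > 7:
        l2 += 1
    letters = chr(l1 + ord("A")) + chr(l2 + ord("A"))
    ndigits = index - 1          # digits per axis
    size = 10 ** (6 - index)     # cell edge in metres
    if ndigits == 0:
        return letters
    e_part = e % 100_000 // size
    n_part = n % 100_000 // size
    return f"{letters}{e_part:0{ndigits}d}{n_part:0{ndigits}d}"


print(eastnorth_to_bng(530000, 180000, "1km"))  # TQ3080
print(eastnorth_to_bng(530800, 180800, 4))      # TQ308808
print(eastnorth_to_bng(325000, 673000, 1))      # NT
```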

bng_pointascell

Lightweight | Heavyweight

Convert a point geometry to a BNG grid cell. The point must be supplied as WKT or WKB (GeoBrix does not accept native Databricks geometry types).

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely. The point coordinates must be EPSG:27700 eastings/northings (e.g. POINT(530000 180000) for London), not WGS84 lon/lat. Exact-parity cell IDs with the heavyweight tier.

Signature: bng_pointascell(point: Column, resolution: Column): Column

Parameters:

  • point - Point geometry as WKT (string) or WKB (binary), in EPSG:27700 eastings/northings. For example, WKT: 'POINT(530000 180000)'. Do not use st_point() or other DBR native geometry functions; they return a type GeoBrix does not accept.
  • resolution - BNG resolution: integer index (e.g. 3 for 1 km) or string (e.g. '1km', '100m')

Returns:

  • String BNG cell reference

Python (point as WKT column):

from databricks.labs.gbx.gridx.bng import functions as bx
from pyspark.sql.functions import lit
bx.register(spark)
df = spark.range(1).select(
bx.bng_pointascell(lit("POINT(530000 180000)"), lit("1km")).alias("bng_cell")
)
df.show()

Scala: Pass a WKT or WKB column and BNG resolution (e.g. bx.bng_pointascell(lit("POINT(530000 180000)"), lit("1km")) for BNG coords, or lit(3) for 1 km index). Do not use st_point().

SQL:

-- Point in BNG coordinates (eastings, northings); resolution '1km' for 1 km cell
SELECT gbx_bng_pointascell('POINT(530000 180000)', '1km') as london_cell;
Example output
+-----------+
|london_cell|
+-----------+
|TQ3080 |
+-----------+

K-Ring Functions

Generate neighboring cells using k-ring patterns.

bng_kring

Lightweight | Heavyweight

Generate a k-ring of cells around a center cell (filled disk).

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala. All cells within ring distance k (inclusive of the center). Exact cell-set parity with the heavyweight tier.

Signature: bng_kring(cellId: Column, k: Column): Column

Parameters:

  • cellId - BNG cell reference for center
  • k - Integer ring distance (0 = just center, 1 = center + neighbors, etc.)

Returns:

  • Array of BNG cell references in the k-ring

SQL:

-- Get all cells within 2 rings of center
SELECT
cell_id,
gbx_bng_kring(cell_id, 2) as nearby_cells
FROM locations;
Example output
+-------+-----------------------------+
|cell_id|nearby_cells |
+-------+-----------------------------+
|TQ3080 |[TQ3079, TQ3081, TQ2979, ...]|
+-------+-----------------------------+

bng_kloop

Lightweight | Heavyweight

Generate a k-loop of cells around a center cell (hollow ring).

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala. The cells at exactly ring distance k. Exact cell-set parity with the heavyweight tier.

Signature: bng_kloop(cellId: Column, k: Column): Column

Parameters:

  • cellId - BNG cell reference for center
  • k - Integer ring distance

Returns:

  • Array of BNG cell references at exactly distance k

SQL:

SELECT 
'TQ3080' as center,
gbx_bng_kloop('TQ3080', 2) as kloop_cells
Example output
+------+-----------------------------+
|center|kloop_cells                  |
+------+-----------------------------+
|TQ3080|[TQ2880, TQ3078, TQ3082, ...]|
+------+-----------------------------+
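
The relationship between the two neighbourhoods is that a k-loop is the k-ring with the (k-1)-ring removed. The sketch below uses integer cell indices (easting and northing divided by the cell size, so TQ3080 at 1 km is (530, 180)) and hypothetical helper names. It assumes square rings, that is Chebyshev distance, which this page states for quadbin but not explicitly for BNG; a different ring metric would change the counts.

```python
def kring(col, row, k):
    """Filled neighbourhood: every cell within k steps, centre included."""
    return {(col + dx, row + dy)
            for dx in range(-k, k + 1)
            for dy in range(-k, k + 1)}


def kloop(col, row, k):
    """Hollow ring: cells at exactly k steps from the centre."""
    if k == 0:
        return {(col, row)}
    return kring(col, row, k) - kring(col, row, k - 1)


center = (530, 180)  # 1 km cell indices for TQ3080
print(len(kring(*center, 2)))       # 25
print(len(kloop(*center, 2)))       # 16
print((528, 180) in kloop(*center, 2))  # True: TQ2880 is two cells west
```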

bng_geomkring

Lightweight | Heavyweight

Generate a k-ring of cells around a geometry at specified resolution.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely. Polyfills the geometry then expands by ring distance k. Exact cell-set parity with the heavyweight tier.

Signature: bng_geomkring(geom: Column, resolution: Column, k: Column): Column

Parameters:

  • geom - Input geometry (any type)
  • resolution - BNG resolution: integer index (e.g. 3 for 1 km) or string (e.g. '1km', '100m')
  • k - Integer ring distance

Returns:

  • Array of BNG cell references

SQL:

-- Polygon in EPSG:27700 eastings/northings; resolution index 3 = 1 km
SELECT gbx_bng_geomkring(
st_geomfromtext('POLYGON((530100 180100, 530100 181900, 531900 181900, 531900 180100, 530100 180100))'),
3, 1
) as kring_cells
Example output
+-----------------------------+
|kring_cells                  |
+-----------------------------+
|[TQ2980, TQ3079, TQ3080, ...]|
+-----------------------------+

bng_geomkloop

Lightweight | Heavyweight

Generate a k-loop of cells around a geometry at specified resolution.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely. Polyfills the geometry then returns only the cells at exactly ring distance k. Exact cell-set parity with the heavyweight tier.

Signature: bng_geomkloop(geom: Column, resolution: Column, k: Column): Column

Parameters:

  • geom - Input geometry (any type)
  • resolution - BNG resolution: integer index (e.g. 3 for 1 km) or string (e.g. '1km', '100m')
  • k - Integer ring distance

Returns:

  • Array of BNG cell references at exactly distance k

SQL:

-- Polygon in EPSG:27700 eastings/northings; resolution index 3 = 1 km
SELECT gbx_bng_geomkloop(
st_geomfromtext('POLYGON((530100 180100, 530100 181900, 531900 181900, 531900 180100, 530100 180100))'),
3, 1
) as kloop_cells
Example output
+-----------------------------+
|kloop_cells                  |
+-----------------------------+
|[TQ2980, TQ3079, TQ3179, ...]|
+-----------------------------+

Tessellation Functions

Fill geometries with grid cells.

bng_polyfill

Lightweight | Heavyweight

Fill a geometry with BNG cells at specified resolution.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely. Enumerates the cells covering the geometry, mirroring the heavyweight seed-and-flood-fill. Exact cell-set parity with the heavyweight tier.

Signature: bng_polyfill(geometry: Column, resolution: Column): Column

Parameters:

  • geometry - Input geometry to fill
  • resolution - BNG resolution: integer index (e.g. 3 for 1 km) or string (e.g. '1km', '100m')

Returns:

  • Array of BNG cell IDs covering the geometry

SQL:

-- Fill a polygon with 1km cells
SELECT
region_name,
gbx_bng_polyfill(boundary, 3) as cells
FROM regions;
Example output
+-----------+-------------------+
|region_name|cells |
+-----------+-------------------+
|London |[TQ3079, TQ3080,..]|
+-----------+-------------------+

bng_tessellate

Lightweight | Heavyweight

Tessellate a geometry into BNG cells with their geometries.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely. Polyfill plus a per-cell core/border split, returning one STRUCT<cellid, core, chip> per cell (chip geometry as WKB in EPSG:27700, no SRID). Border chips are filtered to the input geometry type, matching the heavyweight tier.

Signature: bng_tessellate(geometry: Column, resolution: Column): Column

Parameters:

  • geometry - Input geometry to tessellate
  • resolution - BNG resolution: integer index (e.g. 3 for 1 km) or string (e.g. '1km', '100m')

Returns:

  • Array of STRUCT<cellid, core, chip>, one per cell; chip is WKB in EPSG:27700 (no SRID)

SQL:

-- Polygon in EPSG:27700 eastings/northings; resolution index 3 = 1 km
SELECT gbx_bng_tessellate(
st_geomfromtext('POLYGON((530000 180000, 530000 182000, 532000 182000, 532000 180000, 530000 180000))'),
3
) as tessellation
Example output
+-------------------------------------------------+
|tessellation                                     |
+-------------------------------------------------+
|[{cellid=TQ3080, core=true, chip=[BINARY]}, ...] |
+-------------------------------------------------+

Aggregator Functions

Aggregate multiple cells into a single result.

bng_cellintersection_agg

Lightweight | Heavyweight | Grouped-agg UDF

Aggregate intersection of multiple BNG cells.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely. A grouped aggregate — groupBy(cellid).agg(gx.bng_cellintersection_agg("chip")) intersects a group's chips (all belonging to one cell) into the dissolved chip geometry.

Lightweight and heavyweight return different types

Heavyweight gbx_bng_cellintersection_agg returns a chip STRUCT<cellid, core, chip>; the lightweight function returns BINARY (the dissolved chip geometry, WKB in EPSG:27700) in both Python and SQL. A PySpark grouped-aggregate pandas_udf cannot return a StructType, so the lightweight aggregate emits the chip geometry — the one meaningful payload of a dissolved aggregate. The aggregate only ever combines chips from a single cell, so the heavyweight struct's cellid is exactly the group key you GROUP BY, and core is recoverable from whether the chip equals the full cell — neither carries information beyond the BINARY chip. To reconstruct the heavyweight STRUCT shape in SQL, project the group key as cellid:

-- Lightweight SQL: rebuild the (cellid, chip) the heavyweight struct would carry
SELECT
group_key AS cellid,
gbx_bng_cellintersection_agg(chip) AS chip
FROM cells
GROUP BY group_key

Signature: bng_cellintersection_agg(cellId: Column): Column

Parameters:

  • cellId - Column to aggregate. As described in the tier note above, each group holds the chips of a single cell.

Returns:

  • Heavyweight: chip STRUCT<cellid, core, chip>. Lightweight: BINARY, the dissolved chip geometry (WKB in EPSG:27700).

SQL:

-- Intersect each cell's chips (lightweight result shown; heavyweight returns the chip STRUCT)
SELECT
cellid,
gbx_bng_cellintersection_agg(chip) as common_chip
FROM cells
GROUP BY cellid;
Example output
+------+-----------+
|cellid|common_chip|
+------+-----------+
|TQ3080|[BINARY]   |
+------+-----------+

bng_cellunion_agg

Lightweight | Heavyweight | Grouped-agg UDF

Aggregate union of multiple BNG cells.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely. A grouped aggregate — groupBy(cellid).agg(gx.bng_cellunion_agg("chip")) dissolves a group's chips (all belonging to one cell) into the unioned chip geometry.

Lightweight and heavyweight return different types

Heavyweight gbx_bng_cellunion_agg returns a chip STRUCT<cellid, core, chip>; the lightweight function returns BINARY (the dissolved chip geometry, WKB in EPSG:27700) in both Python and SQL. A PySpark grouped-aggregate pandas_udf cannot return a StructType, so the lightweight aggregate emits the chip geometry — the one meaningful payload of a dissolved aggregate. The aggregate only ever combines chips from a single cell, so the heavyweight struct's cellid is exactly the group key you GROUP BY, and core is recoverable from whether the chip equals the full cell — neither carries information beyond the BINARY chip. To reconstruct the heavyweight STRUCT shape in SQL, project the group key as cellid:

-- Lightweight SQL: rebuild the (cellid, chip) the heavyweight struct would carry
SELECT
group_key AS cellid,
gbx_bng_cellunion_agg(chip) AS chip
FROM cells
GROUP BY group_key

Signature: bng_cellunion_agg(cellId: Column): Column

Parameters:

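  • cellId - Column to aggregate. As described in the tier note above, each group holds the chips of a single cell.

Returns:

  • Heavyweight: chip STRUCT<cellid, core, chip>. Lightweight: BINARY, the unioned chip geometry (WKB in EPSG:27700).

SQL:

-- Union each cell's chips (lightweight result shown; heavyweight returns the chip STRUCT)
SELECT
cellid,
gbx_bng_cellunion_agg(chip) as union_chip
FROM cells
GROUP BY cellid;
Example output
+------+----------+
|cellid|union_chip|
+------+----------+
|TQ3080|[BINARY]  |
+------+----------+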

Generator Functions

Explode array results into individual rows.

bng_kringexplode

Lightweight | Heavyweight | Streaming UDTF

Explode k-ring cells into separate rows.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala as a streaming UDTF — one output row per k-ring cell. Exact cell-set parity with the heavyweight tier.

Signature: bng_kringexplode(cellId: Column, k: Column): Column

Parameters:

  • cellId - BNG cell reference for center
  • k - Integer ring distance

Returns:

  • Exploded rows, one per cell in k-ring

SQL:

SELECT 
'TQ3080' as center_cell,
explode(gbx_bng_kring('TQ3080', 2)) as nearby_cell
Example output
+-----------+-----------+
|center_cell|nearby_cell|
+-----------+-----------+
|TQ3080     |TQ3079     |
|TQ3080     |TQ3081     |
+-----------+-----------+

bng_kloopexplode

Lightweight | Heavyweight | Streaming UDTF

Explode k-loop cells into separate rows.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala as a streaming UDTF — one output row per k-loop cell. Exact cell-set parity with the heavyweight tier.

Signature: bng_kloopexplode(cellId: Column, k: Column): Column

Parameters:

  • cellId - BNG cell reference for center
  • k - Integer ring distance

Returns:

  • Exploded rows, one per cell in k-loop

SQL:

SELECT
  'TQ3080' as center_cell,
  explode(gbx_bng_kloop('TQ3080', 2)) as ring_cell
Example output
+-----------+---------+
|center_cell|ring_cell|
+-----------+---------+
|TQ3080     |TQ3078   |
|TQ3080     |TQ3082   |
+-----------+---------+
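
The difference between a filled k-ring and a hollow k-loop can be illustrated with a minimal pure-Python sketch. This is not the GeoBrix implementation: it assumes square (Chebyshev-distance) neighbourhoods, handles only 1 km references, and simply drops neighbours that would fall outside the 100 km square. The helper names kring_1km and kloop_1km are hypothetical.

```python
def _split_1km(ref):
    """Split a 1 km reference such as 'TQ3080' into ('TQ', 30, 80)."""
    if len(ref) != 6 or not ref[:2].isalpha() or not ref[2:].isdigit():
        raise ValueError(f"expected a 1 km reference like 'TQ3080', got {ref!r}")
    return ref[:2].upper(), int(ref[2:4]), int(ref[4:6])


def _neighbours(ref, k, hollow):
    letters, e, n = _split_1km(ref)
    cells = []
    for dn in range(-k, k + 1):
        for de in range(-k, k + 1):
            # A hollow loop keeps only cells at distance exactly k.
            if hollow and max(abs(de), abs(dn)) != k:
                continue
            ee, nn = e + de, n + dn
            # Assumption: neighbours outside this 100 km square are dropped.
            if 0 <= ee <= 99 and 0 <= nn <= 99:
                cells.append(f"{letters}{ee:02d}{nn:02d}")
    return cells


def kring_1km(ref, k):
    """Filled neighbourhood: every cell within distance k, centre included."""
    return _neighbours(ref, k, hollow=False)


def kloop_1km(ref, k):
    """Hollow neighbourhood: only the cells at distance exactly k."""
    return _neighbours(ref, k, hollow=True)


print(len(kring_1km("TQ3080", 2)))  # 25 cells: (2k+1)^2
print(len(kloop_1km("TQ3080", 2)))  # 16 cells: 8k
```

Under these assumptions TQ3079 belongs to the k-ring for k = 2 but not to the k-loop, because it sits at distance 1 from the centre.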

bng_geomkringexplode

Lightweight | Heavyweight | Streaming UDTF

Explode geometry k-ring cells into separate rows.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely as a streaming UDTF — one output row per geometry k-ring cell. Exact cell-set parity with the heavyweight tier.

Signature: bng_geomkringexplode(geom: Column, resolution: Column, k: Column): Column

Parameters:

  • geom - Input geometry
  • resolution - BNG resolution: integer index (e.g. 3 for 1 km) or string (e.g. '1km', '100m')
  • k - Integer ring distance

Returns:

  • Exploded rows, one per cell

SQL:

SELECT explode(gbx_bng_geomkring(
  st_geomfromtext('POINT(-0.1278 51.5074)'), 3, 1
)) as cell
Example output
+-------+--------+
|cell_id|geom    |
+-------+--------+
|TQ3080 |POLYGON |
+-------+--------+

bng_geomkloopexplode

Lightweight | Heavyweight | Streaming UDTF

Explode geometry k-loop cells into separate rows.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely as a streaming UDTF — one output row per geometry k-loop cell. Exact cell-set parity with the heavyweight tier.

Signature: bng_geomkloopexplode(geom: Column, resolution: Column, k: Column): Column

Parameters:

  • geom - Input geometry
  • resolution - BNG resolution: integer index (e.g. 3 for 1 km) or string (e.g. '1km', '100m')
  • k - Integer ring distance

Returns:

  • Exploded rows, one per cell at distance k

SQL:

SELECT explode(gbx_bng_geomkloop(
  st_geomfromtext('POINT(-0.1278 51.5074)'), 3, 1
)) as cell
Example output
+-------+--------+
|cell_id|geom    |
+-------+--------+
|TQ3080 |POLYGON |
+-------+--------+

bng_tessellateexplode

Lightweight | Heavyweight | Streaming UDTF

Explode tessellated cells into separate rows.

Lightweight tier (pygx)

Powered by the pure-Python port of BNG.scala + shapely as a streaming UDTF — one output row per tessellated cell, each carrying the cell ID and chip geometry (WKB in EPSG:27700, no SRID).

Signature: bng_tessellateexplode(geometry: Column, resolution: Column): Column

Parameters:

  • geometry - Input geometry to tessellate
  • resolution - BNG resolution: integer index (e.g. 3 for 1 km) or string (e.g. '1km', '100m')

Returns:

  • Exploded rows with cell ID and geometry for each cell

SQL:

SELECT explode(gbx_bng_tessellate(
  st_geomfromtext('POLYGON((-0.1 51.5, -0.1 51.6, 0.0 51.6, 0.0 51.5, -0.1 51.5))'),
  3
)) as cell_info
Example output
+----------------------------------------+
|cell_info                               |
+----------------------------------------+
|{cellId=TQ3080, wkb=[BINARY], ...}      |
+----------------------------------------+

BNG Reference Format

Standard Format

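BNG references follow the pattern [Letters][Eastings][Northings]: two letters identifying the 100 km square, followed by easting digits and then the same number of northing digits. More digits mean a finer resolution.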

Common Resolutions

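| Resolution | Grid Size     | Example      | Use Case         |
|------------|---------------|--------------|------------------|
| 100000m    | 100km × 100km | TQ           | Regional         |
| 10000m     | 10km × 10km   | TQ38         | District         |
| 1000m      | 1km × 1km     | TQ3080       | Local area       |
| 100m       | 100m × 100m   | TQ308808     | Neighborhood     |
| 10m        | 10m × 10m     | TQ30808080   | Building         |
| 1m         | 1m × 1m       | TQ3080080800 | Precise location |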
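
As an illustration of the reference format, the following sketch decodes a reference into its cell size and the easting and northing of its south-west corner. It is a hypothetical helper (parse_bng_ref is not a GeoBrix function) that uses the standard Ordnance Survey letter layout, covers only the six resolutions listed above, and does not check that the letter pair is a square actually used in Great Britain.

```python
def parse_bng_ref(ref):
    """Return (letters, easting_m, northing_m, cell_size_m) for e.g. 'TQ3080'."""
    ref = ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if len(letters) != 2 or not letters.isalpha() or "I" in letters:
        raise ValueError(f"bad 100 km square letters in {ref!r}")
    if len(digits) % 2 or len(digits) > 10 or (digits and not digits.isdigit()):
        raise ValueError(f"bad easting/northing digits in {ref!r}")

    half = len(digits) // 2          # digits per axis: 0 (100 km) .. 5 (1 m)
    cell_size_m = 10 ** (5 - half)

    # Letter positions in the 5x5 lettering scheme; the letter I is skipped.
    l1, l2 = (ord(c) - ord("A") - (1 if c > "I" else 0) for c in letters)
    e100km = ((l1 - 2) % 5) * 5 + l2 % 5
    n100km = 19 - (l1 // 5) * 5 - l2 // 5

    easting = e100km * 100_000 + (int(digits[:half]) * cell_size_m if half else 0)
    northing = n100km * 100_000 + (int(digits[half:]) * cell_size_m if half else 0)
    return letters, easting, northing, cell_size_m


print(parse_bng_ref("TQ3080"))    # ('TQ', 530000, 180000, 1000)
print(parse_bng_ref("TQ308808"))  # ('TQ', 530800, 180800, 100)
```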

Major Grid Squares

Major 100km grid squares in Great Britain:

  • TQ - London area
  • SU - South Hampshire
  • NT - Edinburgh area
  • SD - Lancashire and the southern Lake District
  • ST - Bristol area

Custom Grid Functions

Lightweight | Heavyweight

Custom-grid functions run in both tiers. The lightweight tier is the pygx custom-grid package, a pure-Python port of the custom-grid system that uses shapely for geometry. It exposes identical gbx_custom_* SQL names, so it is a drop-in swap, and unlike quadbin (which relies on the external quadbin package) it needs no external grid library. Geometry inputs accept WKB, EWKB, WKT, or EWKT in both tiers. Geometry outputs are plain WKB with no SRID; the grid's srid is metadata only. Cell-id math and cell sets have exact parity with the heavyweight tier, enforced by the cross-tier parity suite, and gbx_custom_grid returns the same descriptor struct in both tiers.

A custom grid is a user-defined regular rectangular grid specified by its spatial extent, root cell size, and a recursive split factor. Cell IDs are BIGINT values; hierarchy is controlled by cell_splits (each level subdivides root cells into cell_splits x cell_splits sub-cells). Use custom grids when neither BNG nor quadbin matches your coordinate reference system or cell-size requirements — for example, a national grid in EPSG:27700 with non-standard tile sizes.
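
The recursive subdivision can be sketched in a few lines of Python. This is an illustration only, assuming that the cell size at a resolution is the root cell size divided by cell_splits raised to that resolution, and that cells are counted from the minimum bounds. GridSpec, cell_size, and point_to_col_row are hypothetical names, and the sketch does not reproduce how GridX packs a position into a BIGINT cell ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridSpec:
    """Hypothetical stand-in mirroring the fields of the grid descriptor."""
    bound_x_min: float
    bound_x_max: float
    bound_y_min: float
    bound_y_max: float
    cell_splits: int
    root_cell_size_x: float
    root_cell_size_y: float
    srid: int


def cell_size(grid, resolution):
    """Cell width and height at a resolution (0 = root cells)."""
    factor = grid.cell_splits ** resolution
    return grid.root_cell_size_x / factor, grid.root_cell_size_y / factor


def point_to_col_row(grid, x, y, resolution):
    """Column and row of the cell containing (x, y), counted from the minimum bounds."""
    inside_x = grid.bound_x_min <= x < grid.bound_x_max
    inside_y = grid.bound_y_min <= y < grid.bound_y_max
    if not (inside_x and inside_y):
        raise ValueError("point lies outside the grid extent")
    size_x, size_y = cell_size(grid, resolution)
    col = int((x - grid.bound_x_min) // size_x)
    row = int((y - grid.bound_y_min) // size_y)
    return col, row


grid = GridSpec(0, 1_000_000, 0, 1_000_000, 2, 1000, 1000, 27700)
print(cell_size(grid, 5))                             # (31.25, 31.25)
print(point_to_col_row(grid, 530010.0, 180020.0, 5))  # (16960, 5760)
```

With a 1000-unit root cell and cell_splits = 2, resolution 5 gives 1000 / 2^5 = 31.25 units per cell.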

custom_grid

Lightweight | Heavyweight

Define a user-specified regular grid from its bounding extent, root cell size, split factor, and SRID.

Signature: gbx_custom_grid(boundXMin, boundXMax, boundYMin, boundYMax, cellSplits, rootCellSizeX, rootCellSizeY, srid)

Parameters:

  • boundXMin — minimum X bound of the grid extent
  • boundXMax — maximum X bound of the grid extent
  • boundYMin — minimum Y bound of the grid extent
  • boundYMax — maximum Y bound of the grid extent
  • cellSplits — number of splits per axis at each resolution level (e.g. 2 = 2x2 = 4 sub-cells per step)
  • rootCellSizeX — root cell width in CRS units
  • rootCellSizeY — root cell height in CRS units
  • srid — spatial reference ID (e.g. 27700 for BNG)

Returns:

  • STRUCT<bound_x_min, bound_x_max, bound_y_min, bound_y_max, cell_splits, root_cell_size_x, root_cell_size_y, srid> — a grid descriptor struct passed to all other gbx_custom_* functions.

SQL:

SELECT gbx_custom_grid(0, 1000000, 0, 1000000, 2, 1000, 1000, 27700) AS grid;
Example output
+----------------------------------------------+
|grid                                          |
+----------------------------------------------+
|{0, 1000000, 0, 1000000, 2, 1000, 1000, 27700}|
+----------------------------------------------+

custom_pointascell

Lightweight | Heavyweight

Index a point geometry into a custom grid cell ID at the specified resolution level.

Signature: gbx_custom_pointascell(point, grid, resolution)

Parameters:

  • point — point geometry as WKT (STRING) or WKB (BINARY) in the grid's CRS
  • grid — custom grid descriptor returned by gbx_custom_grid
  • resolution — resolution level (integer; 0 = root cells, higher = finer)

Returns:

  • BIGINT cell ID encoding the grid position at the given resolution

SQL:

SELECT gbx_custom_pointascell(geom, gbx_custom_grid(0, 1000000, 0, 1000000, 2, 1000, 1000, 27700), 5) AS cell FROM points;
Example output
+----------+
|cell      |
+----------+
|8444249301|
+----------+

custom_cellaswkb

Lightweight | Heavyweight

Return the WKB footprint polygon of a custom grid cell.

Signature: gbx_custom_cellaswkb(cell, grid)

Parameters:

  • cell - BIGINT cell ID
  • grid - custom grid descriptor returned by gbx_custom_grid

Returns:

  • BINARY WKB polygon representing the cell boundary

SQL:

SELECT gbx_custom_cellaswkb(cell, gbx_custom_grid(0, 1000000, 0, 1000000, 2, 1000, 1000, 27700)) AS geom FROM cells;
Example output
+--------+
|geom    |
+--------+
|[BINARY]|
+--------+

custom_cellaswkt

Lightweight | Heavyweight

Return the WKT footprint polygon of a custom grid cell.

Signature: gbx_custom_cellaswkt(cell, grid)

Parameters:

  • cell - BIGINT cell ID
  • grid - custom grid descriptor returned by gbx_custom_grid

Returns:

  • STRING WKT polygon representing the cell boundary

SQL:

SELECT gbx_custom_cellaswkt(cell, gbx_custom_grid(0, 1000000, 0, 1000000, 2, 1000, 1000, 27700)) AS wkt FROM cells;
Example output
+---------------------------------------------------------------+
|wkt                                                            |
+---------------------------------------------------------------+
|POLYGON ((530000 180000, 530031.25 180000, 530031.25 180031.25,|
|530000 180031.25, 530000 180000))                              |
+---------------------------------------------------------------+
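
For intuition, a footprint of this shape can be reproduced from a cell's column and row with a short sketch. The function below is hypothetical and assumes cells are axis-aligned rectangles counted from the grid's minimum bounds, with vertices listed counter-clockwise from the lower-left corner; it does not decode a BIGINT cell ID.

```python
def cell_footprint_wkt(col, row, size_x, size_y, origin_x=0.0, origin_y=0.0):
    """WKT polygon for the cell at (col, row); all arguments are illustrative."""
    def fmt(value):
        # Up to 10 significant digits, without trailing zeros.
        return f"{value:.10g}"

    x_min = origin_x + col * size_x
    y_min = origin_y + row * size_y
    x_max, y_max = x_min + size_x, y_min + size_y
    corners = [
        (x_min, y_min),
        (x_max, y_min),
        (x_max, y_max),
        (x_min, y_max),
        (x_min, y_min),  # close the ring
    ]
    ring = ", ".join(f"{fmt(x)} {fmt(y)}" for x, y in corners)
    return f"POLYGON (({ring}))"


# Resolution 5 of a grid with 1000-unit root cells and cell_splits = 2
# gives 31.25-unit cells.
print(cell_footprint_wkt(16960, 5760, 31.25, 31.25))
```

For column 16960 and row 5760 this prints the same polygon as the example output above.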

custom_centroid

Lightweight | Heavyweight

Return the centroid of a custom grid cell as a WKB point.

Signature: gbx_custom_centroid(cell, grid)

Parameters:

  • cell - BIGINT cell ID
  • grid - custom grid descriptor returned by gbx_custom_grid

Returns:

  • BINARY WKB point at the cell center (in the grid's CRS)

SQL:

SELECT gbx_custom_centroid(cell, gbx_custom_grid(0, 1000000, 0, 1000000, 2, 1000, 1000, 27700)) AS centroid FROM cells;
Example output
+--------+
|centroid|
+--------+
|[BINARY]|
+--------+

custom_polyfill

Lightweight | Heavyweight

Fill a geometry with all custom grid cell IDs at the specified resolution.

Signature: gbx_custom_polyfill(geom, grid, resolution)

Parameters:

  • geom — input geometry as WKT (STRING) or WKB (BINARY) in the grid's CRS
  • grid — custom grid descriptor returned by gbx_custom_grid
  • resolution — resolution level (integer; higher = finer cells)

Returns:

  • ARRAY<BIGINT> of cell IDs whose footprints intersect the geometry

SQL:

SELECT region_id, gbx_custom_polyfill(geom, gbx_custom_grid(0, 1000000, 0, 1000000, 2, 1000, 1000, 27700), 5) AS cells FROM regions;
Example output
+---------+-----------------------------------------+
|region_id|cells                                    |
+---------+-----------------------------------------+
|R-01     |[8444249301, 8444249302, 8444249567, ...]|
+---------+-----------------------------------------+

custom_kring

Lightweight | Heavyweight

Return all custom grid cells within k steps of a center cell (filled neighborhood, Chebyshev distance).

Signature: gbx_custom_kring(cell, grid, k)

Parameters:

  • cell - BIGINT center cell ID
  • grid - custom grid descriptor returned by gbx_custom_grid
  • k - integer ring distance (0 = center only; 1 = 3x3 neighborhood including center)

Returns:

  • ARRAY<BIGINT> of cell IDs within distance k (up to (2k+1)^2 cells)

SQL:

SELECT cell, gbx_custom_kring(cell, gbx_custom_grid(0, 1000000, 0, 1000000, 2, 1000, 1000, 27700), 1) AS ring FROM cells;
Example output
+----------+------------------------------------------------------------+
|cell      |ring                                                        |
+----------+------------------------------------------------------------+
|8444249301|[8444248813, 8444248814, 8444248815, 8444249300, 8444249301,|
|          |8444249302, 8444249789, 8444249790, 8444249791]             |
+----------+------------------------------------------------------------+
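
The "up to (2k+1)^2" bound can be seen in a small sketch that works on column and row indices instead of BIGINT cell IDs. It assumes, as an illustration only, that neighbours falling outside the grid extent are dropped, which is why a cell near the edge yields fewer cells. kring_col_row is a hypothetical helper, not part of GridX.

```python
def kring_col_row(col, row, k, n_cols, n_rows):
    """Cells within Chebyshev distance k of (col, row), clipped to the grid."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return [
        (c, r)
        for r in range(row - k, row + k + 1)
        for c in range(col - k, col + k + 1)
        if 0 <= c < n_cols and 0 <= r < n_rows  # assumption: clip at the extent
    ]


print(len(kring_col_row(5, 5, 1, 10, 10)))  # 9: the full 3x3 block
print(kring_col_row(0, 0, 1, 10, 10))       # [(0, 0), (1, 0), (0, 1), (1, 1)]
```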

Performance Tips

1. Use Appropriate Resolution

Choose resolution based on analysis needs:

  • Coarse (10km) for regional analysis
  • Medium (1km) for local patterns
  • Fine (100m) for detailed studies
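
To see why resolution matters, a rough cell count per resolution can be worked out with a few lines of Python. The sketch assumes the area of interest is a rectangle aligned to the grid; cells_to_cover is a hypothetical helper for this arithmetic only.

```python
BNG_CELL_SIZES_M = {"10km": 10_000, "1km": 1_000, "100m": 100}


def cells_to_cover(width_m, height_m, cell_size_m):
    """Number of grid-aligned cells needed to cover a width x height rectangle."""
    cols = -(-width_m // cell_size_m)   # ceiling division
    rows = -(-height_m // cell_size_m)
    return cols * rows


# One 100 km x 100 km square at each resolution.
for label, size_m in BNG_CELL_SIZES_M.items():
    print(label, cells_to_cover(100_000, 100_000, size_m))
# 10km 100
# 1km 10000
# 100m 1000000
```

Each step to a finer resolution in this list multiplies the cell count by 100, so pick the coarsest resolution that still answers the question.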

2. Leverage Aggregators

Use aggregator functions for efficient grouping:

SELECT
  region,
  gbx_bng_cellunion_agg(cell_id) as bounding_cell
FROM observations
GROUP BY region;
Example output
+------+-------------+
|region|bounding_cell|
+------+-------------+
|South |TQ3080       |
+------+-------------+

3. Use Generators for Expansion

Generator functions (e.g. bng_kringexplode, bng_kloopexplode) emit one row per cell as a streaming UDTF, which is more efficient than building an array and then calling explode on it. See the Generator Functions section above.


Next Steps