Vector

Interactive Maps

For multi-layer interactive maps, see the Multi-Layer Compositor.

plot_interactive is the interactive twin of plot_static — same input model (a geopandas.GeoDataFrame, or a Spark DataFrame via geom_col / grid_system), but it renders a pan/zoom folium map instead of a static matplotlib figure. It is the recommended way to get an interactive map from GeoBrix results: it is scale-safe and Databricks-safe, neither of which a bare GeoDataFrame.explore() is.

  • Scale-safe. A raw .explore() embeds every vertex inline, so a few million vertices hang folium and render a blank map. plot_interactive counts vertices and, above a threshold, automatically falls back to a rasterized image overlay (a viridis PNG laid over the basemap) that scales to millions of vertices.
  • Databricks-safe. It renders via displayHTML internally. Returning a folium map (or calling .explore()) does not auto-render in a Databricks notebook; plot_interactive does the right thing in both Databricks and plain Jupyter.
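
The vertex-count fallback described in the first bullet can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: count_vertices and choose_path are hypothetical helpers, geometries are plain GeoJSON-style dicts, and the 60_000 threshold mirrors the max_vertices default shown in the signature.

```python
def count_vertices(coords) -> int:
    """Count coordinate positions in a nested GeoJSON-style coordinate array."""
    if coords and isinstance(coords[0], (int, float)):
        return 1  # a single [x, y] position
    return sum(count_vertices(part) for part in coords)


def choose_path(geometries, mode="auto", max_vertices=60_000):
    """Hypothetical helper: return (rendering path, total vertex count)."""
    total = sum(count_vertices(g["coordinates"]) for g in geometries)
    if mode == "detailed":
        return "vector", total  # always vector, even above the threshold
    if mode == "fast":
        return "raster", total  # always the image overlay
    if mode != "auto":
        raise ValueError(f"unknown mode: {mode!r}")
    return ("vector" if total <= max_vertices else "raster"), total


# A closed unit square has five positions (the first one is repeated).
square = {
    "type": "Polygon",
    "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
}
path, total = choose_path([square])
```

Lowering max_vertices below the total flips the "auto" result to the raster path, while "detailed" keeps the vector path regardless.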

plot_interactive

plot_interactive(
    data, *, column=None, geom_col=None, grid_system=None, grid_conf=None,
    max_rows=10_000, sample_seed=None, srid=None, mode="auto",
    max_vertices=60_000, max_px=1400, opacity=0.65, debug_level=1, **explore_kw,
)

data is a Spark DataFrame or a geopandas.GeoDataFrame, resolved exactly as plot_static does: set grid_system to decode cell ids ('h3', 'quadbin', 'bng', or 'custom' with grid_conf=) to boundary polygons; otherwise the geometry column is read via geom_col (auto-detected if omitted). Rendering is the function's final step: in Databricks it calls displayHTML and returns None; in plain Jupyter it returns the folium map, which auto-renders when the call is the last expression in the cell. Extra keyword arguments are forwarded to .explore().
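
The two return behaviours can be illustrated with a small dispatcher. This is only a sketch: render and in_databricks are hypothetical names, the display callback stands in for displayHTML, and checking the DATABRICKS_RUNTIME_VERSION environment variable is an assumption of this example rather than a statement about how plot_interactive detects its environment.

```python
import os


def in_databricks() -> bool:
    # Assumption of this sketch: the runtime exposes this environment variable.
    return "DATABRICKS_RUNTIME_VERSION" in os.environ


def render(map_obj, html, display_html=None):
    """Hypothetical dispatcher: display in Databricks, return the map elsewhere."""
    if in_databricks() and display_html is not None:
        display_html(html)  # side effect only, nothing to return
        return None
    return map_obj  # Jupyter shows a returned object if it ends the cell
```

Because the Databricks branch returns None, code that needs the map object itself (for example to add layers) has to be written with that in mind.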

The mode parameter picks the rendering path:

| mode value | Behaviour |
| --- | --- |
| "auto" (default) | Full vector .explore() when the total vertex count is ≤ max_vertices, otherwise the raster image overlay. Announces the chosen path at debug_level >= 1. |
| "detailed" | Always geopandas .explore(): full vector, with per-feature hover / popups. Over max_vertices it warns that rendering may be slow and proceeds anyway. |
| "fast" | Always rasterize the polygons to a PNG folium ImageOverlay: complete and scales to millions of vertices, but a flat image with no per-feature hover. |

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| column | str or None | None | Column to colour by (choropleth for .explore(); value/category ramp for the overlay). |
| geom_col | str or None | None | Geometry column on a Spark DataFrame (auto-detected if omitted). |
| grid_system | str or None | None | Decode the column as DGGS cell ids: 'h3', 'quadbin', 'bng', 'custom'. |
| grid_conf | Row, dict or None | None | Grid spec for grid_system='custom'. |
| max_rows | int or None | 10_000 | Row cap before the Spark → GeoDataFrame collect; None collects everything. |
| sample_seed | int or None | None | How the max_rows cap is filled: None takes the first max_rows rows; an int draws a reproducible seeded sample. |
| srid | int or None | None | CRS override for the input geometries. |
| mode | str | "auto" | "auto" / "detailed" / "fast" (see table above). |
| max_vertices | int | 60_000 | Auto crossover threshold (and the detailed-mode slow-render warning trigger). |
| max_px | int | 1400 | Longest-edge pixel resolution of the fast raster overlay. |
| opacity | float | 0.65 | Overlay fill opacity. |
| debug_level | int | 1 | 0 silent / 1 key decisions + warnings / 2+ verbose internals. |

from databricks.labs.gbx.vizx import plot_interactive

# Interactive H3 choropleth straight from a Spark DataFrame — picks the vector
# path for a small cell set, the raster overlay automatically once it's large.
plot_interactive(cells_df, grid_system="h3", column="band_level")
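
The interaction between max_rows and sample_seed can be illustrated on an in-memory list. This is a sketch of the documented semantics only: cap_rows is a hypothetical helper, and the real function applies the cap on the Spark side before collecting, which this example does not attempt to reproduce.

```python
import random


def cap_rows(rows, max_rows=10_000, sample_seed=None):
    """Hypothetical helper mirroring the max_rows / sample_seed semantics."""
    rows = list(rows)
    if max_rows is None or len(rows) <= max_rows:
        return rows  # no cap needed, or cap disabled
    if sample_seed is None:
        return rows[:max_rows]  # first max_rows rows
    # Same seed and same input give the same sample on every call.
    return random.Random(sample_seed).sample(rows, max_rows)


head = cap_rows(range(10), max_rows=3)
sample_a = cap_rows(range(10), max_rows=3, sample_seed=7)
sample_b = cap_rows(range(10), max_rows=3, sample_seed=7)
```

Taking the first rows is cheap but can bias coverage toward whatever the input order favours; a seeded sample spreads the capped rows across the frame while staying reproducible.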

When the input is many small geometries (so the max_rows cap fires on the fast path), plot_interactive advises pre-aggregating with st_union_agg / rst_h3_rasterize_agg so the frame stays few-rows-many-vertices and the cap never truncates coverage.

Static Maps

For compositing multiple layers in a single map, see the Multi-Layer Compositor — it explains the embed-size ladder, Layer types, and the audit/simplify workflow.

plot_static renders Spark- or GeoPandas-derived geometries (or H3 cells) over a basemap as a static matplotlib figure — the GitHub-renderable counterpart to GeoDataFrame.explore() (whose Leaflet/folium output renders a blank "Make this Notebook Trusted" placeholder on GitHub and the docs site).

The basemap is fetched from a web tile server (via contextily) at execution time and rasterized into the figure, so it bakes into the committed notebook output PNG — GitHub then displays it with no network. If the executing environment has no egress, the map renders without a basemap and a warning is emitted (never a hard error).
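
The "warn, never fail" behaviour is a common pattern and can be sketched without any mapping library. The names below are hypothetical: fetch_tiles is a stand-in callable for the tile download, and catching OSError (the base class of the built-in ConnectionError and TimeoutError) is an assumption about which failures count as "no egress".

```python
import warnings


def add_basemap_or_warn(fetch_tiles):
    """Hypothetical helper: return tiles, or None plus a warning when offline."""
    try:
        return fetch_tiles()
    except OSError as exc:  # covers ConnectionError and TimeoutError
        warnings.warn(f"basemap skipped: {exc}", RuntimeWarning)
        return None


def offline_fetch():
    # Simulates an environment without network egress.
    raise ConnectionError("no egress")
```

With this shape the caller still gets a figure in an offline environment; only the tile layer is missing.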

plot_static

plot_static(
    data, *, geom_col=None, grid_system=None, grid_conf=None, column=None,
    cmap="viridis", legend=True, basemap=True, basemap_source=None, alpha=0.8,
    edgecolor="face", fill=True, markersize=None, title=None, fig_w=10, fig_h=10,
    max_rows=10_000, srid=None, ax=None,
)

data is a Spark DataFrame or a geopandas.GeoDataFrame. Returns the matplotlib Axes; pass it back via ax= to overlay layers on one map. Every layer is reprojected to Web Mercator (EPSG:3857), so a basemap=False overlay lines up with a basemap layer on the same axes. plot_static does not call pyplot.show() — in a notebook the figure auto-displays at cell end with all overlaid layers; a script can call plt.show() itself. Pass fill=False to draw geometries as outlines only (no face), so a boundary doesn't cover the layers beneath it.
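
To make the shared Web Mercator frame concrete, here is the standard spherical Mercator forward formula for a lon/lat pair. It is shown for illustration only; lonlat_to_web_mercator is a hypothetical helper, and this sketch makes no claim about which projection library plot_static uses internally.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project a WGS84 lon/lat pair (degrees) to EPSG:3857 metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


x_edge, y_equator = lonlat_to_web_mercator(180.0, 0.0)
```

Since every layer ends up in the same metre-based coordinates, an outline drawn with basemap=False shares axes limits with the tiled layer beneath it.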

Geometry columns accept the same encodings as every other gbx_st_* function — WKT, EWKT, WKB, EWKB, and native GEOMETRY / GEOGRAPHY (coerced in-Spark via st_asbinary). Set grid_system to treat the column as DGGS cell ids instead:

| grid_system | Behaviour |
| --- | --- |
| None (default) | Column is a geometry encoding (WKT/EWKT/WKB/EWKB/GEOMETRY/GEOGRAPHY). |
| 'h3' | H3 cell ids (string index or bigint) → cell-boundary polygons (lon/lat). |
| 'quadbin' | Quadbin cell ids (bigint) → tile-boundary polygons (lon/lat). |
| 'bng' | British National Grid cell ids (string, e.g. "TQ38") → cell polygons in EPSG:27700. |
| 'custom' | Custom-grid cell ids → cell polygons; requires grid_conf= (the grid spec that defines the grid). CRS comes from the grid's srid. |

Cell boundaries reuse the lightweight GridX cell→geometry implementations, so they match gbx_h3_* / gbx_quadbin_aswkb / gbx_bng_aswkb / gbx_custom_cellaswkb exactly. Each grid's native CRS is honoured (BNG is metres in EPSG:27700) and reprojected to Web Mercator for the basemap. A custom grid whose srid is <= 0 has no CRS, so its basemap is skipped with a warning; pass basemap=False to skip the basemap explicitly in that case.
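
As an illustration of what "cell polygons in EPSG:27700" means for an id like "TQ38", the sketch below decodes a standard Ordnance Survey grid reference (two letters plus an even number of digits) into its bounding box in metres. It is an assumption that this form covers the ids in question; bng_cell_bounds is a hypothetical helper, it does not handle any other id variants, and it is not the GridX implementation.

```python
GRID_LETTERS = "ABCDEFGHJKLMNOPQRSTUVWXYZ"  # the letter I is not used


def bng_cell_bounds(ref):
    """Return (xmin, ymin, xmax, ymax) in EPSG:27700 metres for an OS grid ref."""
    ref = ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    half, odd = divmod(len(digits), 2)
    if len(letters) != 2 or not set(letters) <= set(GRID_LETTERS):
        raise ValueError(f"bad grid letters in {ref!r}")
    if odd or half > 5 or (digits and not digits.isdigit()):
        raise ValueError(f"bad grid digits in {ref!r}")
    l1, l2 = (GRID_LETTERS.index(c) for c in letters)
    # First letter picks a 500 km square, second a 100 km square inside it.
    e100 = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100 = 19 - (l1 // 5) * 5 - (l2 // 5)
    size = 100_000 // 10 ** half  # cell edge length in metres
    east = e100 * 100_000 + (int(digits[:half]) if half else 0) * size
    north = n100 * 100_000 + (int(digits[half:]) if half else 0) * size
    return east, north, east + size, north + size


tq38 = bng_cell_bounds("TQ38")
```

Each additional digit pair divides the cell edge by ten, so "TQ38" is a 10 km cell and "TQ" alone is the enclosing 100 km square.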

For a custom grid, grid_conf is the grid-spec Row/dict — the same struct the custom GridX functions consume (bound_x_min/bound_x_max, bound_y_min/bound_y_max, cell_splits, root_cell_size_x/root_cell_size_y, srid):

plot_static(cells_df, grid_system="custom", grid_conf=grid_spec, column="value")

from databricks.labs.gbx.vizx import plot_static

# H3 choropleth over a basemap, then overlay the shared-canvas boundary as a
# red outline (fill=False so it doesn't cover the cells; basemap=False so it
# doesn't re-fetch tiles). Both layers reproject to 3857, so they align.
ax = plot_static(cells_df, grid_system="h3", column="count", title="Coverage")
plot_static(grid_gdf, ax=ax, fill=False, edgecolor="red", basemap=False)

basemap_source overrides the default contextily.providers.CartoDB.Positron; basemap=False skips tiles entirely (deterministic, no network).

Adapters

The vector adapters collect to the driver for interactive mapping. By default they cap the collect at max_rows=10_000 and emit a warning if the input has more rows (pass max_rows=None to collect everything — at your own risk on large frames). The returned GeoDataFrame is in EPSG:4326, ready for .plot() (matplotlib) or — for an interactive map — plot_interactive.

For interactive (pan/zoom) maps, prefer plot_interactive over a raw GeoDataFrame.explore(): it is scale-safe (bare .explore() embeds every vertex inline, so millions of vertices hang folium and render blank) and Databricks-safe (it renders via displayHTML, whereas a returned folium map does not auto-render in a Databricks notebook). Calling gdf.explore() directly still works for small in-memory frames in plain Jupyter.

as_gdf

as_gdf(df, wkt_col="wkt", *, max_rows=10_000)

Convert a Spark DataFrame with a WKT geometry column into a geopandas.GeoDataFrame. Non-geometry columns are preserved; the WKT column is replaced by the geometry:

from databricks.labs.gbx.vizx import as_gdf

gdf = as_gdf(df_with_wkt, wkt_col="wkt")
gdf.explore() # interactive folium map (needs the [vizx] extra)

cells_as_gdf

cells_as_gdf(df, cell_col="cellid", extra_cols=(), *, max_rows=10_000, dissolve_by=None)

Convert a DataFrame of H3 cell ids into a GeoDataFrame of cell-boundary polygons (boundaries computed with the h3 library). Carry through attribute columns with extra_cols.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| df | DataFrame | | Spark DataFrame with an H3 cell-id column. |
| cell_col | str | "cellid" | Name of the column containing H3 cell ids (bigint). |
| extra_cols | tuple or list | () | Additional columns to carry through to the GeoDataFrame. |
| max_rows | int or None | 10_000 | Row cap before the collect; None collects all rows. |
| dissolve_by | str or None | None | When set, dissolve cell polygons by this column, returning one footprint polygon per distinct value instead of one row per cell. Must be in extra_cols. |

Without dissolve_by, each row represents one cell (useful for per-cell tooltips in .explore()). With dissolve_by, rows are merged per group into a single union footprint — far fewer geometries for large cell sets:

from databricks.labs.gbx.vizx import cells_as_gdf

# Per-cell choropleth:
gdf = cells_as_gdf(df_cells, cell_col="cellid", extra_cols=["count"])
gdf.explore(column="count") # choropleth (needs mapclassify, bundled in [vizx])

# Dissolve to one footprint polygon per category:
gdf_dissolved = cells_as_gdf(
    df_cells,
    cell_col="cellid",
    extra_cols=["category"],
    dissolve_by="category",
)
gdf_dissolved.explore()

grid_as_gdf

grid_as_gdf(grid, srid=None)

Convert a grid spec — as returned by rst_h3_gridspec — into a 1-row GeoDataFrame of its bounding-box rectangle in EPSG:4326. Compose with cells_as_gdf(...).explore() to overlay the shared canvas boundary over the H3 cells it contains.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| grid | Row, dict | | A Spark Row or dict with xmin, ymin, xmax, ymax fields (the struct that rst_h3_gridspec returns in its grid field). |
| srid | int or None | None | CRS override. Falls back to the grid's own srid field; if both are absent, EPSG:4326 is assumed. When the source CRS is not 4326, the bounding box is reprojected via pyproj. |

Optional metadata fields pixel_size, width, and height are carried through if present on the input.
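
The bounding-box rectangle and the srid fallback order can be sketched with plain dicts. Both helpers are hypothetical and only mirror the behaviour described in the table: bbox_ring builds a closed ring from the four bounds, and resolve_srid applies the override, then the grid's own srid, then 4326. Reprojection is left out, and a Spark Row would first need converting with asDict().

```python
def bbox_ring(grid):
    """Hypothetical helper: closed rectangle ring from xmin/ymin/xmax/ymax."""
    xmin, ymin, xmax, ymax = (grid[k] for k in ("xmin", "ymin", "xmax", "ymax"))
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("grid bounds are empty or inverted")
    # Counter-clockwise, with the first position repeated to close the ring.
    return [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax), (xmin, ymin)]


def resolve_srid(grid, srid=None):
    """Hypothetical helper: explicit override, else grid srid, else 4326."""
    if srid is not None:
        return srid
    grid_srid = grid.get("srid")
    return grid_srid if grid_srid is not None else 4326


# Illustrative values only, not taken from a real rst_h3_gridspec result.
grid = {"xmin": -0.5, "ymin": 51.3, "xmax": 0.3, "ymax": 51.7}
ring = bbox_ring(grid)
```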

from databricks.labs.gbx.vizx import cells_as_gdf, grid_as_gdf

# grid_row is the 'grid' field from an rst_h3_gridspec result
grid_gdf = grid_as_gdf(grid_row)
cells_gdf = cells_as_gdf(df_cells, cell_col="cellid", extra_cols=["count"])

# Compose: H3 cells as a choropleth with the canvas boundary drawn on top
m = cells_gdf.explore(column="count")
grid_gdf.explore(m=m, color="black", style_kwds={"fill": False})

See the H3 rasterize notebook for a worked example pairing grid_as_gdf with cells_as_gdf to build a shared-canvas overlay.