STAC Client
databricks.labs.gbx.stac.StacClient is a lightweight, Serverless-safe client for distributed STAC search, resilient asset download, and repair of invalid files. It works against any STAC catalog and defaults to Planetary Computer.
Where a single-node STAC script serializes search requests and downloads, StacClient fans both operations out across the Spark cluster — one task per AOI row (search) and one task per asset (download). On Serverless, parallelism is controlled via partitions= and DataFrame.repartition(), with no spark.conf.set calls.
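The two-stage fan-out described above can be sketched offline with the standard library. This is only an illustration of the pattern, not the StacClient API: a thread pool stands in for Spark tasks, and FAKE_CATALOG, search_one_aoi, download_one_asset, and fan_out are hypothetical names invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a STAC catalog: AOI id -> matching asset URLs.
# A real search would query a STAC API; a lookup table keeps the sketch offline.
FAKE_CATALOG = {
    "aoi_a": ["https://example.com/a1.tif", "https://example.com/a2.tif"],
    "aoi_b": ["https://example.com/b1.tif"],
}


def search_one_aoi(aoi_id):
    """One search task: return the asset URLs for a single AOI."""
    return FAKE_CATALOG.get(aoi_id, [])


def download_one_asset(url):
    """One download task: pretend to fetch a URL and return (url, filename)."""
    return url, url.rsplit("/", 1)[-1]


def fan_out(aoi_ids, max_workers=4):
    """Run one task per AOI, flatten the results, then run one task per asset."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Stage 1: searches run concurrently, one per AOI.
        per_aoi = list(pool.map(search_one_aoi, aoi_ids))
        assets = [url for urls in per_aoi for url in urls]
        # Stage 2: downloads run concurrently, one per asset.
        return list(pool.map(download_one_asset, assets))


results = fan_out(["aoi_a", "aoi_b"])
```

Here max_workers plays a role only loosely analogous to the partition count on a cluster: it caps how many tasks run at once without changing the result.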
StacClient requires geobrix[light,stac]. The [stac] extra pulls in pystac-client, planetary-computer, tenacity, and requests. Serverless environment version 5 (Python 3.12) is required.
Ready-made downloaders
For common sources you usually don't call StacClient directly — the gbx.sample package ships AOI-driven downloaders that wrap it (or the same distributed STAC pattern) with source-specific defaults, each following a discover → download → read flow:
| Downloader | Source | STAC backing |
|---|---|---|
| Overture Maps Downloader (OvertureClient) | Overture Maps (buildings, places, …) | Overture's static STAC catalog via the overturemaps CLI |
| NAIP Aerial Imagery Downloader (NaipDownloader) | NAIP 1 m aerial imagery (US) | StacClient on Planetary Computer |
| 3DEP Downloader (DEM) (DemDownloader) | USGS 3DEP elevation (US) | StacClient on Planetary Computer |
Reach for StacClient directly when you need a catalog these don't cover, or full control over the search → download → repair flow described below.
Installation
pip install "geobrix[light,stac]"
From a Databricks notebook (Serverless or classic):
%pip install --quiet "geobrix[light,stac] @ file:///Volumes/<catalog>/<schema>/<volume>/geobrix-0.4.0-py3-none-any.whl"
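After installing, a quick preflight check can confirm that the environment matches the requirements stated earlier (the four packages pulled in by the [stac] extra, and Python 3.12). The sketch below is not part of geobrix; missing_modules and python_ok are hypothetical helper names, and the import names are assumed to be the standard ones for those four distributions.

```python
import importlib.util
import sys

# Import names for the packages the [stac] extra pulls in
# (pystac-client, planetary-computer, tenacity, requests).
STAC_EXTRA_MODULES = ("pystac_client", "planetary_computer", "tenacity", "requests")


def missing_modules(names=STAC_EXTRA_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(minimum=(3, 12)):
    """Return True if the running interpreter is at least the given (major, minor)."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing modules:", missing_modules())
```

An empty list from missing_modules() means all four dependencies are importable; it does not verify their versions.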