OGR Reader

The OGR reader provides generic support for reading vector data formats through the OGR library. This is the base reader that powers all vector format readers in GeoBrix.

Format Name

ogr

Overview

The OGR reader is a generic vector data reader that can handle any format supported by OGR/GDAL. While GeoBrix provides named readers for common formats (Shapefile, GeoJSON, GeoPackage, etc.), you can use the OGR reader directly for any available format.

Available Formats

The OGR reader can work with many OGR vector drivers, including:

ESRI Shapefile (.shp)
GeoJSON (.geojson, .json)
GeoPackage (.gpkg)
File Geodatabase (.gdb)
KML (.kml)
GML (.gml)
CSV with geometry (.csv)
PostgreSQL/PostGIS
And 80+ more formats

Format Availability

Support varies by format. Not every OGR driver is available by default; some require additional packages or drivers to be installed in your environment.
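
One way to see which drivers an environment offers is to ask GDAL directly. The sketch below assumes the GDAL Python bindings (`osgeo`) are installed on the node where it runs; that is an assumption about your environment, not something GeoBrix is documented to provide. The helper name `available_ogr_drivers` is hypothetical. The list reflects the GDAL build visible to that Python process, which may differ from the build the reader uses on the cluster.

```python
def available_ogr_drivers():
    """Return sorted OGR driver names, or None if the GDAL bindings are missing."""
    try:
        # GDAL Python bindings; assumed to be installed, not part of GeoBrix.
        from osgeo import ogr
    except ImportError:
        return None
    return sorted(ogr.GetDriver(i).GetName() for i in range(ogr.GetDriverCount()))


drivers = available_ogr_drivers()
if drivers is None:
    print("GDAL Python bindings not installed; cannot list drivers here.")
else:
    # OGR driver names such as "GeoJSON" or "ESRI Shapefile" appear in this list.
    print(f"{len(drivers)} OGR drivers found")
    print("GeoJSON available:", "GeoJSON" in drivers)
```

If a format you need is missing from the list, that usually means the corresponding driver was not compiled into, or installed alongside, the GDAL build in use.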

Basic Usage

Python

# OGR reader (sample-data Volumes path)
df = spark.read.format("ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/boroughs/nyc_boroughs.geojson")
df.show()

Example output
+--------------------+-----------+-----+
|geom_0              |geom_0_srid|...  |
+--------------------+-----------+-----+
|[BINARY]            |4326       |...  |
|...                 |...        |...  |
+--------------------+-----------+-----+

Scala

// OGR reader (sample-data Volumes path)
val df = spark.read.format("ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/boroughs/nyc_boroughs.geojson")
df.show()

Example output
+--------------------+-----------+-----+
|geom_0              |geom_0_srid|...  |
+--------------------+-----------+-----+
|[BINARY]            |4326       |...  |
|...                 |...        |...  |
+--------------------+-----------+-----+

SQL

-- Read with OGR in SQL (sample-data Volumes path)
SELECT * FROM ogr.`/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/boroughs/nyc_boroughs.geojson`;

Example output
+--------------------+-----------+-----+
|geom_0              |geom_0_srid|...  |
+--------------------+-----------+-----+
|[BINARY]            |4326       |...  |
|...                 |...        |...  |
+--------------------+-----------+-----+

Options

`driverName`

Default: Auto-detected from file extension if not specified

Explicitly specify the OGR driver to use (regardless of extension).

# Explicit driver (sample-data Volumes path)
df = spark.read.format("ogr") \
    .option("driverName", "GeoJSON") \
    .load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/boroughs/nyc_boroughs.geojson")
df.show()

Example output
+--------------------+-----------+-----+
|geom_0              |geom_0_srid|...  |
+--------------------+-----------+-----+
|[BINARY]            |4326       |...  |
|...                 |...        |...  |
+--------------------+-----------+-----+

Other Options

Option	Default	Description
`chunkSize`	`"10000"`	Number of records per chunk for parallel reading
`layerN`	`"0"`	Layer index for multi-layer formats (0-based)
`layerName`	`""`	Layer name for multi-layer formats (overrides `layerN`)
`asWKB`	`"true"`	Output geometry as WKB (binary) vs WKT (text)
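
The options in the table can be combined on a single read. The following is a minimal sketch of a wrapper that maps Python arguments onto those option names; `read_ogr` is a hypothetical helper, not part of GeoBrix, and it assumes option values are passed as strings, as the defaults in the table suggest.

```python
def read_ogr(spark, path, layer=None, chunk_size=10000, as_wkb=True, driver_name=None):
    """Read a vector file with the "ogr" format (hypothetical convenience helper).

    layer: a str selects the layer by name (layerName); an int selects it by
    0-based index (layerN); None leaves the reader's default layer in place.
    """
    reader = (
        spark.read.format("ogr")
        .option("chunkSize", str(chunk_size))
        .option("asWKB", "true" if as_wkb else "false")
    )
    if driver_name is not None:
        reader = reader.option("driverName", driver_name)
    if isinstance(layer, str):
        reader = reader.option("layerName", layer)
    elif layer is not None:
        reader = reader.option("layerN", str(int(layer)))
    return reader.load(path)


# Example call (the path and layer name are placeholders, not sample data):
# df = read_ogr(spark, "/path/to/data.gpkg", layer="roads", chunk_size=50000)
```

Only one of `layerName` or `layerN` is set per read, since the table states that `layerName` overrides `layerN` when both are present.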

Output Schema

root
 |-- geom_0: binary (geometry in WKB format)
 |-- geom_0_srid: integer (spatial reference ID)
 |-- geom_0_srid_proj: string (projection definition)
 |-- <attribute_1>: <type> (feature attributes...)
 |-- <attribute_2>: <type>
 |-- ...
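
Because `geom_0` holds standard WKB, its header can be inspected without any spatial library. The sketch below reads the byte-order flag and the geometry type code from a WKB value using only the Python standard library. The function name `wkb_geometry_type` is hypothetical, and the sample point is constructed by hand for illustration rather than read from a file.

```python
import struct

# Base geometry type codes shared by OGC WKB, ISO WKB, and EWKB.
_WKB_TYPES = {
    1: "Point",
    2: "LineString",
    3: "Polygon",
    4: "MultiPoint",
    5: "MultiLineString",
    6: "MultiPolygon",
    7: "GeometryCollection",
}


def wkb_geometry_type(wkb):
    """Return the base geometry type name encoded in a WKB header."""
    if wkb is None or len(wkb) < 5:
        return None
    # Byte 0 is the byte-order flag: 1 = little endian, 0 = big endian.
    byte_order = "<" if wkb[0] == 1 else ">"
    (code,) = struct.unpack_from(byte_order + "I", wkb, 1)
    code &= 0x0FFFFFFF  # drop the high flag bits used for Z, M, and SRID
    # ISO WKB adds 1000/2000/3000 for Z/M/ZM; the remainder is the base type.
    return _WKB_TYPES.get(code % 1000, "Unknown")


# A little-endian WKB point built by hand for illustration.
sample_point = struct.pack("<BIdd", 1, 1, -73.98, 40.75)
print(wkb_geometry_type(sample_point))  # Point
```

The same function could be wrapped in a PySpark UDF and applied to `geom_0` to profile the geometry types in a DataFrame, although for large datasets a built-in spatial function is likely to be faster.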

Databricks Integration

The OGR reader and the named vector readers output geometry as WKB by default. To use it with Databricks spatial functions, convert it to the GEOMETRY type. The examples below use the Shapefile reader with a sample-data Volumes path; the same pattern applies to any OGR-based reader.

Requires Databricks Runtime

These examples use st_geomfromwkb to convert GeoBrix WKB to Databricks GEOMETRY type.

Convert to GEOMETRY

# Convert WKB to Databricks GEOMETRY type
from pyspark.sql.functions import expr

df = spark.read.format("shapefile_ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/subway/nyc_subway.shp.zip")
df_with_geom = df.select("*", expr("st_geomfromwkb(geom_0)").alias("geometry"))

SQL Example

-- Read shapefile and convert to GEOMETRY in SQL
CREATE OR REPLACE TEMP VIEW stations AS
SELECT *, st_geomfromwkb(geom_0) as geometry
FROM shapefile_ogr.`/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/subway/nyc_subway.shp.zip`;

SELECT name, geometry FROM stations LIMIT 10;

Named Readers vs OGR

For common formats, GeoBrix provides named readers for convenience (sample-data Volumes path):

# Named reader (recommended for common formats)
df = spark.read.format("shapefile_ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/subway/nyc_subway.shp.zip")
# OGR with explicit driver (same result)
df = spark.read.format("ogr").option("driverName", "ESRI Shapefile").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/subway/nyc_subway.shp.zip")

When to use each:

Named readers (shapefile, geojson, ogr_gpkg, file_gdb): Better for common formats, cleaner syntax
OGR: Useful for less common formats or when you need OGR-specific options
