GeoPackage Reader

The GeoPackage reader reads files in the OGC GeoPackage format, a modern SQLite-based geospatial format.

Format Name

gpkg_ogr

Overview

gpkg_ogr is a named OGR reader that uses the GPKG driver. GeoPackage is an open, standards-based, platform-independent, portable, self-describing format for transferring geospatial information.

Key Features

  • Self-contained: Single-file SQLite database
  • Multi-layer: Can contain multiple vector layers and raster tiles
  • Attributes: Full attribute support with data types
  • Spatial Index: Built-in spatial indexing
  • Portable: Cross-platform compatibility

Basic Usage

Python

# Read GeoPackage (sample-data Volumes path)
df = spark.read.format("gpkg_ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/geopackage/nyc_complete.gpkg")
df.show()
Example output
+--------------------+--------------+---------+
|shape               |shape_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |Manhattan|
|...                 |...           |...      |
+--------------------+--------------+---------+

Scala

val df = spark.read.format("gpkg_ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/geopackage/nyc_complete.gpkg")
Example output
+--------------------+--------------+---------+
|shape               |shape_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |Manhattan|
|...                 |...           |...      |
+--------------------+--------------+---------+

SQL

-- Read GeoPackage in SQL (sample-data Volumes path)
SELECT * FROM gpkg_ogr.`/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/geopackage/nyc_complete.gpkg`;
Example output
+--------------------+--------------+---------+
|shape               |shape_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+

Output Schema

The output keeps the source attribute columns and adds geometry columns. For GeoPackage sources, the geometry column is typically named shape:

root
|-- shape: binary (geometry in WKB format)
|-- shape_srid: integer (spatial reference ID)
|-- shape_srid_proj: string (projection definition)
|-- <attribute_columns>: various types
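
Because shape is delivered as WKB bytes, it can help to inspect a value before handing it to a geometry library. The following is a minimal sketch, assuming the column holds standard OGC WKB as described above: it reads only the byte-order flag and the geometry type code from the first five bytes. The function name wkb_header and the sample point are illustrative and not part of the reader.

```python
import struct

# Base 2D geometry type codes defined by the OGC WKB encoding.
WKB_TYPES = {
    1: "Point",
    2: "LineString",
    3: "Polygon",
    4: "MultiPoint",
    5: "MultiLineString",
    6: "MultiPolygon",
    7: "GeometryCollection",
}


def wkb_header(wkb: bytes):
    """Return (byte_order, type_code) read from the first 5 bytes of a WKB value."""
    if len(wkb) < 5:
        raise ValueError("WKB value is too short to contain a header")
    little_endian = wkb[0] == 1  # 1 = little-endian, 0 = big-endian
    fmt = "<I" if little_endian else ">I"
    (type_code,) = struct.unpack_from(fmt, wkb, 1)
    return ("little" if little_endian else "big", type_code)


# Hand-built sample (hypothetical data): little-endian WKB for POINT(-73.97 40.78).
sample = struct.pack("<BIdd", 1, 1, -73.97, 40.78)

order, code = wkb_header(sample)
print(order, WKB_TYPES.get(code, "unknown"))  # little Point
```

In a notebook, the same function could be applied to a collected row, for example wkb_header(bytes(row["shape"])), assuming the row was read with asWKB left at its default.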

Options

Multi-Layer Support

GeoPackage files can contain multiple layers. Use the layerName option to specify which layer to read:

# Read specific layer (sample-data Volumes path)
boroughs = spark.read.format("gpkg_ogr") \
    .option("layerName", "boroughs") \
    .load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/geopackage/nyc_complete.gpkg")
boroughs.show()
Example output
+--------------------+--------------+---------+
|shape               |shape_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+
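
To pick a value for layerName, it is useful to know which layers a file contains. Since a GeoPackage is a SQLite database, one way to find out is to query its gpkg_contents table, which the GeoPackage standard uses to register each layer along with its data type. The sketch below uses Python's built-in sqlite3 module and is independent of the reader. The function name list_feature_layers is hypothetical, and the demo uses an in-memory stand-in with a simplified gpkg_contents table (only four of its columns, with a made-up basemap tile layer) rather than a real file.

```python
import sqlite3


def list_feature_layers(conn):
    """Return the names of vector (feature) layers registered in gpkg_contents."""
    rows = conn.execute(
        "SELECT table_name FROM gpkg_contents "
        "WHERE data_type = 'features' ORDER BY table_name"
    ).fetchall()
    return [row[0] for row in rows]


# In-memory stand-in for a GeoPackage (simplified schema, hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpkg_contents ("
    "table_name TEXT PRIMARY KEY, data_type TEXT NOT NULL, "
    "identifier TEXT, srs_id INTEGER)"
)
conn.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?, ?, ?)",
    [
        ("boroughs", "features", "boroughs", 4326),
        ("basemap", "tiles", "basemap", 3857),  # raster tiles, filtered out below
    ],
)

print(list_feature_layers(conn))  # ['boroughs']
```

For a real file, replace the in-memory connection with sqlite3.connect(path), assuming the path is reachable from the local file system of the process running the code.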

Other Options

All OGR reader options are available, e.g.:

  • chunkSize - Records per chunk (default: "10000")
  • asWKB - Output geometry as WKB ("true") or WKT ("false") (default: "true")
  • layerName - Specific layer to read
  • layerN - Layer index to read (0-based)
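
The options listed above can be combined on a single read. The following is a minimal sketch of one way to do that. The helpers gpkg_options and read_gpkg are hypothetical conveniences, not part of the reader, and the sketch assumes the reader accepts option values as lowercase strings, as the defaults above suggest.

```python
def gpkg_options(**options):
    """Convert Python values to the string form used by the reader options."""
    normalized = {}
    for key, value in options.items():
        if isinstance(value, bool):
            # Booleans become "true"/"false" to match the documented defaults.
            normalized[key] = "true" if value else "false"
        else:
            normalized[key] = str(value)
    return normalized


def read_gpkg(spark, path, **options):
    """Read a GeoPackage with the gpkg_ogr format, applying the given options."""
    reader = spark.read.format("gpkg_ogr")
    for key, value in gpkg_options(**options).items():
        reader = reader.option(key, value)
    return reader.load(path)


# Show the option strings that would be passed to the reader.
print(gpkg_options(layerName="boroughs", chunkSize=5000, asWKB=False))
```

With an active Spark session, a call might look like read_gpkg(spark, path, layerName="boroughs", chunkSize=5000, asWKB=False), where path points to a .gpkg file such as the sample used earlier on this page.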

Next Steps