File GeoDatabase Reader

Read the ESRI File Geodatabase (.gdb) format, the standard multi-layer geospatial format used in ArcGIS.

Format Name

file_gdb_ogr

Overview

This is a named OGR Reader that uses the OpenFileGDB driver. A File Geodatabase is a directory-based format that can contain multiple feature classes (layers), which makes it well suited to complex geospatial datasets.

Basic Usage

Python

# Read File Geodatabase (sample-data Volumes path)
df = spark.read.format("file_gdb_ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip")
df.show()
Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+

Scala

// Read File Geodatabase (.zip; sample data is distributed as NYC_Sample.gdb.zip)
val df = spark.read.format("file_gdb_ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip")
df.show()
Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+

SQL

-- Read File Geodatabase in SQL (sample-data Volumes path)
SELECT * FROM file_gdb_ogr.`/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip` LIMIT 10;
Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+

Options

layerName

Default: "" (first feature class)

Specify which feature class (layer) to read from the geodatabase:

# Read specific feature class (sample-data Volumes path)
df = spark.read.format("file_gdb_ogr") \
    .option("layerName", "NYC_Boroughs") \
    .load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip")
df.show()
Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+
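
Because each read returns a single feature class, loading several layers from one geodatabase means issuing one read per layer name. The helper below is a minimal sketch of that pattern, not part of the reader itself: `read_feature_classes` is a hypothetical name, and the layer names you pass must already exist in your geodatabase.

```python
def read_feature_classes(spark, gdb_path, layer_names):
    """Return {layer name: DataFrame}, issuing one read per feature class.

    Hypothetical helper: only the "file_gdb_ogr" format name and the
    "layerName" option come from the reader documentation.
    """
    frames = {}
    for name in layer_names:
        frames[name] = (
            spark.read.format("file_gdb_ogr")
            .option("layerName", name)
            .load(gdb_path)
        )
    return frames


# Usage (requires an active SparkSession named `spark`):
# frames = read_feature_classes(
#     spark,
#     "/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip",
#     ["NYC_Boroughs"],
# )
# frames["NYC_Boroughs"].show()
```

Keeping the results in a dictionary keyed by layer name makes it easy to join or union related feature classes later.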

Other Options

All OGR reader options are available, e.g.:

  • layerN - Layer index to read (0-based)
  • chunkSize - Records per chunk (default: "10000")
  • asWKB - Output geometry as WKB ("true") or WKT ("false") (default: "true")
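
The options listed above can be combined in a single read. The following is a sketch under assumptions: `ogr_reader_options` and `read_gdb_layer` are hypothetical helper names, and the values are passed as strings only to mirror the quoted defaults shown in this section.

```python
def ogr_reader_options(layer_index, chunk_size=10000, as_wkb=True):
    """Build an options dict from the option names documented above.

    Hypothetical helper; values are strings to mirror the documented
    defaults ("10000", "true").
    """
    return {
        "layerN": str(layer_index),              # 0-based layer index
        "chunkSize": str(chunk_size),            # records per chunk
        "asWKB": "true" if as_wkb else "false",  # WKB vs WKT output
    }


def read_gdb_layer(spark, gdb_path, layer_index, **kwargs):
    """Read one layer by index using the options built above."""
    options = ogr_reader_options(layer_index, **kwargs)
    return spark.read.format("file_gdb_ogr").options(**options).load(gdb_path)


# Usage (requires an active SparkSession named `spark`):
# df = read_gdb_layer(spark, "/path/to/data.gdb.zip", 0, chunk_size=5000, as_wkb=False)
```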

Output Schema

File Geodatabases typically use SHAPE as the geometry column name:

root
|-- SHAPE: binary (geometry in WKB format)
|-- SHAPE_srid: integer (spatial reference ID)
|-- SHAPE_srid_proj: string (projection definition)
|-- <attribute_columns>: various types
Column Names

Column names in File Geodatabases are case-insensitive.
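
To make the binary SHAPE column less opaque, the sketch below inspects the WKB header of a single value using only the Python standard library. It relies on the general WKB layout (one byte-order byte followed by a 32-bit geometry type code), not on anything specific to this reader. `wkb_geometry_type` is a hypothetical helper, and the sample point is made up for illustration.

```python
import struct

# Base geometry type codes defined by the WKB encoding.
_WKB_TYPES = {
    1: "Point",
    2: "LineString",
    3: "Polygon",
    4: "MultiPoint",
    5: "MultiLineString",
    6: "MultiPolygon",
    7: "GeometryCollection",
}


def wkb_geometry_type(wkb):
    """Return the base geometry type name encoded in a WKB header."""
    # Byte 0 is the byte order: 1 = little-endian, 0 = big-endian.
    fmt = "<I" if wkb[0] == 1 else ">I"
    (code,) = struct.unpack_from(fmt, wkb, 1)
    # Drop EWKB flag bits (high nibble) and ISO Z/M offsets (+1000/+2000/+3000).
    base = (code & 0x0FFFFFFF) % 1000
    return _WKB_TYPES.get(base, "Unknown")


# A made-up little-endian WKB point, built by hand for illustration.
sample = struct.pack("<BIdd", 1, 1, -73.97, 40.78)
print(wkb_geometry_type(sample))
```

For real work, a geometry library is usually the better way to decode the full value; this only shows what the first five bytes of a SHAPE value carry.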

Key Features

  • Multi-Layer: Contains multiple feature classes (layers)
  • Rich Attributes: Full attribute support with domains and subtypes
  • Topology: Can include topology rules (read-only)
  • ArcGIS Native: The standard format for ESRI ArcGIS

Use Cases

  • ArcGIS Migration: Moving from ArcGIS workflows to Databricks
  • Enterprise Geodata: Reading complex organizational geospatial datasets
  • Multi-Layer Datasets: Working with related feature classes in one geodatabase
  • Attribute-Rich Data: Preserving domains, subtypes, and relationships

Limitations

Read-Only Access

This reader is read-only. You cannot create, modify, or write File Geodatabases with it.

  • Supports File Geodatabase versions 9.x and later
  • Topology and relationship rules are read-only
  • Some advanced ArcGIS features may not be fully supported

Next Steps