File GeoDatabase Reader
Reads the ESRI File Geodatabase (.gdb) format, the standard multi-layer geospatial format used in ArcGIS.
Format Name
file_gdb_ogr
Overview
This is a named OGR reader that uses the OpenFileGDB driver. A File Geodatabase is a directory-based format (a .gdb folder) that can contain multiple feature classes (layers), so a single geodatabase can hold several related layers. The examples on this page read a zipped copy (.gdb.zip).
Basic Usage
Python
# Read File Geodatabase (sample-data Volumes path)
df = spark.read.format("file_gdb_ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip")
df.show()
Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+
Scala
// Read File Geodatabase (.zip; sample data is distributed as NYC_Sample.gdb.zip)
val df = spark.read.format("file_gdb_ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip")
df.show()
Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+
SQL
-- Read File Geodatabase in SQL (sample-data Volumes path)
SELECT * FROM file_gdb_ogr.`/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip` LIMIT 10;
Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+
Options
layerName
Default: "" (first feature class)
Specify which feature class (layer) to read from the geodatabase:
# Read specific feature class (sample-data Volumes path)
df = spark.read.format("file_gdb_ogr") \
.option("layerName", "NYC_Boroughs") \
.load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip")
df.show()
Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+
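When a geodatabase holds several feature classes, the layerName option can be applied in a loop so that each layer lands in its own DataFrame. The following is a minimal sketch under that assumption. The helper name read_feature_classes and the second layer name in the usage comment are hypothetical and not part of the reader itself.

```python
def read_feature_classes(spark, path, layer_names):
    """Load each named feature class into its own DataFrame.

    Returns a dict mapping layer name -> DataFrame. `spark` is an active
    SparkSession; this helper is illustrative, not part of the reader.
    """
    frames = {}
    for name in layer_names:
        frames[name] = (
            spark.read.format("file_gdb_ogr")
            .option("layerName", name)  # option documented above
            .load(path)
        )
    return frames


# Usage (gdb_path is the .gdb.zip path used above; the second layer
# name is a hypothetical placeholder):
# layers = read_feature_classes(spark, gdb_path, ["NYC_Boroughs", "Another_Layer"])
# layers["NYC_Boroughs"].show()
```

Keeping one DataFrame per feature class avoids mixing schemas, since layers in the same geodatabase usually have different attribute columns.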
Other Options
All OGR reader options are available, e.g.:
- layerN: Layer index to read (0-based)
- chunkSize: Records per chunk (default: "10000")
- asWKB: Output as WKB vs WKT (default: "true")
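The options listed above can be collected in a dict and passed to the reader in one call. This is a small sketch, not an official helper: the function name ogr_reader_options is hypothetical, and it assumes the reader accepts string-valued options, as the quoted defaults suggest.

```python
def ogr_reader_options(layer_n=None, chunk_size=None, as_wkb=None):
    """Build a dict of string-valued reader options, skipping any left as None."""
    opts = {}
    if layer_n is not None:
        if layer_n < 0:
            raise ValueError("layerN is 0-based and cannot be negative")
        opts["layerN"] = str(layer_n)
    if chunk_size is not None:
        if chunk_size <= 0:
            raise ValueError("chunkSize must be positive")
        opts["chunkSize"] = str(chunk_size)
    if as_wkb is not None:
        # Spark options are passed as strings; "true" is the documented default.
        opts["asWKB"] = "true" if as_wkb else "false"
    return opts


# Usage with an active SparkSession (gdb_path is the .gdb.zip path used above):
# opts = ogr_reader_options(layer_n=0, chunk_size=5000, as_wkb=False)
# df = spark.read.format("file_gdb_ogr").options(**opts).load(gdb_path)
```

Options left as None are omitted, so the reader's own defaults apply to them.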
Output Schema
File Geodatabases typically use SHAPE as the geometry column name:
root
|-- SHAPE: binary (geometry in WKB format)
|-- SHAPE_srid: integer (spatial reference ID)
|-- SHAPE_srid_proj: string (projection definition)
|-- <attribute_columns>: various types
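Because SHAPE holds raw WKB bytes, a quick way to check what a layer contains is to read the geometry type from the WKB header. The sketch below uses only the standard library and is illustrative: the function name wkb_geometry_type is hypothetical, and it assumes the values follow the standard WKB header layout (one byte-order byte followed by a 4-byte type code).

```python
import struct

# Base geometry type codes from the WKB specification.
_WKB_TYPES = {
    1: "Point", 2: "LineString", 3: "Polygon", 4: "MultiPoint",
    5: "MultiLineString", 6: "MultiPolygon", 7: "GeometryCollection",
}


def wkb_geometry_type(wkb):
    """Return the geometry type name encoded in a WKB header."""
    if len(wkb) < 5:
        raise ValueError("WKB value is too short to contain a header")
    # Byte 0 is the byte-order flag: 0 = big-endian, 1 = little-endian.
    endian = "<" if wkb[0] == 1 else ">"
    (code,) = struct.unpack(endian + "I", bytes(wkb[1:5]))
    # Strip legacy high-bit Z/M/SRID flags, then the ISO Z/M offsets
    # (+1000, +2000, +3000), leaving the base type code.
    base = (code & 0x0FFFFFFF) % 1000
    return _WKB_TYPES.get(base, "Unknown")


# Usage with a DataFrame read as shown above:
# print(wkb_geometry_type(df.first()["SHAPE"]))
```

For full geometry parsing, a dedicated geometry library is a better fit; this only inspects the first five bytes.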
Column Names
Column names in File Geodatabases are case-insensitive.
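Code that looks up columns by exact string match (for example, in a Python list or dict) is still case-sensitive, so it can help to resolve the stored spelling first. A minimal sketch, with the hypothetical helper name resolve_column:

```python
def resolve_column(columns, name):
    """Return the actual column name matching `name`, ignoring case."""
    wanted = name.lower()
    for col in columns:
        if col.lower() == wanted:
            return col
    raise KeyError(f"no column matching {name!r}")


# Usage with a DataFrame read as shown above:
# geom_col = resolve_column(df.columns, "shape")
# df.select(geom_col).show()
```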
Key Features
- Multi-Layer: Contains multiple feature classes (layers)
- Rich Attributes: Full attribute support with domains and subtypes
- Topology: Can include topology rules (read-only)
- ArcGIS Native: The standard format for ESRI ArcGIS
Use Cases
- ArcGIS Migration: Moving from ArcGIS workflows to Databricks
- Enterprise Geodata: Reading complex organizational geospatial datasets
- Multi-Layer Datasets: Working with related feature classes in one file
- Attribute-Rich Data: Preserving domains, subtypes, and relationships
Limitations
Read-Only Access
The OpenFileGDB driver provides read-only access. You cannot create, modify, or write File Geodatabases using this reader.
- Supports File Geodatabase versions 9.x and later
- Topology and relationship rules are read-only
- Some advanced ArcGIS features may not be fully supported