File GeoDatabase Reader

Reads the ESRI File Geodatabase (.gdb) format, the standard multi-layer geospatial format used in ArcGIS.

Format Name

file_gdb_ogr

Overview

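This is a named OGR Reader that uses the OpenFileGDB driver. A File Geodatabase is a directory (conventionally named with a .gdb suffix) rather than a single file, and it can hold multiple feature classes (layers), which makes it well suited to datasets made up of several related layers. The examples below load a zipped geodatabase (.gdb.zip).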

Basic Usage

Python

# Read File Geodatabase (sample-data Volumes path)
df = spark.read.format("file_gdb_ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip")
df.show()

Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+

Scala

// Read File Geodatabase (.zip; sample data is distributed as NYC_Sample.gdb.zip)
val df = spark.read.format("file_gdb_ogr").load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip")
df.show()

Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+

SQL

-- Read File Geodatabase in SQL (sample-data Volumes path)
SELECT * FROM file_gdb_ogr.`/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip` LIMIT 10;

Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+

Options

`layerName`

Default: "" (first feature class)

Specify which feature class (layer) to read from the geodatabase:

# Read specific feature class (sample-data Volumes path)
df = spark.read.format("file_gdb_ogr") \
    .option("layerName", "NYC_Boroughs") \
    .load("/Volumes/main/default/geobrix_samples/geobrix-examples/nyc/filegdb/NYC_Sample.gdb.zip")
df.show()

Example output
+--------------------+--------------+---------+
|SHAPE               |SHAPE_srid    |BoroName |
+--------------------+--------------+---------+
|[BINARY]            |4326          |...      |
|...                 |...           |...      |
+--------------------+--------------+---------+
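Because each read returns a single feature class, loading several layers from the same geodatabase means repeating the read once per layer. A minimal sketch, assuming an active SparkSession is passed in as `spark`; `read_feature_classes` is a hypothetical helper name, not part of the reader, and it uses only the `layerName` option described above.

```python
def read_feature_classes(spark, path, layer_names):
    """Return {layer name: DataFrame} for the requested feature classes.

    Hypothetical helper: it only repeats the `layerName` read shown above,
    once per requested layer.
    """
    return {
        name: (
            spark.read.format("file_gdb_ogr")
            .option("layerName", name)  # select the feature class by name
            .load(path)
        )
        for name in layer_names
    }
```

With the sample data, `read_feature_classes(spark, path, ["NYC_Boroughs"])["NYC_Boroughs"]` would be the same DataFrame as in the example above, where `path` is the sample-data Volumes path.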

Other Options

All OGR reader options are available, e.g.:

layerN - Layer index to read (0-based)
chunkSize - Records per chunk (default: "10000")
asWKB - Output geometry as WKB ("true") or WKT ("false") (default: "true")
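These options can be combined on one reader. A minimal sketch, assuming an active SparkSession is passed in as `spark`; `read_gdb_layer` is a hypothetical helper name, and the option values are passed as strings to match the defaults listed above.

```python
def read_gdb_layer(spark, path, layer_index=0, chunk_size=10000, as_wkb=True):
    """Read one feature class by index using the options listed above.

    Hypothetical helper, not part of the reader itself.
    """
    return (
        spark.read.format("file_gdb_ogr")
        .option("layerN", str(layer_index))    # 0-based layer index
        .option("chunkSize", str(chunk_size))  # records per chunk
        .option("asWKB", str(as_wkb).lower())  # "true" or "false"
        .load(path)
    )
```

Selecting by index is useful when the feature class names are not known in advance; when they are, `layerName` reads more clearly.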

Output Schema

File Geodatabases typically use SHAPE as the geometry column name:

root
 |-- SHAPE: binary (geometry in WKB format)
 |-- SHAPE_srid: integer (spatial reference ID)
 |-- SHAPE_srid_proj: string (projection definition)
 |-- <attribute_columns>: various types
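Since SHAPE holds WKB bytes, the geometry type can be read from the header without a geometry library. The sketch below uses only the Python standard library and assumes standard WKB headers (one byte-order byte followed by a 4-byte type code); `wkb_geometry_type` is a hypothetical helper name, and the exact WKB flavor emitted by the reader is an assumption here, so flag bits and ISO Z/M offsets are stripped defensively.

```python
import struct

# Base geometry type codes defined by the WKB encoding.
WKB_TYPES = {
    1: "Point",
    2: "LineString",
    3: "Polygon",
    4: "MultiPoint",
    5: "MultiLineString",
    6: "MultiPolygon",
    7: "GeometryCollection",
}

def wkb_geometry_type(wkb):
    """Return the geometry type name encoded in a WKB header, or None."""
    if wkb is None or len(wkb) < 5:
        return None
    byte_order = "<" if wkb[0] == 1 else ">"  # 1 = little-endian, 0 = big-endian
    (code,) = struct.unpack(byte_order + "I", bytes(wkb[1:5]))
    # Drop high flag bits, then the ISO offsets for Z (1000), M (2000), ZM (3000).
    base = (code & 0x0FFFFFFF) % 1000
    return WKB_TYPES.get(base, "Unknown")

# Hand-built little-endian WKB point, used only to exercise the helper.
sample = struct.pack("<BIdd", 1, 1, -73.98, 40.75)
print(wkb_geometry_type(sample))  # Point
```

To apply this to the SHAPE column, the function could be wrapped as a Spark UDF, for example with `pyspark.sql.functions.udf(wkb_geometry_type, "string")`.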

Column Names

Column names in File Geodatabases are case-insensitive.
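Python code that compares column names as plain strings is still case-sensitive, so it can help to normalize names before matching. A minimal sketch using a hypothetical helper, `resolve_column`, which finds the actual column name for a requested name regardless of case; the column list is taken from the schema and example output above.

```python
def resolve_column(columns, requested):
    """Return the actual column name matching `requested`, ignoring case."""
    wanted = requested.lower()
    matches = [c for c in columns if c.lower() == wanted]
    if not matches:
        raise KeyError(f"no column matching {requested!r} in {list(columns)}")
    return matches[0]

columns = ["SHAPE", "SHAPE_srid", "SHAPE_srid_proj", "BoroName"]
print(resolve_column(columns, "boroname"))  # BoroName
```

In practice `columns` would come from `df.columns`.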

Key Features

Multi-Layer: Contains multiple feature classes (layers)
Rich Attributes: Full attribute support with domains and subtypes
Topology: Can include topology rules (read-only)
ArcGIS Native: The standard format for ESRI ArcGIS

Use Cases

ArcGIS Migration: Moving from ArcGIS workflows to Databricks
Enterprise Geodata: Reading complex organizational geospatial datasets
Multi-Layer Datasets: Working with related feature classes in one geodatabase
Attribute-Rich Data: Preserving domains, subtypes, and relationships

Limitations

Read-Only Access

The OpenFileGDB driver provides read-only access. You cannot create, modify, or write File Geodatabases using this reader.

Supports File Geodatabase versions 9.x and later
Topology and relationship rules are read-only
Some advanced ArcGIS features may not be fully supported
