
Scale & Limits

Scaling Databricks effectively requires understanding where limits are enforced and how they impact design, operations, and automation. Limits exist at three layers—workspace, account, and cloud provider—and each layer governs different aspects of the platform.

Reference implementation: See how Firefly implements scalability across application tier auto-scaling, Databricks Serverless SQL, and workspace isolation patterns.

Workspace-Level Limits

Most Databricks operational limits live at the workspace level. These define the practical boundaries for day-to-day development and compute operations.

Workspace-level limits include:

  • Maximum clusters, jobs, and SQL warehouses (counts you can track programmatically; see the sketch after this list)
  • Concurrency limits for job runs, queries, and interactive compute
  • Workspace object limits (notebooks, dashboards, folders)
  • API rate limits for workspace-scoped operations
  • Personal access token (PAT) limits and user/group mappings
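
It can help to track current usage against these ceilings before they become incidents. The sketch below is a minimal example, assuming the databricks-sdk Python package and default authentication (for example, the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables); it simply counts clusters, jobs, and SQL warehouses so the numbers can be compared against the documented workspace limits for your cloud.

```python
# Minimal workspace usage snapshot, assuming the databricks-sdk package
# (pip install databricks-sdk) and default authentication via environment
# variables or a config profile.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# The SDK paginates these listings automatically; compare the counts against
# the workspace limits documented for your cloud.
usage = {
    "clusters": sum(1 for _ in w.clusters.list()),
    "jobs": sum(1 for _ in w.jobs.list()),
    "sql_warehouses": sum(1 for _ in w.warehouses.list()),
}

for resource, count in usage.items():
    print(f"{resource}: {count}")
```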

Resource Limits by Cloud:

Cloud | Documentation
AWS   | Resource Limits
GCP   | Resource Limits
Azure | Resource Limits

Design Implications

Workspace limits shape how teams organize workloads, separate environments, and design automation pipelines. For Partner Hosted SaaS architectures:

  • Capacity planning: If individual customers will generate hundreds of concurrent jobs, thousands of notebooks, or high API throughput, plan for a workspace-per-tenant architecture from the start to avoid hitting workspace-level limits
  • API rate limits: Workspace-scoped rate limits provide natural isolation, since multiple workspaces distribute API load across separate rate-limit buckets; a minimal throttling-retry pattern is sketched below
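
Even with per-workspace isolation, any client working against a single workspace should expect HTTP 429 responses when it exceeds its rate-limit bucket. The sketch below is a minimal retry pattern, assuming plain REST calls via the requests library; the retry ceiling and the use of the Retry-After header when present are illustrative choices, not a prescribed Databricks client behavior.

```python
import time
import requests

def get_with_backoff(session: requests.Session, url: str, max_retries: int = 5, **kwargs):
    """GET a workspace-scoped endpoint, backing off when rate limited (HTTP 429)."""
    delay = 1.0
    for _ in range(max_retries):
        resp = session.get(url, **kwargs)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp
        # Honor the Retry-After header if present, otherwise back off exponentially.
        wait = float(resp.headers.get("Retry-After", delay))
        time.sleep(wait)
        delay *= 2
    raise RuntimeError(f"Still throttled after {max_retries} retries: {url}")
```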

See the Boundary Table below for a complete view of which limits apply at workspace vs account vs cloud provider levels.

Account-Level Limits

Account-level limits apply across all workspaces in a Databricks account and are especially relevant for governance, metadata, and compute quotas.

Account-level limits include:

  • Unity Catalog metastore limits: catalogs, schemas, tables, volumes, grants (object growth can be tracked with the sketch after this list)
  • Resource quotas controlling workspace or compute usage
  • Serverless compute quotas
  • Account-level configuration APIs
  • Identity integrations and cross-workspace governance
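
Metastore growth toward these quotas can be monitored in the same spirit as workspace usage. The sketch below is a minimal example, assuming the databricks-sdk Python package and default authentication; it counts catalogs and schemas, which are the kinds of totals to compare against the Unity Catalog resource quotas linked under Key Resources.

```python
# Sketch of tracking Unity Catalog object growth, assuming the databricks-sdk
# package and default authentication.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

catalog_count = 0
schema_count = 0
for catalog in w.catalogs.list():
    catalog_count += 1
    # Schemas are listed per catalog; this walk grows linearly with catalog count.
    schema_count += sum(1 for _ in w.schemas.list(catalog_name=catalog.name))

print(f"catalogs: {catalog_count}, schemas: {schema_count}")
```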

Key Resources:

Resource                      | Documentation
Unity Catalog Resource Quotas | REST API
Managing Resource Quotas      | Documentation
Serverless Compute Quotas     | Documentation

These limits matter most as organizations scale to large catalogs, many workspaces, and high API throughput, or adopt stricter governance requirements.

Cloud Provider Limits

Databricks inherits many constraints from the underlying cloud provider (AWS, Azure, GCP). These limits often determine compute scalability and provisioning behavior.

Cloud-level limits include:

  • VM and GPU quotas
  • Regional capacity availability
  • IP address, ENI, VNet/VPC, and subnet limits
  • Cloud API throttling and provisioning rate limits
  • Storage and networking throughput behavior

Examples of cloud constraints:

  • Cluster provisioning failures due to insufficient VM quota (a proactive quota check is sketched after this list)
  • SQL warehouse scaling impacted by regional GPU/CPU availability
  • Networking issues caused by IP exhaustion or ENI limits
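
The first failure mode above, insufficient VM quota, can often be caught before cluster provisioning breaks by checking the cloud quota directly. The AWS-flavored sketch below uses boto3's Service Quotas and EC2 APIs; the region and the quota code L-1216C47A (commonly the standard on-demand vCPU quota) are assumptions to verify against your own account.

```python
# Sketch: compare the regional on-demand vCPU quota against current usage (AWS).
# Assumptions: boto3 with credentials configured, and L-1216C47A as the quota
# code for standard on-demand instance vCPUs; verify the code in your account
# (for example via list_service_quotas) before relying on it.
import boto3

REGION = "us-east-1"  # illustrative region

quotas = boto3.client("service-quotas", region_name=REGION)
ec2 = boto3.client("ec2", region_name=REGION)

quota = quotas.get_service_quota(ServiceCode="ec2", QuotaCode="L-1216C47A")
vcpu_limit = quota["Quota"]["Value"]

# Count vCPUs currently used by running instances in the region.
in_use = 0
paginator = ec2.get_paginator("describe_instances")
for page in paginator.paginate(Filters=[{"Name": "instance-state-name", "Values": ["running"]}]):
    for reservation in page["Reservations"]:
        for instance in reservation["Instances"]:
            cpu = instance.get("CpuOptions", {})
            in_use += cpu.get("CoreCount", 0) * cpu.get("ThreadsPerCore", 1)

print(f"vCPU quota: {vcpu_limit:.0f}, in use: {in_use}")
```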

These limitations are not Databricks-specific but strongly influence scaling behavior on the platform.

Cloud Provider Resources:

Provider | Networking Limits | Storage Limits
AWS      | VPC Limits        | S3 Limits
Azure    | VNet Limits       | Storage Limits
GCP      | VPC Limits        | Storage Limits

Design Implications

Cloud provider constraints often surface during high-growth or production scaling. For Partner Hosted SaaS architectures:

  • Regional capacity: Consider multi-region deployment if you need high availability or your scale exceeds what a single region can support. GPU availability in particular can be limited in specific regions.
  • Networking limits: For high-scale deployments, plan for IP address space and ENI limits upfront. This is especially critical for Classic workspaces, where networking resources are provisioned in your cloud account; a simple subnet-sizing sketch follows this list.
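
Subnet sizing for Classic workspaces is mostly arithmetic once the peak concurrent node count is known. The sketch below is a rough planning aid; the per-node IP consumption and the number of addresses the cloud reserves per subnet are assumptions to replace with the values your cloud provider and deployment model actually use.

```python
import math

def smallest_subnet_prefix(max_concurrent_nodes: int,
                           ips_per_node: int = 2,
                           reserved_per_subnet: int = 5) -> int:
    """Return the largest CIDR prefix (smallest subnet) that still fits peak load.

    Assumptions to verify for your cloud and deployment model:
      - ips_per_node: how many addresses each cluster node consumes
      - reserved_per_subnet: addresses the cloud provider reserves per subnet
    """
    required = max_concurrent_nodes * ips_per_node + reserved_per_subnet
    # A /p subnet has 2**(32 - p) addresses; find the largest p that still fits.
    bits = math.ceil(math.log2(required))
    return 32 - bits

# Example: a tenant workspace expected to peak at 400 concurrent nodes.
print(smallest_subnet_prefix(400))  # -> 22, i.e. at least a /22 under these assumptions
```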

Boundary Table

Category                  | Workspace | Account | Cloud Provider
Compute & Jobs            | Yes       |         | Partial
SQL Warehouses            | Yes       |         | Partial
Workspace Objects         | Yes       |         |
Unity Catalog Objects     |           | Yes     |
Resource Quotas           |           | Yes     |
Serverless Quotas         |           | Yes     | Partial
Identity & Access         | Yes       | Partial |
API Limits                | Partial   | Partial |
Networking & Provisioning |           |         | Yes
Storage                   |           | Partial | Partial
Monitoring & Audit        | Partial   | Yes     |

How to Think About Scaling

When evaluating scale, consider the interaction of all three boundary layers:

Metadata Complexity

How many objects (tables, schemas, volumes, grants) you create and how quickly they grow.

Operational Throughput

How frequently clusters, jobs, pipelines, and queries are triggered.
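
One way to put a number on operational throughput is to count job runs over a recent window. The sketch below is a minimal example, assuming the databricks-sdk Python package and default authentication; the 24-hour window is illustrative.

```python
# Sketch: count job runs started in the last 24 hours, assuming the
# databricks-sdk package and default authentication.
import time
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
window_start_ms = int((time.time() - 24 * 3600) * 1000)

runs_last_24h = sum(1 for _ in w.jobs.list_runs(start_time_from=window_start_ms))
print(f"job runs in the last 24h: {runs_last_24h}")
```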

Provisioning Dynamics

How fast compute must start during peak load—governed by both workspace and cloud infrastructure.

Governance Load

How user/group structures and permissions expand as more teams adopt the platform.

Workspace Density

How many teams and workloads coexist in a single workspace and how that affects limits.

What's Next