D2D sharing

Databricks-to-Databricks (D2D) sharing lets you share data securely with any Databricks user, regardless of account or cloud, as long as they have access to a Unity Catalog-enabled workspace. This page covers best practices for sharing both structured and unstructured data. For step-by-step setup instructions, see Share data using D2D protocol.

Key benefits:

Native integration with Unity Catalog for governance and auditing
Simplified setup using sharing identifiers (no credential files to manage)
Auditing and usage tracking for both providers and recipients
Support for all asset types: tables, views, volumes, notebooks, models, and MCP
Improved read performance with history sharing
Seamless experience within the Databricks ecosystem

Best practices for structured data

Structured data includes tables and views shared through Delta Sharing.

Table history and performance

Prefer sharing tables WITH HISTORY for improved read performance (default on DBR >= 16.2)
Partitioned tables do not gain performance benefits from history sharing
Enable Change Data Feed (CDF) before sharing WITH HISTORY if recipients need incremental consumption, table_changes(), time travel, or streaming
Be aware that history sharing grants recipients access to the Delta log, which includes commit history and deleted data not yet vacuumed
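
As a minimal sketch of the history and CDF guidance above, a provider could enable the change data feed and then add the table with its history. The share and table names are the placeholders used elsewhere on this page, shared_catalog is a hypothetical name for the catalog the recipient creates from the share, and the starting version is illustrative only.

```sql
-- Provider: enable Change Data Feed before sharing (placeholder table name).
ALTER TABLE catalog.schema.sales
SET TBLPROPERTIES (delta.enableChangeDataFeed = true);

-- Provider: add the table to the share together with its history.
ALTER SHARE my_share
ADD TABLE catalog.schema.sales
WITH HISTORY;

-- Recipient: read row-level changes starting at an illustrative version.
-- shared_catalog is a hypothetical recipient-side catalog name.
SELECT * FROM table_changes('shared_catalog.schema.sales', 1);
```

Changes are only recorded from the point CDF is enabled, so the starting version passed to table_changes() should not predate that commit.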

Table features and compatibility

For tables with deletion vectors or column mapping, share WITH HISTORY and ensure recipients use DBR >= 14.1
For streaming workloads, share streaming tables or tables WITH HISTORY to allow Structured Streaming and time travel on the consumer side
When sharing federated data, prefer materialized views over foreign tables for performance and cost efficiency
Validate table feature compatibility (CDF, DVs, column mapping, V2 metadata) on both provider and recipient compute versions
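
One way to approach the compatibility check above is to inspect the table's protocol versions and properties on the provider side before sharing. This is a sketch using the placeholder table name from this page. Comparing the output against the recipient's compute version remains a manual step.

```sql
-- Shows protocol details such as minReaderVersion, minWriterVersion,
-- and tableFeatures for the table.
DESCRIBE DETAIL catalog.schema.sales;

-- Lists table properties, including delta.enableChangeDataFeed,
-- delta.enableDeletionVectors, and delta.columnMapping.mode when they are set.
SHOW TBLPROPERTIES catalog.schema.sales;
```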

Partitioning and filtering

Use partitioned sharing to deliver only relevant subsets of data
For fine-grained access, share dynamic views that filter rows or columns
Do not share tables that already have row filters or column masks applied, because that filtering is not applied to the share
Recipients may apply their own row filters or column masks once the data has landed on their side

Example of partitioned sharing:

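-- Assumption: partition specs match on = or LIKE, so a range such as
-- year >= 2024 is written as one spec per year.
ALTER SHARE my_share
ADD TABLE catalog.schema.sales
PARTITION (region = 'us-west', year = 2024);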

Example of a dynamic view with row-level filtering:

CREATE VIEW catalog.schema.sales_filtered AS
SELECT * FROM catalog.schema.sales
WHERE region = current_recipient('region');
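
The dynamic view above depends on a region property being set on each recipient. A minimal sketch of the provider-side setup follows. The recipient name acme_recipient is hypothetical and the property value is illustrative.

```sql
-- Provider: set the property that current_recipient('region') reads.
-- acme_recipient is a hypothetical recipient name.
ALTER RECIPIENT acme_recipient SET PROPERTIES ('region' = 'us-west');

-- Provider: add the dynamic view to the share instead of the base table.
ALTER SHARE my_share ADD VIEW catalog.schema.sales_filtered;
```

Sharing the view rather than the underlying table keeps the filter logic on the provider side, so each recipient sees only the rows that match its own property value.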

Governance boundaries

ACLs and lineage do not propagate across metastores
Recipients can only be granted read access to shared objects
Manage governance separately on provider and recipient sides
Maintain least-privilege ownership of shares
Assign share ownership to a group rather than an individual to ensure continuity
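
The ownership and least-privilege points above can be sketched as follows. The group name data-sharing-admins and the recipient name acme_recipient are hypothetical.

```sql
-- Transfer share ownership to a group rather than an individual.
ALTER SHARE my_share OWNER TO `data-sharing-admins`;

-- Grant read access on the share to a recipient.
GRANT SELECT ON SHARE my_share TO RECIPIENT acme_recipient;

-- Review which recipients can currently read the share.
SHOW GRANTS ON SHARE my_share;
```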

Egress and performance

Plan for egress when sharing across regions or clouds
Providers may incur egress fees when recipients read data

For mitigation strategies (CDF, Cloudflare R2, replication), see Egress considerations.

D2D operations

Use the audit log system table (system.access.audit) to monitor share read patterns, filtering on events such as deltaSharingQueriedTable and generateTemporaryTableCredentials
If recipients rely on CDF, time travel, or history sharing, avoid aggressive VACUUM schedules that remove the files those features need
Consider liquid clustering for large or append-heavy shared tables

For general operations guidance (change management, health checks, governance), see the Operations runbook.
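
As a rough sketch of the monitoring guidance above, the query below counts Delta Sharing read events per day. It assumes the events are recorded in the audit log system table system.access.audit. The request_params keys recipient_name and share are assumptions, so confirm them against your own audit rows before relying on the output.

```sql
-- Daily count of Delta Sharing read events over the last 30 days.
SELECT
  event_date,
  action_name,
  request_params['recipient_name'] AS recipient_name,  -- assumed key
  request_params['share'] AS share_name,               -- assumed key
  COUNT(*) AS events
FROM system.access.audit
WHERE action_name IN ('deltaSharingQueriedTable', 'generateTemporaryTableCredentials')
  AND event_date >= date_sub(current_date(), 30)
GROUP BY
  event_date,
  action_name,
  request_params['recipient_name'],
  request_params['share']
ORDER BY event_date DESC;
```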

Metadata and data labeling

Good comments and consistent tags increase searchability and semantic understanding for humans and AI:

Well-documented metadata improves natural-language-to-SQL generation (Genie) and data discovery
High-quality metadata supports AI-ready data products and reduces ambiguity in downstream pipelines
Use AI-generated comments as a starting point, but require human validation before saving
Update comments as part of schema evolution to prevent metadata drift

For a detailed checklist, see AI readiness.
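
A minimal sketch of applying comments and tags to a shared table follows. The comment text, column name, and tag keys are illustrative placeholders, not recommended values.

```sql
-- Table-level comment describing content and grain.
COMMENT ON TABLE catalog.schema.sales IS
  'Sales transactions by region; one row per order line.';

-- Column-level comment; update it whenever the column's meaning changes.
ALTER TABLE catalog.schema.sales
ALTER COLUMN region COMMENT 'Sales region code, for example us-west.';

-- Consistent tags to support search and discovery.
ALTER TABLE catalog.schema.sales
SET TAGS ('domain' = 'sales', 'sensitivity' = 'internal');
```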

Additional asset types

Volumes

See Add volumes to a share for setup instructions.

When to use:

Use volumes for non-tabular assets: PDFs, images, video, audio, binaries, logs, JSON, code files (.sh, .py)
Use volumes for AI artifacts: training data, embeddings, model weights, vector indexes
Prefer tables for structured or relational access patterns where SQL semantics and indexing matter

Constraints:

Volume sharing is D2D only (not available for open sharing)
Shared volumes are read-only (recipients cannot write, delete, or modify)
Recipients require READ VOLUME privilege to access files

Documentation:

Add volume comments describing content, purpose, and expected structure
Treat volumes as governed Unity Catalog objects with UC-managed sharing, privileges, and lineage
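
The volume guidance above can be sketched end to end as follows. The volume name contracts, the recipient-side catalog shared_catalog, and the group analysts are hypothetical. Granting READ VOLUME at the catalog level is an assumption about how the recipient organizes privileges.

```sql
-- Provider: document the volume, then add it to the share.
COMMENT ON VOLUME catalog.schema.contracts IS
  'Signed partner contracts as PDF files, one folder per year.';
ALTER SHARE my_share ADD VOLUME catalog.schema.contracts;

-- Recipient: grant read access on the catalog created from the share
-- (assumed catalog-level grant; adjust to your privilege model).
GRANT USE CATALOG, USE SCHEMA, READ VOLUME
ON CATALOG shared_catalog TO `analysts`;

-- Recipient: list the shared files (read-only).
LIST '/Volumes/shared_catalog/schema/contracts/';
```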

What's next

Learn about D2O sharing patterns for external platforms
Learn about O2D sharing patterns for consuming external shares
Explore bi-directional sharing for two-way collaboration
Set up dynamic views for fine-grained access control
Configure monitoring to track sharing activity
