D2D sharing
Databricks-to-Databricks (D2D) sharing lets you share data securely with any Databricks user, regardless of account or cloud, as long as they have access to a Unity Catalog-enabled workspace. This page covers best practices for sharing both structured and unstructured data. For step-by-step setup instructions, see Share data using D2D protocol.
- Native integration with Unity Catalog for governance and auditing
- Simplified setup using sharing identifiers (no credential files to manage)
- Auditing and usage tracking for both providers and recipients
- Support for all asset types: tables, views, volumes, notebooks, models, and MCP
- Improved read performance with history sharing
- Seamless experience within the Databricks ecosystem
Best practices for structured data
Structured data includes tables and views shared through Delta Sharing.
Table history and performance
- Prefer sharing tables WITH HISTORY for improved read performance (default on DBR >= 16.2)
- Partitioned tables do not gain performance benefits from history sharing
- Enable Change Data Feed (CDF) before sharing WITH HISTORY if recipients need incremental consumption, table_changes(), time travel, or streaming
- Be aware that history sharing grants recipients access to the Delta log, which includes commit history and deleted data not yet vacuumed
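As a rough sketch of the history and CDF guidance above, a provider might run something like the following. The names my_share and catalog.schema.sales are placeholders; shared_catalog (a catalog the recipient created from the share) and the starting version 2 are hypothetical.

```sql
-- Provider: enable Change Data Feed before sharing the table WITH HISTORY.
ALTER TABLE catalog.schema.sales
SET TBLPROPERTIES (delta.enableChangeDataFeed = true);

-- Provider: add the table to the share together with its history.
ALTER SHARE my_share
ADD TABLE catalog.schema.sales
WITH HISTORY;

-- Recipient: read row-level changes starting at a given table version.
-- shared_catalog and the version number are assumptions for illustration.
SELECT * FROM table_changes('shared_catalog.schema.sales', 2);
```

CDF records changes only from the point at which it is enabled, so the starting version a recipient requests should not predate that commit.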
Table features and compatibility
- For tables with deletion vectors or column mapping, share WITH HISTORY and ensure recipients use DBR >= 14.1
- For streaming workloads, share streaming tables or tables WITH HISTORY to allow Structured Streaming and time travel on the consumer side
- When sharing federated data, prefer materialized views rather than foreign tables for performance and cost efficiency
- Validate table feature compatibility (CDF, DVs, column mapping, V2 metadata) on both provider and recipient compute versions
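One way to start the compatibility check above is to inspect the table on the provider side before adding it to a share. This is a minimal sketch, assuming the placeholder table catalog.schema.sales; the exact columns and properties returned depend on the runtime version.

```sql
-- Reports Delta protocol versions and, on recent runtimes, the enabled table features.
DESCRIBE DETAIL catalog.schema.sales;

-- Lists table properties, where settings for CDF, deletion vectors,
-- and column mapping appear when they are enabled.
SHOW TBLPROPERTIES catalog.schema.sales;
```

Comparing this output against the recipient's compute version before sharing helps surface unsupported features early.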
Partitioning and filtering
- Use partitioned sharing to deliver only relevant subsets of data
- For fine-grained access, share dynamic views for row/column filtering
- Do not share tables that already have row filters or column masks applied, because that filtering is not applied to the share
- Recipients may apply their own row filters or column masks once the data has landed on their side
Example of partitioned sharing:
-- The PARTITION clause matches partition column values with = or LIKE, not range operators.
ALTER SHARE my_share
ADD TABLE catalog.schema.sales
PARTITION (region = 'us-west', year = 2024);
Example of a dynamic view with row-level filtering:
CREATE VIEW catalog.schema.sales_filtered AS
SELECT * FROM catalog.schema.sales
WHERE region = current_recipient('region');
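For the dynamic view above to return rows, the recipient needs a matching property, and the view itself has to be added to the share. A minimal sketch, assuming a hypothetical recipient named my_recipient and the placeholder share my_share:

```sql
-- Provider: set the property that current_recipient('region') reads.
-- my_recipient is a hypothetical recipient name.
ALTER RECIPIENT my_recipient
SET PROPERTIES ('region' = 'us-west');

-- Provider: share the filtered view instead of the underlying table.
ALTER SHARE my_share
ADD VIEW catalog.schema.sales_filtered;
```

With this setup, a single view can serve several recipients, each seeing only the rows for its own region property.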
Governance boundaries
- ACLs and lineage do not propagate across metastores
- Recipients can only be granted read access to shared objects
- Manage governance separately on provider and recipient sides
- Maintain least-privilege ownership of shares
- Assign share ownership to a group rather than an individual to ensure continuity
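The ownership and read-only access points above might look like this in practice. This is a sketch only; the group name data-sharing-admins and the recipient my_recipient are hypothetical.

```sql
-- Transfer share ownership to a group so it does not depend on one person.
ALTER SHARE my_share OWNER TO `data-sharing-admins`;

-- Recipients receive read access only.
GRANT SELECT ON SHARE my_share TO RECIPIENT my_recipient;

-- Review which recipients currently have access to the share.
SHOW GRANTS ON SHARE my_share;
```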
Egress and performance
- Plan for egress when sharing across regions or clouds
- Providers may incur egress fees when recipients read data
For mitigation strategies (CDF, Cloudflare R2, replication), see Egress considerations.
D2D operations
- Use system tables (for example, the deltaSharingQueriedTable and generateTemporaryTableCredentials audit events) to monitor share read patterns
- If using CDF, avoid aggressive VACUUM schedules, which can remove files needed for CDF, time travel, or history sharing
- Consider liquid clustering for large or append-heavy shared tables
For general operations guidance (change management, health checks, governance), see the Operations runbook.
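The operational points in this section can be sketched in SQL as follows. This assumes the audit log system table system.access.audit is enabled in the account; the 7-day window, the 30-day retention interval, and the table catalog.schema.events with its event_date column are hypothetical choices, not recommendations from this page.

```sql
-- Monitoring: count sharing-related audit events per day over the last week.
SELECT event_date, action_name, COUNT(*) AS event_count
FROM system.access.audit
WHERE action_name IN ('deltaSharingQueriedTable', 'generateTemporaryTableCredentials')
  AND event_date >= date_sub(current_date(), 7)
GROUP BY event_date, action_name
ORDER BY event_date, action_name;

-- Retention: keep deleted files long enough that VACUUM does not remove
-- data still needed for CDF, time travel, or history sharing.
ALTER TABLE catalog.schema.sales
SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = 'interval 30 days');

-- Layout: enable liquid clustering on a large, unpartitioned, append-heavy table.
ALTER TABLE catalog.schema.events
CLUSTER BY (event_date);
```

The retention interval should be chosen to cover the longest gap a recipient is expected to go between incremental reads.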
Metadata and data labeling
Good comments and consistent tags increase searchability and semantic understanding for humans and AI:
- Well-documented metadata improves natural-language-to-SQL generation (Genie) and data discovery
- High-quality metadata supports AI-ready data products and reduces ambiguity in downstream pipelines
- Use AI-generated comments as a starting point, but require human validation before saving
- Update comments as part of schema evolution to prevent metadata drift
For a detailed checklist, see AI readiness.
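Comments and tags like those described above can be maintained with ordinary SQL, which makes it easier to update them alongside schema changes. A minimal sketch; the comment text and tag keys are invented for illustration.

```sql
-- Table-level description.
COMMENT ON TABLE catalog.schema.sales IS
  'Sales transactions by region; one row per order line.';

-- Column-level description.
ALTER TABLE catalog.schema.sales
ALTER COLUMN region COMMENT 'Sales region code, for example us-west.';

-- Consistent tags to aid search and classification.
ALTER TABLE catalog.schema.sales
SET TAGS ('domain' = 'sales', 'sensitivity' = 'internal');
```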
Additional asset types
Volumes
See Add volumes to a share for setup instructions.
When to use:
- Use volumes for non-tabular assets: PDFs, images, video, audio, binaries, logs, JSON, code files (.sh, .py)
- Use volumes for AI artifacts: training data, embeddings, model weights, vector indexes
- Prefer tables for structured or relational access patterns where SQL semantics and indexing matter
Constraints:
- Volume sharing is D2D only (not available for open sharing)
- Shared volumes are read-only (recipients cannot write, delete, or modify)
- Recipients require READ VOLUME privilege to access files
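The constraints above translate into a short provider and recipient flow. This is a sketch under assumed names: the volume training_assets, the recipient-side catalog shared_catalog, and the group ml-engineers are all hypothetical.

```sql
-- Provider: add a volume to the share.
ALTER SHARE my_share
ADD VOLUME catalog.schema.training_assets;

-- Recipient: grant read access on the shared volume to a group.
GRANT READ VOLUME ON VOLUME shared_catalog.schema.training_assets
TO `ml-engineers`;

-- Recipient: browse files in the shared volume (read-only).
LIST '/Volumes/shared_catalog/schema/training_assets/';
```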
Documentation:
- Add volume comments describing content, purpose, and expected structure
- Treat volumes as governed Unity Catalog objects with UC-managed sharing, privileges, and lineage
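Volume documentation can be recorded the same way as table documentation. A minimal sketch, assuming a hypothetical volume catalog.schema.training_assets and invented comment text:

```sql
-- Describe the content, purpose, and expected layout of the volume.
COMMENT ON VOLUME catalog.schema.training_assets IS
  'Training images and labels; one subdirectory per dataset version.';
```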
What's next
- Learn about D2O sharing patterns for external platforms
- Learn about O2D sharing patterns for consuming external shares
- Explore bi-directional sharing for two-way collaboration
- Set up dynamic views for fine-grained access control
- Configure monitoring to track sharing activity