
D2D sharing

Databricks-to-Databricks (D2D) sharing lets you share data securely with any Databricks user, regardless of account or cloud, as long as they have access to a Unity Catalog-enabled workspace. This page covers best practices for sharing both structured and unstructured data. For step-by-step setup instructions, see Share data using D2D protocol.

  • Native integration with Unity Catalog for governance and auditing
  • Simplified setup using sharing identifiers (no credential files to manage)
  • Auditing and usage tracking for both providers and recipients
  • Support for all asset types: tables, views, volumes, notebooks, models, and MCP
  • Improved read performance with history sharing
  • Seamless experience within the Databricks ecosystem

Best practices for structured data

Structured data includes tables and views shared through Delta Sharing.

Table history and performance

Table features and compatibility

  • For tables with deletion vectors or column mapping, share WITH HISTORY and ensure recipients use Databricks Runtime 14.1 or above
  • For streaming workloads, share streaming tables or tables WITH HISTORY to allow Structured Streaming and time travel on the consumer side
  • When sharing federated data, prefer materialized views rather than foreign tables for performance and cost efficiency
  • Validate table feature compatibility (CDF, DVs, column mapping, V2 metadata) on both provider and recipient compute versions
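As a rough sketch of the points above, the statements below share a table with history and inspect which Delta table features it uses before you compare them against the recipient's compute version. The share and table names are reused from the examples on this page and are placeholders.

```sql
-- Share the table with history so recipients can use time travel,
-- Structured Streaming, and tables with deletion vectors or column mapping.
ALTER SHARE my_share
ADD TABLE catalog.schema.sales
WITH HISTORY;

-- Inspect table properties such as delta.enableDeletionVectors,
-- delta.columnMapping.mode, and delta.enableChangeDataFeed.
SHOW TBLPROPERTIES catalog.schema.sales;

-- Inspect protocol versions and enabled table features.
DESCRIBE DETAIL catalog.schema.sales;
```

Run the inspection statements on the provider side before adding a table to a share, so that any feature the recipient's runtime cannot read is caught early.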

Partitioning and filtering

  • Use partitioned sharing to deliver only relevant subsets of data
  • For fine-grained access, share dynamic views for row/column filtering
  • Do not share tables that already have row filters or column masks applied, because that filtering is not applied to the share
  • Recipients may apply their own row filters or column masks once the data has landed on their side

Example of partitioned sharing:

ALTER SHARE my_share 
ADD TABLE catalog.schema.sales
PARTITION (region = 'us-west', year >= 2024);

Example of a dynamic view with row-level filtering:

CREATE VIEW catalog.schema.sales_filtered AS
SELECT * FROM catalog.schema.sales
WHERE region = current_recipient('region');
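For the dynamic view to return rows, the recipient needs a matching property, and the view itself must be added to the share. A minimal sketch, assuming a recipient named my_recipient (a hypothetical name) and a property key region that matches the view definition:

```sql
-- Set the property that current_recipient('region') reads.
-- my_recipient is a hypothetical recipient name.
ALTER RECIPIENT my_recipient
SET PROPERTIES ('region' = 'us-west');

-- Share the filtered view instead of the underlying table.
ALTER SHARE my_share
ADD VIEW catalog.schema.sales_filtered;
```

With this setup, each recipient sees only the rows whose region matches its own property value, and the base table never needs to be shared directly.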

Governance boundaries

  • ACLs and lineage do not propagate across metastores
  • Recipients can only be granted read access to shared objects
  • Manage governance separately on provider and recipient sides
  • Maintain least-privilege ownership of shares
  • Assign share ownership to a group rather than an individual to ensure continuity
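The ownership and read-only access points above could look like the following on the provider side. This is a sketch: the group name data_sharing_admins and the recipient name my_recipient are hypothetical.

```sql
-- Transfer share ownership to a group for continuity.
ALTER SHARE my_share OWNER TO `data_sharing_admins`;

-- Recipients receive read access only.
GRANT SELECT ON SHARE my_share TO RECIPIENT my_recipient;

-- Review who currently has access to the share.
SHOW GRANTS ON SHARE my_share;
```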

Egress and performance

  • Plan for egress when sharing across regions or clouds
  • Providers may incur egress fees when recipients read data

For mitigation strategies (CDF, Cloudflare R2, replication), see Egress considerations.

D2D operations

  • Use the audit log system table (system.access.audit) and filter on Delta Sharing actions such as deltaSharingQueriedTable and generateTemporaryTableCredentials to monitor share read patterns
  • Avoid aggressive VACUUM schedules on shared tables, especially when using CDF, so that files needed for CDF, time travel, or history sharing are not removed
  • Consider liquid clustering for large or append-heavy shared tables
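A minimal sketch of the monitoring and retention points above. It assumes the audit log system table system.access.audit is enabled in your account and that its action_name and event_date columns are available; the 30-day window and retention interval are illustrative values, not recommendations.

```sql
-- Count Delta Sharing read events per day and action over the last 30 days.
SELECT
  event_date,
  action_name,
  COUNT(*) AS event_count
FROM system.access.audit
WHERE action_name IN ('deltaSharingQueriedTable', 'generateTemporaryTableCredentials')
  AND event_date >= date_sub(current_date(), 30)
GROUP BY event_date, action_name
ORDER BY event_date, action_name;

-- Keep removed data files long enough for CDF, time travel, and history sharing.
-- The interval shown is an assumption; choose one that matches recipient read patterns.
ALTER TABLE catalog.schema.sales
SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = 'interval 30 days');
```

The request_params column of the audit table carries per-event details such as the share and recipient involved; inspect a few rows to see which keys are present before building dashboards on them.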

For general operations guidance (change management, health checks, governance), see the Operations runbook.

Metadata and data labeling

Good comments and consistent tags increase searchability and semantic understanding for humans and AI:

  • Well-documented metadata improves natural-language-to-SQL generation (Genie) and data discovery
  • High-quality metadata supports AI-ready data products and reduces ambiguity in downstream pipelines
  • Use AI-generated comments as a starting point, but require human validation before saving
  • Update comments as part of schema evolution to prevent metadata drift
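As an illustration of the labeling practices above, comments and tags can be maintained in SQL alongside schema changes. The comment text, the order_ts column, and the tag keys and values below are hypothetical.

```sql
-- Table-level comment describing content and purpose.
COMMENT ON TABLE catalog.schema.sales IS
  'Daily sales transactions by region, one row per order line.';

-- Column-level comment (order_ts is a hypothetical column).
ALTER TABLE catalog.schema.sales
ALTER COLUMN order_ts COMMENT 'Order timestamp in UTC.';

-- Consistent tags for discovery (keys and values are illustrative).
ALTER TABLE catalog.schema.sales
SET TAGS ('domain' = 'sales', 'sensitivity' = 'internal');
```

Keeping these statements in the same change set as the schema migration is one way to keep comments from drifting out of date.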

For a detailed checklist, see AI readiness.

Additional asset types

Volumes

See Add volumes to a share for setup instructions.

When to use:

  • Use volumes for non-tabular assets: PDFs, images, video, audio, binaries, logs, JSON, code files (.sh, .py)
  • Use volumes for AI artifacts: training data, embeddings, model weights, vector indexes
  • Prefer tables for structured or relational access patterns where SQL semantics and indexing matter

Constraints:

  • Volume sharing is D2D only (not available for open sharing)
  • Shared volumes are read-only (recipients cannot write, delete, or modify)
  • Recipients require READ VOLUME privilege to access files

Documentation:

  • Add volume comments describing content, purpose, and expected structure
  • Treat volumes as governed Unity Catalog objects with UC-managed sharing, privileges, and lineage
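A short sketch of the volume guidance above, with the provider statements first and the recipient statements after. The volume name raw_docs, the provider name my_provider, the catalog name shared_catalog, and the group analysts are all hypothetical.

```sql
-- Provider: document the volume, then add it to the share.
COMMENT ON VOLUME catalog.schema.raw_docs IS
  'Scanned contracts as PDF, one folder per year.';

ALTER SHARE my_share
ADD VOLUME catalog.schema.raw_docs;

-- Recipient: mount the share as a catalog and grant read access to the files.
CREATE CATALOG IF NOT EXISTS shared_catalog
USING SHARE my_provider.my_share;

GRANT READ VOLUME ON VOLUME shared_catalog.schema.raw_docs TO `analysts`;
```

Recipients may also need USE CATALOG and USE SCHEMA on the parent objects before the READ VOLUME grant takes effect for them.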

What's next