Business intelligence
A well-architected BI integration on Databricks is anchored on governed access through Unity Catalog, high-performance serving with SQL warehouses, and consistent semantics via Unity Catalog metric views and table metadata. Partners should also adopt:
- OAuth for authentication
- Query tags for attribution
- CloudFetch for performance
- Bring-your-own lineage for traceability
- AI Functions for enhanced analytics
- Genie for agentic, natural-language experiences
SQL serving layer
Requirements
- Use Databricks SQL warehouses as the dedicated BI serving engine; they are purpose-built for scalable, low-latency, high-concurrency performance.
- Provide an option for query and predicate pushdown.
- Handle the extended startup times of classic warehouses gracefully (i.e., your application should not time out with an error under its default connection settings); see the connection sketch after this list.
- Recommend serverless warehouses over classic warehouses in your documentation.
Documentation: SQL Warehouse Types
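A minimal sketch of graceful startup handling with the Databricks SQL Connector for Python: the first connection retries with backoff while a classic warehouse starts instead of failing on a default client timeout. The hostname, HTTP path, token, and retry values are illustrative placeholders, not prescribed settings.

```python
# Sketch: tolerate classic-warehouse cold starts by retrying the first
# connection and probe query with backoff instead of failing on a default
# client timeout. Hostname, HTTP path, and token are placeholders.
import time

from databricks import sql

SERVER_HOSTNAME = "<workspace-host>.cloud.databricks.com"
HTTP_PATH = "/sql/1.0/warehouses/<warehouse-id>"
ACCESS_TOKEN = "<token-or-oauth-managed-credential>"


def connect_with_retry(max_attempts: int = 6, backoff_seconds: int = 30):
    """Keep retrying while the warehouse is starting rather than erroring out."""
    for attempt in range(1, max_attempts + 1):
        try:
            conn = sql.connect(
                server_hostname=SERVER_HOSTNAME,
                http_path=HTTP_PATH,
                access_token=ACCESS_TOKEN,
            )
            cursor = conn.cursor()
            cursor.execute("SELECT 1")  # lightweight probe; blocks until the warehouse is up
            cursor.close()
            return conn
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(backoff_seconds)  # warehouse may still be provisioning


conn = connect_with_retry()
```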
Best practices
- Use Databricks drivers for managed large result retrieval and caching (CloudFetch).
Documentation: CloudFetch | Query Caching
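A minimal sketch of driver-managed large result retrieval, assuming the Databricks SQL Connector for Python; `use_cloud_fetch` is shown explicitly for clarity, and the connection details and table name are placeholders.

```python
# Sketch: let the Databricks driver manage large result retrieval (CloudFetch)
# and stream results back as Arrow batches. Connection values are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="<workspace-host>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    access_token="<token>",
    use_cloud_fetch=True,  # driver downloads large results in parallel from cloud storage
) as conn:
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM main.sales.fact_orders")  # placeholder table
    while True:
        batch = cursor.fetchmany_arrow(100_000)  # Arrow batch retrieved via CloudFetch links
        if batch.num_rows == 0:
            break
        # hand the Arrow batch to the BI engine's ingestion layer here
    cursor.close()
```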
Supported but not preferred
- The SQL Execution API and SDKs require explicit handling for large result retrieval via EXTERNAL_LINKS.
- For asynchronous execution, consider the JDBC driver instead.
Documentation: SQL Execution API | JDBC Driver
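A minimal sketch of the EXTERNAL_LINKS flow with the SQL Statement Execution API over plain REST; the host, token, warehouse ID, and query are placeholders, and production code should also page through additional result chunks.

```python
# Sketch: EXTERNAL_LINKS disposition with the SQL Statement Execution API.
# Results are returned as presigned cloud-storage URLs that the client
# downloads itself. Host, token, and warehouse_id are placeholders.
import time

import requests

HOST = "https://<workspace-host>.cloud.databricks.com"
HEADERS = {"Authorization": "Bearer <token>"}

resp = requests.post(
    f"{HOST}/api/2.0/sql/statements",
    headers=HEADERS,
    json={
        "warehouse_id": "<warehouse-id>",
        "statement": "SELECT * FROM main.sales.fact_orders",  # placeholder query
        "disposition": "EXTERNAL_LINKS",  # large results via presigned links
        "format": "ARROW_STREAM",
        "wait_timeout": "30s",
    },
).json()

# Poll until the statement finishes.
while resp["status"]["state"] in ("PENDING", "RUNNING"):
    time.sleep(5)
    resp = requests.get(
        f"{HOST}/api/2.0/sql/statements/{resp['statement_id']}", headers=HEADERS
    ).json()

# Each chunk carries an external link (presigned URL) to download directly,
# without the Databricks Authorization header.
for chunk in resp["result"].get("external_links", []):
    data = requests.get(chunk["external_link"]).content  # Arrow IPC stream bytes
```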
Anti-patterns
- Keeping a warehouse permanently warm has serious cost implications. Use the Warehouse API to check warehouse state instead; see the sketch below.
Documentation: Warehouse API
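A minimal sketch of checking warehouse state on demand with the Databricks SDK for Python instead of keeping the warehouse warm; SDK authentication is assumed to be configured in the environment, and the warehouse ID is a placeholder.

```python
# Sketch: check warehouse state on demand instead of keeping it warm.
# Assumes Databricks SDK auth is configured via the environment; the
# warehouse ID is a placeholder.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.sql import State

w = WorkspaceClient()
warehouse = w.warehouses.get(id="<warehouse-id>")

if warehouse.state != State.RUNNING:
    # Surface "warehouse is starting" in the UI rather than pre-warming it;
    # optionally trigger a start and let the driver retry the first query.
    w.warehouses.start(id="<warehouse-id>")
```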
Python code execution
Python execution needs fall into two patterns: interactive (latency-sensitive) and non-interactive (latency-tolerant).
Requirements
Interactive use cases
- Use Databricks Connect for Python with serverless compute for on-demand execution where users expect responsive results.
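A minimal sketch of the interactive pattern with Databricks Connect for Python on serverless compute; workspace authentication is assumed to come from the environment, and the table name is a placeholder.

```python
# Sketch: interactive, latency-sensitive execution over serverless compute
# with Databricks Connect for Python. Assumes databricks-connect is installed
# and workspace auth is configured in the environment; the table is a placeholder.
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.serverless(True).getOrCreate()

# Run a transformation on demand and return results to the calling user.
df = spark.read.table("main.sales.fact_orders")  # placeholder table
summary = df.groupBy("region").count().toPandas()
```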
Non-interactive use cases
- Use the Jobs API with serverless compute for asynchronous execution where latency is acceptable.
Documentation: Databricks Connect | Jobs API
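A minimal sketch of the non-interactive pattern: submitting a one-time run through the Jobs API with the Databricks SDK for Python. The notebook path is a placeholder, and omitting a compute spec is assumed to select serverless jobs compute where the workspace supports it.

```python
# Sketch: non-interactive, latency-tolerant execution via the Jobs API
# (one-time run). Assumes SDK auth from the environment; the notebook path
# is a placeholder, and omitting a cluster spec is assumed to use serverless
# jobs compute where available.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

run = w.jobs.submit(
    run_name="bi-scheduled-extract",
    tasks=[
        jobs.SubmitTask(
            task_key="extract",
            notebook_task=jobs.NotebookTask(notebook_path="/Shared/bi/extract"),
        )
    ],
).result()  # blocks until the run finishes; drop .result() for fire-and-forget

print(run.state.result_state)
```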
Not eligible for validation
- The Command Execution API only supports classic compute, which has long start-up times.
Centralized data governance & metadata consumption
Treat Unity Catalog as the single semantic and descriptive source of truth for modeling and exploration.
Requirements
- Adopt Unity Catalog as the authoritative layer for permissions, identities, auditing, lineage, and data governance, accessed through user-to-machine (U2M) OAuth.
- For long-lived orchestration tasks such as scheduled extracts, use Service Principal OAuth.
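A minimal sketch of both OAuth patterns with the Databricks SQL Connector for Python: U2M OAuth for interactive users and OAuth client credentials for a service principal running scheduled extracts. The hostname, HTTP path, client ID, and secret are placeholders.

```python
# Sketch: U2M OAuth for interactive users and service-principal (M2M) OAuth
# for long-lived orchestration, with the Databricks SQL Connector for Python.
# Hostname, HTTP path, client ID, and secret are placeholders.
from databricks import sql
from databricks.sdk.core import Config, oauth_service_principal

HOSTNAME = "<workspace-host>.cloud.databricks.com"
HTTP_PATH = "/sql/1.0/warehouses/<warehouse-id>"

# Interactive (U2M): the driver runs the browser-based OAuth flow for the user,
# so Unity Catalog permissions and auditing apply to that user's identity.
user_conn = sql.connect(
    server_hostname=HOSTNAME,
    http_path=HTTP_PATH,
    auth_type="databricks-oauth",
)


# Scheduled extracts (M2M): authenticate as a service principal via OAuth
# client credentials instead of a personal access token.
def sp_credentials():
    cfg = Config(
        host=f"https://{HOSTNAME}",
        client_id="<service-principal-client-id>",
        client_secret="<service-principal-secret>",
    )
    return oauth_service_principal(cfg)


sp_conn = sql.connect(
    server_hostname=HOSTNAME,
    http_path=HTTP_PATH,
    credentials_provider=sp_credentials,
)
```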
Best practices
- Token Federation and Workload Identity Federation are the preferred OAuth patterns.
- Use table/column definitions, primary/foreign key relationships, and constraints to drive optimized logical data modeling.
- Read table and column comments to surface business-friendly labels, tooltips, and documentation in your UI.
- Consume tags for domains, PII/sensitivity, regulatory flags, and "certified" or "gold" indicators (see the metadata sketch below).
Documentation: Authentication and Authorization | Metadata & Access Control
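A minimal sketch of consuming Unity Catalog descriptive metadata through the information schema over an authenticated warehouse connection; the catalog, schema, and view/column names are placeholders and should be checked against the information schema reference.

```python
# Sketch: read Unity Catalog descriptive metadata for logical modeling and
# UI labels through the information schema. Catalog/schema names are placeholders.
from databricks import sql

conn = sql.connect(
    server_hostname="<workspace-host>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    auth_type="databricks-oauth",  # U2M OAuth as described above
)
cursor = conn.cursor()

# Column names, types, and business-friendly comments for labels and tooltips.
cursor.execute("""
    SELECT table_name, column_name, data_type, comment
    FROM main.information_schema.columns
    WHERE table_schema = 'sales'
""")
columns = cursor.fetchall()

# Primary/foreign key constraints to drive join paths in the logical model.
cursor.execute("""
    SELECT tc.constraint_type, kcu.table_name, kcu.column_name
    FROM main.information_schema.table_constraints AS tc
    JOIN main.information_schema.key_column_usage AS kcu
      ON tc.constraint_name = kcu.constraint_name
    WHERE tc.constraint_type IN ('PRIMARY KEY', 'FOREIGN KEY')
""")
keys = cursor.fetchall()

# Governance tags: domains, PII/sensitivity, certification indicators.
cursor.execute("""
    SELECT table_name, tag_name, tag_value
    FROM main.information_schema.table_tags
    WHERE schema_name = 'sales'
""")
tags = cursor.fetchall()
```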
Centrally governed business semantics
Unity Catalog metric views require special handling and explicit configuration in the driver connection for metadata reads.
Requirements
Eliminate semantic drift through the following practices:
- Read metric view metadata through driver function calls, SQL SHOW/DESCRIBE commands, or the information schema.
- Make measures and attributes available in their native data types.
- Compile queries using the Databricks MEASURE function for performance-optimized reads with dynamic materializations; see the sketch after this list.
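A minimal sketch of metric view discovery and querying over an authenticated warehouse connection; the metric view, dimension, and measure names are placeholders.

```python
# Sketch: discover and query a Unity Catalog metric view. The metric view,
# dimension, and measure names are placeholders.
from databricks import sql

conn = sql.connect(
    server_hostname="<workspace-host>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    auth_type="databricks-oauth",
)
cursor = conn.cursor()

# Discover measures and dimensions (also exposed via driver metadata calls
# and the information schema).
cursor.execute("DESCRIBE TABLE main.finance.revenue_metrics")
fields = cursor.fetchall()

# Query a measure with MEASURE() so the warehouse can apply its
# performance-optimized reads and dynamic materializations.
cursor.execute("""
    SELECT order_month, MEASURE(total_revenue) AS total_revenue
    FROM main.finance.revenue_metrics
    GROUP BY order_month
""")
rows = cursor.fetchall()
```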
Best practices
- Push updates to metric views through Data Engineering practices.
Documentation: Metric Views
Workload attribution
Best practices
- Allow customers to define query tags for each warehouse connection or dashboard for operational visibility and cost attribution.
Documentation: Query Tags
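A hedged sketch of passing customer-defined tags at connection time through the connector's session configuration; the `query_tags` setting name and key:value format are assumptions to confirm against the Query Tags documentation, and the tag values are placeholders supplied by the customer.

```python
# Sketch: attach customer-defined query tags at connection time for cost
# attribution. The "query_tags" session setting name and key:value format
# are assumptions to verify against the Query Tags documentation.
from databricks import sql

conn = sql.connect(
    server_hostname="<workspace-host>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    auth_type="databricks-oauth",
    session_configuration={
        # key:value pairs surfaced in query history for attribution
        "query_tags": "bi_tool:acme_analytics,dashboard:sales_overview",
    },
)
```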
Partner-provided lineage
Best practices
- Create external metadata objects and relationships for your platform and dashboards using the External Metadata API for end-to-end analytical traceability.
Documentation: Data Lineage | External Lineage
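A hedged sketch of registering a dashboard as an external metadata object and linking it to the table that feeds it; the endpoint paths and request fields are assumptions to confirm against the External Lineage API reference, and the host, token, and names are placeholders.

```python
# Sketch: register a BI dashboard as an external metadata object and relate
# it to the Unity Catalog table it reads, for end-to-end lineage. Endpoint
# paths and request fields are assumptions to confirm against the External
# Lineage API documentation; host, token, and names are placeholders.
import requests

HOST = "https://<workspace-host>.cloud.databricks.com"
HEADERS = {"Authorization": "Bearer <token>"}

# Describe the partner-side asset (e.g., a dashboard) as external metadata.
requests.post(
    f"{HOST}/api/2.0/lineage-tracking/external-metadata",
    headers=HEADERS,
    json={
        "name": "sales_overview_dashboard",
        "system_type": "OTHER",
        "entity_type": "dashboard",
        "url": "https://bi.example.com/dashboards/sales-overview",
    },
)

# Relate the external object to the governed table that feeds it.
requests.post(
    f"{HOST}/api/2.0/lineage-tracking/external-lineage",
    headers=HEADERS,
    json={
        "source": {"table": {"name": "main.sales.fact_orders"}},
        "target": {"external_metadata": {"name": "sales_overview_dashboard"}},
    },
)
```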
AI-extended BI functions
AI Functions are built-in Databricks SQL functions that call LLMs and ML models directly in queries for NLQ, summarization, and reasoning.
Best practices
- Wrap and expose Databricks AI Functions through your platform's functions for NLQ, summarization, forecasting, and automated insights within existing workflows.
Documentation: AI Functions | AI Capabilities
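A minimal sketch of surfacing AI Functions by issuing them as ordinary SQL against the warehouse; the table, column, and word-limit values are placeholders.

```python
# Sketch: expose built-in AI Functions through ordinary warehouse SQL.
# Table and column names are placeholders.
from databricks import sql

conn = sql.connect(
    server_hostname="<workspace-host>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    auth_type="databricks-oauth",
)
cursor = conn.cursor()

cursor.execute("""
    SELECT
        review_id,
        ai_summarize(review_text, 30)     AS summary,   -- ~30-word summary
        ai_analyze_sentiment(review_text) AS sentiment  -- positive/negative/neutral/mixed
    FROM main.sales.customer_reviews
    LIMIT 100
""")
insights = cursor.fetchall()
```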
End-to-end agentic AI
Genie provides user-defined spaces containing data assets and instructions, enabling natural language queries without SQL.
Best practices
- Extend the BI experience by integrating with Genie.
- List available Genie spaces through the API so the BI tool can surface and invoke them.
- Programmatically create spaces via the API when one doesn't exist for a dataset.
Documentation: Genie Spaces | Genie API
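A minimal sketch of integrating Genie through the Databricks SDK for Python: list available spaces, then start a natural-language conversation in one of them. SDK authentication is assumed from the environment, the space ID and question are placeholders, and the space-listing call should be confirmed against the Genie API reference.

```python
# Sketch: surface Genie from the BI tool by listing spaces and starting a
# natural-language conversation via the Databricks SDK. Assumes SDK auth
# from the environment; space ID and question are placeholders.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Let the BI tool surface available Genie spaces to the user.
spaces = w.genie.list_spaces()

# Invoke a space with a natural-language question and wait for the answer.
space_id = "<genie-space-id>"
message = w.genie.start_conversation_and_wait(
    space_id=space_id,
    content="What was total revenue by region last quarter?",
)
print(message.attachments)  # text answer and/or generated query results
```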
What's next
- Review the integration requirements for foundational guidance
- Learn about telemetry and attribution for usage tracking
- Explore other Partner product categories for additional integration patterns