This feature is in Private Preview. These docs are shared under NDA and must not be redistributed. Feature behavior, APIs, and requirements may change before General Availability.
Building Databricks Apps for Marketplace
Best practices for ISVs and data providers building closed-source Databricks Apps for distribution via Databricks Marketplace.
Marketplace apps are installed by consumers into their workspaces — workspaces you don't control. That changes the bar on auth, networking, configuration, supply chain, and packaging. This guide collects the practices that survive that constraint, grounded in Databricks public docs and field experience.
1. Authentication and identity
Prefer the App Service Principal over PATs — always
Every Databricks App is automatically provisioned a dedicated service principal (SP). Credentials are injected as DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET. Use the SDK with these — never PATs. PATs leak, expire on rotation, and don't carry the per-app identity that the consumer's workspace admin needs to govern.
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # picks up DATABRICKS_CLIENT_ID / DATABRICKS_CLIENT_SECRET automatically
```
Use On-Behalf-Of-User (OBO) for anything touching consumer data
When the app reads or writes the consumer's data, route through the user's identity. Databricks forwards the user token in the x-forwarded-access-token HTTP header. OBO ensures Unity Catalog row filters, column masks, and table ACLs are enforced.
Rule of thumb:
| Identity | When to use |
|---|---|
| App SP | App-owned operations: telemetry, internal jobs, calling your own services |
| OBO (user token) | Any access to the consumer's UC data, warehouses, or serving endpoints |
Known SDK gotcha when combining OBO with WorkspaceClient
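Passing the forwarded user token to WorkspaceClient(token=user_token) fails with "more than one authorization method configured: oauth and pat". The SDK treats token= as a PAT, and it also picks up the app's injected DATABRICKS_CLIENT_ID / DATABRICKS_CLIENT_SECRET from the environment, so it sees two competing credentials.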
Patterns that work:
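- For SQL: pass the token directly to sql.connect(..., access_token=request.headers.get("x-forwarded-access-token"))
- For SDK calls under OBO: construct a fresh client that names the auth type explicitly, for example Config(host=..., token=user_token, auth_type="pat"), so the SDK uses the forwarded token as a bearer token and ignores the injected client credentials (auth_type="oauth-m2m" would authenticate as the app SP instead), or use databricks.sdk.config.Config with a credential strategy override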
Test against the docs example before shipping.
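The SQL pattern above can be sketched as follows. This is a minimal sketch, not a reference implementation: it assumes the databricks-sql-connector package, and `get_user_token` and `sql_connect_kwargs` are hypothetical helper names. The hostname and HTTP path are placeholders that you would read from your bound resources.

```python
from typing import Mapping

OBO_HEADER = "x-forwarded-access-token"


def get_user_token(headers: Mapping[str, str]) -> str:
    """Return the forwarded user token, matching the header name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == OBO_HEADER and value:
            return value
    # Fail closed: never fall back to the app SP for consumer data access.
    raise PermissionError(f"Missing {OBO_HEADER} header; no user identity for this request")


def sql_connect_kwargs(headers: Mapping[str, str], server_hostname: str, http_path: str) -> dict:
    """Build keyword arguments for databricks.sql.connect under the user's identity."""
    return {
        "server_hostname": server_hostname,
        "http_path": http_path,
        "access_token": get_user_token(headers),
    }


# Usage inside a request handler (requires databricks-sql-connector):
#   from databricks import sql
#   with sql.connect(**sql_connect_kwargs(request.headers, host, path)) as conn:
#       ...
```

Raising when the header is missing, rather than silently using the app SP, keeps consumer data access on the user's identity.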
Declare the minimum scopes
Default scopes are intentionally narrow (iam.access-control:read, iam.current-user:read). Request only what the app actually needs. Consumers' security reviews flag broad scopes, and install conversion drops every time you add one.
Multi-tenancy and user isolation
For OBO apps, implement proper user isolation. Two users hitting the same app instance must not see each other's data. Don't cache OBO-scoped query results in process-global state.
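If some caching is unavoidable, one option is to make the user identity part of every cache key so that one user's results can never be served to another. The sketch below is illustrative only: `PerUserCache` is a hypothetical class, and how you derive `user_id` (for example, from a forwarded identity header) is an assumption left to your app. A production version would also need expiry, since the user's permissions can change.

```python
from typing import Any, Callable, Dict, Tuple


class PerUserCache:
    """In-memory cache whose keys always include the requesting user's identity."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], Any] = {}

    def get_or_compute(self, user_id: str, key: str, compute: Callable[[], Any]) -> Any:
        if not user_id:
            # Refuse to cache anonymous results: they could leak across users.
            raise ValueError("user_id is required for OBO-scoped caching")
        cache_key = (user_id, key)
        if cache_key not in self._store:
            self._store[cache_key] = compute()
        return self._store[cache_key]


cache = PerUserCache()
alice_rows = cache.get_or_compute("alice@example.com", "sales_summary", lambda: ["row for alice"])
bob_rows = cache.get_or_compute("bob@example.com", "sales_summary", lambda: ["row for bob"])
```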
Never log or persist tokens
Ensure that tokens are not printed, logged, or written to files. Strip auth headers from any error reporting or trace export.
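One way to enforce this is to redact at the logging boundary instead of relying on every call site. The following is a sketch under assumptions: the header list and the bearer-token pattern are illustrative rather than exhaustive, and `redact_headers` and `RedactTokensFilter` are hypothetical names.

```python
import logging
import re
from typing import Mapping

SENSITIVE_HEADERS = {"authorization", "x-forwarded-access-token", "cookie"}
# Matches "Bearer <token>" so tokens embedded in free-form messages are masked too.
BEARER_RE = re.compile(r"(?i)(bearer\s+)[A-Za-z0-9._~+/=-]+")


def redact_headers(headers: Mapping[str, str]) -> dict:
    """Return a copy of the headers that is safe to attach to error reports."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


class RedactTokensFilter(logging.Filter):
    """Mask bearer tokens in the fully formatted log message."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = BEARER_RE.sub(r"\1[REDACTED]", record.getMessage())
        record.args = ()  # message is already formatted
        return True


# Attach the filter to the handlers your app actually uses.
handler = logging.StreamHandler()
handler.addFilter(RedactTokensFilter())
```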
Audit-log OBO actions
For every action you take on behalf of a user, record a structured log line: user identity, action, target resource, status. Consumers' compliance teams will ask.
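A structured audit line with the four fields named above might look like the sketch below. The field names, the `app.audit` logger name, and the helper functions are assumptions for illustration; align them with whatever schema your consumers' compliance teams expect.

```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("app.audit")


def audit_record(user: str, action: str, resource: str, status: str) -> str:
    """Serialize one OBO action as a single JSON log line."""
    return json.dumps(
        {
            "ts": datetime.now(timezone.utc).isoformat(),
            "user": user,          # identity only, never the token
            "action": action,
            "resource": resource,
            "status": status,
        },
        sort_keys=True,
    )


def log_obo_action(user: str, action: str, resource: str, status: str) -> None:
    audit_logger.info(audit_record(user, action, resource, status))


line = audit_record("alice@example.com", "query", "main.sales.orders", "success")
```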
2. Resources and configuration
Declare everything in app.yaml
Your app declares resources (warehouses, secrets, serving endpoints, jobs, Genie spaces, UC tables/volumes) so the consumer's admin can review and bind them at install time. Hardcoding workspace IDs, warehouse IDs, or table paths breaks portability across consumer workspaces.
```yaml
command: ['streamlit', 'run', 'app.py']
env:
  - name: DATABRICKS_WAREHOUSE_ID
    valueFrom: sql_warehouse  # bound by consumer admin at install
  - name: API_KEY
    valueFrom: my-secret  # bound to a consumer-managed secret
```
Use valueFrom, not value, for any secret. value is plain text in your bundle.
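On the app side, the bound values arrive as environment variables. A minimal sketch of reading them at startup and failing fast with a message the consumer's admin can act on is shown here; `AppConfig` and `load_config` are hypothetical names, and the variable names match the app.yaml example above.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping

REQUIRED_VARS = ("DATABRICKS_WAREHOUSE_ID", "API_KEY")


@dataclass(frozen=True)
class AppConfig:
    warehouse_id: str
    api_key: str = field(repr=False)  # keep the secret out of reprs and logs


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Read resource bindings from the environment, failing fast if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required app resources: " + ", ".join(missing)
            + ". Bind them in the app's resource configuration and redeploy."
        )
    return AppConfig(warehouse_id=env["DATABRICKS_WAREHOUSE_ID"], api_key=env["API_KEY"])
```

Calling `load_config()` once at startup turns a missing binding into a clear install-time error instead of a failure on the first user request.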
Use the consumer's secret scope, not yours
Any API keys, license keys, or third-party credentials your app depends on should bind to a consumer-owned secret scope at install time — not be baked in from your dev environment. Declare a secret resource in app.yaml; the consumer's admin populates it with their own credentials at install.
This ensures:
- The consumer manages rotation and revocation on their own schedule
- You never see or store their keys
- Each consumer uses their own provider account (Stripe, OpenAI, etc.) — important for billing isolation and quota separation
Document in your install instructions exactly which secrets the consumer must provide and where to obtain them.
Don't share an SP across apps
Each app gets its own SP and that SP cannot be reused. The per-app SP is the unit of governance the consumer operates on.
Apply least privilege per resource
Grant the app SP only the permissions the app needs on the bound resource (e.g., CAN_USE on a warehouse, not CAN_MANAGE). Document what permissions the consumer needs to grant in your install instructions.
3. Build on Databricks-native assets
Your app will run, scale, and govern best when its dependencies stay inside the Databricks ecosystem. Reach for native services first; treat external dependencies as the exception.
| Instead of | Use |
|---|---|
| External Postgres | Lakebase — same operational model, governed by Unity Catalog, no extra egress |
| External model provider | Model Serving and Foundation Model APIs — inference stays inside the consumer's workspace, governed by UC, billed via DBUs |
| External vector DB | Vector Search for embeddings |
| Custom compute infrastructure | Databricks Jobs and DLT for background pipelines |
| External warehouse or object store | Delta tables in Unity Catalog for any analytical outputs |
Every Databricks-native dependency is one fewer external domain to declare for SEG, one fewer credential to manage, one fewer security review with the consumer's team — and your app's value stays entirely inside the lakehouse the customer is already paying for.
4. Networking and external access
Declare every external domain precisely
If your app talks to api.acme.com, cdn.acme.com, or any package registry at runtime, declare those domains in your Marketplace listing's required-egress metadata.
- Consumer workspaces with Serverless Egress Gateway (SEG) policies will block undeclared traffic with no graceful fallback
- The narrower and more reputable your declared domains, the higher your adoption in security-sensitive customers (financial services, healthcare, public sector)
Minimum domains for any app to deploy: *.databricksapps.com, pypi.org, files.pythonhosted.org, registry.npmjs.org (package registries not applicable for scalable apps). AWS deployments often need *.amazonaws.com for S3 / STS.
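To keep the declared list honest, you can check in CI that every outbound URL your code references is covered by a declared domain. The sketch below assumes you maintain the list yourself in code; `DECLARED_DOMAINS`, `is_declared`, and `undeclared` are hypothetical names, and the wildcard matching is a simplification of whatever the listing metadata actually supports.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

# Illustrative list: mirror what you declare in the listing metadata.
DECLARED_DOMAINS = ["api.acme.com", "cdn.acme.com", "*.databricksapps.com"]


def is_declared(url: str, declared=DECLARED_DOMAINS) -> bool:
    """True if the URL's hostname matches one of the declared domain patterns."""
    host = (urlparse(url).hostname or "").lower()
    return any(fnmatch(host, pattern.lower()) for pattern in declared)


def undeclared(urls, declared=DECLARED_DOMAINS) -> list:
    """Return the sorted, de-duplicated URLs whose hosts are not declared."""
    return sorted({u for u in urls if not is_declared(u, declared)})


missing = undeclared(["https://cdn.acme.com/app.js", "https://example.org/data"])
```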
Understand SEG governance at install time
SEG is a workspace-level setting controlled by account admins — not workspace admins, and not the user installing your app.
At install, Databricks performs a pre-check against the consumer's SEG policy and surfaces the specific domains the app needs that are not allowed. The consumer must contact their account admin out-of-band to update SEG.
Test against a SEG-restricted workspace before shipping
The most common Marketplace install failure mode: the app installs successfully but errors on the first request because a runtime domain wasn't allowlisted. Reproduce a restricted-egress workspace in QA, run your app's full happy path, and check system.access.outbound_network for any denied connections you missed.
Don't assume internet at runtime
Pre-fetch what you can at build/deploy time. Avoid runtime calls to pip install, npm install, or remote model downloads — they will fail in restricted workspaces and they slow cold starts everywhere.
Plan for runtime-dynamic dependencies (agentic apps)
If your app is agentic and the set of external destinations depends on user prompts (e.g., "talk to my Slack"), pre-declaration is impossible. Design for runtime failures with actionable error messages: surface the specific domain that failed and tell the user to contact their account admin.
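A sketch of that failure handling: wrap outbound calls so connection-level failures become an error that names the domain. It assumes your HTTP client raises an `OSError` subclass on connection failure (as the standard library's `urllib` does); `EgressBlockedError`, `egress_error_message`, and `call_external` are hypothetical names. The app cannot tell for certain that SEG was the cause, so the message is worded as a likely reason.

```python
from urllib.parse import urlparse


class EgressBlockedError(RuntimeError):
    """Raised when an outbound call fails in a way that suggests blocked egress."""


def egress_error_message(url: str) -> str:
    host = urlparse(url).hostname or url
    return (
        f"Could not reach {host}. If this workspace restricts outbound traffic, "
        f"the domain may be blocked by SEG; contact your account admin to allow {host}."
    )


def call_external(url: str, fetch):
    """Call fetch(url) and convert connection failures into an actionable error."""
    try:
        return fetch(url)
    except OSError as exc:  # includes ConnectionError and TimeoutError
        raise EgressBlockedError(egress_error_message(url)) from exc
```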
5. Source code and updates
Repository standards
- Ship human-readable source code only — no pre-compiled binaries, packaged executables, obfuscated code, or other opaque artifacts
- Host the app in a repository you control with branch protections enabled
- Pin every Marketplace publication to a specific git tag or commit hash, not a moving branch like main
- Include a SECURITY.md with vulnerability reporting procedures
- Include a dependency manifest (requirements.txt, pyproject.toml + lockfile, or package.json + lockfile) so consumers' security teams can scan
Code review
- Two-person approval for any code change going to Marketplace
- Static analysis / linting in CI; block merge on failures
Update lifecycle
- Communicate breaking changes via a deprecation notice flow consumers can subscribe to
- Decide your update cadence and document it in your listing
6. Deployment
Use git-based deployment, not manual upload
Configure the app against a git repository (GitHub / GitLab / Bitbucket). Each databricks apps deploy reads from a specific branch, tag, or commit, which means your deploy is reproducible and reviewable.
```bash
databricks apps create my-app \
  --json '{"git_repository":{"url":"https://github.com/acme/forecast","provider":"gitHub"}}'

databricks apps deploy my-app \
  --json '{"git_source":{"branch":"main"}}'
```
Manual workspace folder uploads (databricks sync + deploy from path) work, but they don't give you the audit trail or rollback story you'll want when a consumer reports a regression.
Use Databricks Asset Bundles (DABs) for environment promotion
Define your app in databricks.yml with targets for dev, staging, and prod. Promote between environments by changing the --target flag, not by editing config.
Pin dependencies
Use requirements.txt with exact pins (or uv.lock / pyproject.toml with a lockfile). A floating dependency is a release that breaks on whatever day PyPI changes upstream — and you won't know which consumer hit it.
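A small CI check can catch floating entries before they ship. This is a rough sketch, assuming a plain requirements.txt with one requirement per line; `unpinned_requirements` is a hypothetical helper and does not handle every pip syntax (URL requirements, hashes, and line continuations are out of scope).

```python
import re

# name, optional [extras], then an exact "==" pin with no wildcard,
# optionally followed by an environment marker.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;*]+\s*(;.*)?$")


def unpinned_requirements(text: str) -> list:
    """Return requirement lines that are not pinned to an exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


sample = """\
flask==3.0.3
requests>=2.0  # floating
# a comment
pandas
numpy==1.*
databricks-sdk[notebook]==0.30.0
"""
floating = unpinned_requirements(sample)
```

In CI, fail the build when the returned list is non-empty.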
Tag every release
Git tag every release that goes to Marketplace. When a consumer files a bug, "what tag are you on" is the first question.
7. Local development and testing
Validate locally before submitting
Use databricks apps run-local to run your app outside the Databricks runtime against a local config. Validate your app.yaml, env injection, and SDK calls before you push to a workspace.
Reproduce the consumer install in a clean workspace
Before submitting to Marketplace QA, install your own app in a workspace with no preexisting setup. Walk through the install path the way a consumer admin would. Most install bugs are caught here — missing permission grants, hardcoded paths, silent assumptions about preexisting catalogs.
Test the OBO and App-SP paths separately
A common mistake: every dev test runs as you (full admin), so OBO failures only surface when a low-privilege consumer user tries the app. Test with a dedicated low-privilege user account.
Test against a SEG-restricted workspace
See section 4, Networking and external access. This is the most important pre-submission test for Marketplace.
8. Observability and operations
Send structured logs to UC
Configure app telemetry to write traces, logs, and metrics to Unity Catalog. Consumers' platform teams want to see what your app is doing in their environment without filing a ticket with you.
Expose health-check endpoints
The Apps REST API exposes app_status.state and compute_status.state (GET /api/.../apps/{name}). Document for consumers how to wire that into their monitoring (PagerDuty, Datadog) and what your "healthy" state looks like.
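A consumer's monitoring job might reduce that response to a single healthy/unhealthy signal along these lines. The sketch assumes the response JSON nests the two fields as `app_status.state` and `compute_status.state`, and it assumes `RUNNING` and `ACTIVE` are the healthy values; confirm both against the current API reference. `is_healthy` and `describe_status` are hypothetical helpers.

```python
# Assumed healthy values; verify against the Apps REST API reference.
HEALTHY_APP_STATE = "RUNNING"
HEALTHY_COMPUTE_STATE = "ACTIVE"


def _state(app: dict, key: str) -> str:
    """Read <key>.state from the app payload, tolerating missing fields."""
    return (app.get(key) or {}).get("state") or "UNKNOWN"


def is_healthy(app: dict) -> bool:
    return (
        _state(app, "app_status") == HEALTHY_APP_STATE
        and _state(app, "compute_status") == HEALTHY_COMPUTE_STATE
    )


def describe_status(app: dict) -> str:
    """One-line summary suitable for an alert title."""
    return f"app={_state(app, 'app_status')} compute={_state(app, 'compute_status')}"


payload = {"name": "my-app", "app_status": {"state": "RUNNING"}, "compute_status": {"state": "ACTIVE"}}
```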
Make errors actionable for the consumer admin
When the app errors, surface a message the consumer's admin can act on: "Required domain api.acme.com is blocked by SEG — contact your account admin." Don't dump a stack trace.
9. Performance and cost
Design for scale-down
App compute is billed per provisioned hour. If traffic is bursty, configure the app to scale down between requests; don't keep large models or warehouse connections hot when the app is idle.
Manage cold starts
Avoid heavy imports and model downloads at startup. Lazy-load anything that isn't needed on the first request path. Cold start is the consumer's first impression.
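One common way to lazy-load is to put the expensive work behind a cached accessor so it runs on first use rather than at import time. In this sketch, `get_model`, `handle_health`, and `handle_predict` are hypothetical, and the dictionary stands in for a real model or client.

```python
import functools
import time


@functools.lru_cache(maxsize=1)
def get_model() -> dict:
    """Load the expensive resource on first use, then reuse the same instance."""
    time.sleep(0.01)  # stand-in for a slow import or model load
    return {"scale": 2}


def handle_health() -> str:
    # First-request path: must not touch the model.
    return "ok"


def handle_predict(x: int) -> int:
    model = get_model()  # loaded here on the first prediction only
    return x * model["scale"]
```

The trade-off is that the first request needing the model pays the load cost, so consider warming it in a background task after startup if that latency matters.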
Cache provider-side state outside the app
If your app calls your own backend, cache there, not inside the per-consumer app instance. App instances are scoped per workspace; they're a bad place to hold state you'd rather have global.
Marketplace readiness checklist
Before submitting your app for QA:
- Built on a supported framework: Python (Streamlit / Dash / Gradio) or Node.js (React / Angular / Svelte / Express)
- Created a readme.md to attach to your app listing
- No PATs anywhere in code or config
- App SP and OBO usage matches the rule of thumb
- All scopes are minimal and documented
- All external domains declared precisely in listing metadata; no "whole internet" entries unless you've explicitly accepted the tradeoff in total addressable market (TAM)
- Verified against a SEG-restricted workspace; install path surfaces actionable messages for blocked domains
- All resources (warehouses, secrets, etc.) declared in app.yaml, none hardcoded
- Secrets bind to consumer-owned scopes at install; no API or license keys baked into the bundle
- Native Databricks services preferred over external equivalents (Lakebase, Model Serving, Vector Search) where possible; external dependencies justified
- Dependencies pinned via lockfile
- SECURITY.md published; vulnerability reporting path documented
- Git-based deploy configured; all Marketplace releases pinned to a specific tag/commit
- Audit logs structured, no tokens in any log path
- Cold start under target threshold (recommend < 10s to first request)
- Health check / status endpoint documented for consumer monitoring teams
- Consumer install flow walked end-to-end in a clean workspace
- Real, functional product — not a POC, demo, or placeholder