
Working with Unity Catalog Volumes
Source: vignettes/working-with-volumes.Rmd
brickster includes two groups of volume functions:
- db_uc_volumes_*: manage volume objects in Unity Catalog (create, list, update, delete)
- db_volume_*: work with files and directories inside an existing volume
In day-to-day workflows, once a volume exists, you will spend most of
your time with db_volume_*.
Which Function to Use
This table is focused on filesystem operations inside an existing volume:
| Operation | Scope | Function | Notes |
|---|---|---|---|
| Upload file | Single file | db_volume_write() | Upload one local file to a volume path |
| Download file | Single file | db_volume_read() | Download one volume file to local disk |
| Delete file | Single file | db_volume_delete() | Remove one file from a volume |
| Check file exists | Single file | db_volume_file_exists() | Returns TRUE/FALSE |
| List contents | Directory | db_volume_list() | Lists files/subdirectories for a directory |
| Create directory | Directory | db_volume_dir_create() | Creates target directory path |
| Check directory exists | Directory | db_volume_dir_exists() | Returns TRUE/FALSE |
| Delete directory | Directory | db_volume_dir_delete() | Use recursive = TRUE for non-empty directories |
| Upload directory | Bulk transfer | db_volume_upload_dir() | Parallel upload; recursive = TRUE includes subdirectories |
| Download directory | Bulk transfer | db_volume_download_dir() | Parallel download; recursive = TRUE includes subdirectories |
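Several of the single-file functions above are often combined. The following is a minimal sketch, not part of brickster, of an "upload only if missing" pattern. `upload_if_missing()` and its `exists_fn`/`write_fn` arguments are hypothetical names introduced here; the defaults point at `db_volume_file_exists()` and `db_volume_write()`, called with the same argument names used elsewhere in this vignette. It assumes `db_volume_write()` accepts `overwrite = FALSE`.

```r
# Hypothetical helper (not part of brickster): upload a local file only
# when nothing exists at the target volume path yet.
# `exists_fn` and `write_fn` default to the brickster functions from the
# table above; they are arguments so the logic can be exercised with stubs
# when no Databricks workspace is available.
upload_if_missing <- function(local_file, volume_path,
                              exists_fn = brickster::db_volume_file_exists,
                              write_fn = brickster::db_volume_write) {
  stopifnot(file.exists(local_file))

  # skip the upload when the target file is already present
  if (isTRUE(exists_fn(volume_path))) {
    message("Skipping upload, already present: ", volume_path)
    return(invisible(FALSE))
  }

  # assumption: overwrite = FALSE is accepted by db_volume_write()
  write_fn(path = volume_path, file = local_file, overwrite = FALSE)
  invisible(TRUE)
}
```

The function returns `TRUE` (invisibly) when an upload was attempted and `FALSE` when it was skipped. Against a real workspace it would be called as `upload_if_missing(local_file, incoming_file)`.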
Example Workflows
Single File Round-Trip
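Use this pattern when you just need to move one file in and out of a volume. The example goes beyond the minimum: besides upload and download, it also creates the target directory, checks that the file exists, lists the directory, and cleans up.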
library(brickster)

volume_root <- "/Volumes/zacdav/default/data"
incoming_dir <- file.path(volume_root, "incoming")
incoming_file <- file.path(incoming_dir, "example.csv")

# create local file
local_file <- tempfile(fileext = ".csv")
write.csv(mtcars, local_file, row.names = FALSE)

# ensure target directory exists
db_volume_dir_create(incoming_dir)

# upload file
db_volume_write(
  path = incoming_file,
  file = local_file,
  overwrite = TRUE
)

# verify + inspect
db_volume_file_exists(incoming_file)
db_volume_list(incoming_dir)

# download file back to local path
downloaded_file <- tempfile(fileext = ".csv")
db_volume_read(
  path = incoming_file,
  destination = downloaded_file
)

# verify that file can be read as csv
read.csv(downloaded_file)

# clean up (optional)
db_volume_delete(incoming_file)
db_volume_dir_delete(incoming_dir)
Bulk Directory Transfer (Upload + Download)
This is a compact pattern for a larger transfer: resample a local dataset (with replacement) to 10 million rows, write a 2-level partitioned Arrow dataset, upload it, then download it back.
library(brickster)
library(arrow)
library(dplyr)

volume_root <- "/Volumes/zacdav/default/data"
landing_dir <- file.path(volume_root, "sample_10m")
local_dir <- tempfile("arrow_sample_")

# sample to 10M rows
# write partitioned Arrow dataset (2 levels deep: cyl/gear)
mtcars |>
  sample_n(size = 1e+07, replace = TRUE) |>
  write_dataset(
    path = local_dir,
    format = "parquet",
    partitioning = c("cyl", "gear")
  )

# bulk upload
db_volume_upload_dir(
  local_dir = local_dir,
  volume_dir = landing_dir,
  overwrite = TRUE,
  recursive = TRUE
)

# bulk download
local_download <- tempfile("arrow_download_")
db_volume_download_dir(
  volume_dir = landing_dir,
  local_dir = local_download,
  overwrite = TRUE,
  recursive = TRUE
)
list.files(local_download, recursive = TRUE)

# cleanup example directory recursively (optional)
db_volume_dir_delete(
  path = landing_dir,
  recursive = TRUE
)
Set recursive = FALSE for non-recursive transfer: only
files directly under the source directory are transferred, and nested
subdirectories are skipped.
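To make the distinction concrete, here is a local illustration using only base R. It does not call brickster and is not how the package selects files internally; it only shows which files the description above implies would be included for each setting. The directory and file names are made up for the example.

```r
# Build a small local directory tree with one nested subdirectory
src <- tempfile("recursive_demo_")
dir.create(file.path(src, "nested"), recursive = TRUE)
file.create(file.path(src, "top.csv"))
file.create(file.path(src, "nested", "inner.csv"))

# recursive = TRUE: every file at any depth (paths relative to src)
all_files <- list.files(src, recursive = TRUE)

# recursive = FALSE: only files directly under the source directory;
# list.files() also returns subdirectory names here, so drop them
top_level <- list.files(src, recursive = FALSE)
top_files <- top_level[!dir.exists(file.path(src, top_level))]

print(all_files)
print(top_files)
```

Here `top_files` contains only `top.csv`, while `all_files` also includes `nested/inner.csv`.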
Managing Volume Objects (Optional)
Use db_uc_volumes_* when you need to create or manage
the volume object itself (not files inside it).
| Operation | Function | Notes |
|---|---|---|
| List volumes in schema | db_uc_volumes_list() | Returns volumes under <catalog>.<schema> |
| Get one volume | db_uc_volumes_get() | Returns metadata for one volume |
| Create volume | db_uc_volumes_create() | Supports MANAGED and EXTERNAL |
| Update volume metadata | db_uc_volumes_update() | Rename/comment/owner updates |
| Delete volume | db_uc_volumes_delete() | Removes the Unity Catalog volume object |
# list volumes in a schema
db_uc_volumes_list(catalog = "<catalog>", schema = "<schema>")

# create a managed volume
db_uc_volumes_create(
  catalog = "<catalog>",
  schema = "<schema>",
  volume = "my_volume",
  volume_type = "MANAGED"
)

# inspect one volume
db_uc_volumes_get(
  catalog = "<catalog>",
  schema = "<schema>",
  volume = "my_volume"
)
After a volume exists, use
/Volumes/<catalog>/<schema>/<volume>/...
paths with db_volume_* for file operations.
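As a small convenience, the path convention above can be wrapped in a helper. This is a sketch only: `volume_path()` is a hypothetical function, not part of brickster, and the catalog, schema, and volume names in the usage line are placeholders.

```r
# Hypothetical helper (not part of brickster): build a path of the form
# /Volumes/<catalog>/<schema>/<volume>/... from its components.
volume_path <- function(catalog, schema, volume, ...) {
  file.path("/Volumes", catalog, schema, volume, ...)
}

# placeholder names; substitute your own catalog, schema, and volume
example_path <- volume_path("main", "default", "my_volume", "incoming", "example.csv")
print(example_path)
```

The resulting string can be passed as the `path` argument to functions such as `db_volume_write()` or `db_volume_file_exists()`.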