Create a new cluster.
create_cluster.Rd
Creates a new Spark cluster. This method will acquire new instances from the cloud provider if necessary. Note: Databricks may not be able to acquire some of the requested nodes, due to cloud provider limitations (account limits, spot price, etc.) or transient network issues.
Usage
create_cluster(
client,
spark_version,
apply_policy_default_values = NULL,
autoscale = NULL,
autotermination_minutes = NULL,
aws_attributes = NULL,
azure_attributes = NULL,
cluster_log_conf = NULL,
cluster_name = NULL,
cluster_source = NULL,
custom_tags = NULL,
data_security_mode = NULL,
docker_image = NULL,
driver_instance_pool_id = NULL,
driver_node_type_id = NULL,
enable_elastic_disk = NULL,
enable_local_disk_encryption = NULL,
gcp_attributes = NULL,
init_scripts = NULL,
instance_pool_id = NULL,
node_type_id = NULL,
num_workers = NULL,
policy_id = NULL,
runtime_engine = NULL,
single_user_name = NULL,
spark_conf = NULL,
spark_env_vars = NULL,
ssh_public_keys = NULL,
workload_type = NULL
)
clustersCreate(
client,
spark_version,
apply_policy_default_values = NULL,
autoscale = NULL,
autotermination_minutes = NULL,
aws_attributes = NULL,
azure_attributes = NULL,
cluster_log_conf = NULL,
cluster_name = NULL,
cluster_source = NULL,
custom_tags = NULL,
data_security_mode = NULL,
docker_image = NULL,
driver_instance_pool_id = NULL,
driver_node_type_id = NULL,
enable_elastic_disk = NULL,
enable_local_disk_encryption = NULL,
gcp_attributes = NULL,
init_scripts = NULL,
instance_pool_id = NULL,
node_type_id = NULL,
num_workers = NULL,
policy_id = NULL,
runtime_engine = NULL,
single_user_name = NULL,
spark_conf = NULL,
spark_env_vars = NULL,
ssh_public_keys = NULL,
workload_type = NULL
)
Arguments
- client
Required. Instance of DatabricksClient()
- spark_version
Required. The Spark version of the cluster.
- apply_policy_default_values
This field has no description yet.
- autoscale
Parameters needed in order to automatically scale clusters up and down based on load.
- autotermination_minutes
Automatically terminates the cluster after it is inactive for this time in minutes.
- aws_attributes
Attributes related to clusters running on Amazon Web Services.
- azure_attributes
Attributes related to clusters running on Microsoft Azure.
- cluster_log_conf
The configuration for delivering Spark logs to a long-term storage destination.
- cluster_name
Cluster name requested by the user.
- cluster_source
Determines whether the cluster was created by a user through the UI, created by the Databricks Jobs Scheduler, or through an API request.
- custom_tags
Additional tags for cluster resources.
- data_security_mode
Data security mode decides what data governance model to use when accessing data from a cluster.
- docker_image
This field has no description yet.
- driver_instance_pool_id
The optional ID of the instance pool to which the cluster's driver belongs.
- driver_node_type_id
The node type of the Spark driver.
- enable_elastic_disk
Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space.
- enable_local_disk_encryption
Whether to enable LUKS on cluster VMs' local disks.
- gcp_attributes
Attributes related to clusters running on Google Cloud Platform.
- init_scripts
The configuration for storing init scripts.
- instance_pool_id
The optional ID of the instance pool to which the cluster belongs.
- node_type_id
This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster.
- num_workers
Number of worker nodes that this cluster should have.
- policy_id
The ID of the cluster policy used to create the cluster, if applicable.
- runtime_engine
Decides which runtime engine to use.
- single_user_name
Single user name if data_security_mode is SINGLE_USER.
- spark_conf
An object containing a set of optional, user-specified Spark configuration key-value pairs.
- spark_env_vars
An object containing a set of optional, user-specified environment variable key-value pairs.
- ssh_public_keys
SSH public key contents that will be added to each Spark node in this cluster.
- workload_type
This field has no description yet.
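Examples
A minimal sketch, not a definitive recipe: it assumes the databricks package is installed and that DatabricksClient() can authenticate (for example via DATABRICKS_HOST/DATABRICKS_TOKEN environment variables or a configuration profile). The cluster name, Spark version, and node type below are illustrative placeholders; substitute values supported in your workspace. Nested fields such as autoscale are passed as named lists.
library(databricks)

client <- DatabricksClient()

# Request a small autoscaling cluster that terminates after 30 idle minutes.
cluster <- create_cluster(
  client = client,
  cluster_name = "example-cluster",       # hypothetical name
  spark_version = "13.3.x-scala2.12",     # placeholder; use a version available in your workspace
  node_type_id = "i3.xlarge",             # placeholder; node types are cloud-provider specific
  autotermination_minutes = 30,
  autoscale = list(min_workers = 1, max_workers = 3)
)

# The response includes the new cluster's ID.
cluster$cluster_id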