dbldatagen.spark_singleton module

This module defines the SparkSingleton class.

It is primarily intended for running the test data generator in a standalone environment, for example in unit tests, rather than in a Databricks workspace.

class SparkSingleton

Bases: object

A singleton class that returns a single shared Spark session instance.

classmethod getInstance()

Create a Spark instance for Datalib.

Returns:

A Spark instance
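The "one shared instance" behaviour described for this class can be illustrated without Spark. The following is a generic sketch of the singleton pattern, not dbldatagen's actual implementation; `SessionHolder` and its placeholder object are hypothetical names introduced only for illustration.

```python
class SessionHolder:
    """Hypothetical stand-in for a session singleton; not part of dbldatagen."""

    _instance = None  # cached shared instance, created on first request

    @classmethod
    def getInstance(cls):
        # Create the (normally expensive) object once, then keep reusing it.
        if cls._instance is None:
            cls._instance = object()  # placeholder for a real session object
        return cls._instance


first = SessionHolder.getInstance()
second = SessionHolder.getInstance()
print(first is second)  # both names refer to the same cached object
```

With a real Spark session the same idea means that code in different test modules can ask for the session independently and still share one underlying instance.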

classmethod getLocalInstance(appName='new Spark session', useAllCores=True)

Create a machine-local Spark instance for Datalib. The session uses either all n available cores or n-1 of them, where n is the total number of cores on the machine. With the default useAllCores=True shown in the signature, all cores are used.

Parameters:

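appName – Name given to the Spark session (defaults to 'new Spark session')

useAllCores – If True, use all available cores; if False, use n-1 cores, where n is the total number of cores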

Returns:

A Spark instance
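As a rough illustration of the core-count rule and of a typical call from a unit-test setup, consider the sketch below. The helpers `pick_core_count` and `make_test_session` are hypothetical names, not part of dbldatagen, and the floor of one core is an assumption of this sketch. Only the `getLocalInstance` call and its two parameters come from the signature documented above.

```python
import os


def pick_core_count(use_all_cores=True, total_cores=None):
    """Hypothetical helper mirroring the documented rule: n cores, or n-1."""
    n = total_cores if total_cores is not None else (os.cpu_count() or 1)
    # Assumption of this sketch: never go below one core on a single-core machine.
    return n if use_all_cores else max(n - 1, 1)


def make_test_session(app_name="unit tests"):
    """Request a local session that leaves one core free for other work."""
    # Imported lazily so this sketch loads even where pyspark and dbldatagen
    # are not installed; calling the function does require both packages.
    from dbldatagen.spark_singleton import SparkSingleton

    return SparkSingleton.getLocalInstance(appName=app_name, useAllCores=False)


print(pick_core_count(True, total_cores=8))   # 8
print(pick_core_count(False, total_cores=8))  # 7
```

Passing useAllCores=False may be preferable on a shared developer machine or CI runner, where leaving one core free keeps other processes responsive.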