dbldatagen.function_builder module
- class ColumnGeneratorBuilder
Bases: object
Helper class to build functional column generators of specific forms
- classmethod mkExprChoicesFn(values, weights, seed_column, datatype)
Create a SQL expression that selects one of the values according to the supplied weights.
The method builds an expression of the form:
case when rnd_column <= weight1 then value1 when rnd_column <= weight2 then value2 ... when rnd_column <= weightN then valueN else valueN end
where the thresholds are based on the probability distribution computed for the values.
From Python 3.6 onwards, the random.choices function could be used instead, but that Python version is not guaranteed on all Databricks distributions.
- Parameters:
values – list of values
weights – list of weights
seed_column – base column for expression
datatype – data type of function return value
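The expression form described above can be illustrated with a minimal sketch. It is not the library's implementation: the helper name `sketch_weighted_case_expr` and its `quote_strings` flag are hypothetical, and the sketch assumes that the thresholds are cumulative normalised weights and that the seed column holds a uniform random number in the range 0 to 1.

```python
from itertools import accumulate


def sketch_weighted_case_expr(values, weights, seed_column, quote_strings=False):
    """Build a SQL CASE expression that picks a value by cumulative weight.

    Hypothetical helper for illustration only; not part of dbldatagen.
    Assumes seed_column holds a uniform random number in [0, 1].
    """
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    # Accumulate the weights and normalise them into cumulative thresholds.
    thresholds = [w / total for w in accumulate(weights)]

    def literal(v):
        # Naive quoting: doubles embedded single quotes for string values.
        if quote_strings:
            return "'" + str(v).replace("'", "''") + "'"
        return str(v)

    clauses = [
        "when {} <= {:.6f} then {}".format(seed_column, t, literal(v))
        for v, t in zip(values, thresholds)
    ]
    # The final value doubles as the fallback branch, as in the form above.
    return "case " + " ".join(clauses) + " else {} end".format(literal(values[-1]))


expr = sketch_weighted_case_expr(
    ["a", "b", "c"], [1, 3, 6], "rnd_column", quote_strings=True
)
print(expr)
```

With weights 1, 3 and 6, the sketch produces thresholds 0.1, 0.4 and 1.0, so a seed value of 0.25 would fall into the second branch and select 'b'.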