dbldatagen.constraints.constraint module
This module defines the Constraint base class and the mixin classes used when implementing concrete constraints.
- class Constraint(supportsStreaming=False)
Bases: ABC
Constraint object - base class for predefined and custom constraints
This class is meant for internal use only.
- SUPPORTED_OPERATORS = ['<', '>', '>=', '!=', '==', '=', '<=', '<>']
- property filterExpression
Return the filter expression (as an instance of type Column that evaluates to True or non-True)
- static mkCombinedConstraintExpression(constraintExpressions)
Generate a SQL expression that combines multiple constraints using AND
- Parameters:
constraintExpressions – list of PySpark SQL Column constraint expression objects
- Returns:
combined constraint expression as a PySpark SQL Column object (or None if there are no valid expressions)
- abstract prepareDataGenerator(dataGenerator)
Prepare the data generator to generate data that matches the constraint
This method may modify the data generation rules to meet the constraint
- Parameters:
dataGenerator – Data generation object that will generate the dataframe
- Returns:
modified or unmodified data generator
- property supportsStreaming
Return True if the constraint supports streaming dataframes
- abstract transformDataframe(dataGenerator, dataFrame)
Transform the dataframe to make data conform to constraint if possible
This method should not modify the dataGenerator - but may modify the dataframe
- Parameters:
dataGenerator – Data generation object that generated the dataframe
dataFrame – generated dataframe
- Returns:
modified or unmodified Spark dataframe
The default transformation returns the dataframe unmodified
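The AND-combination described for mkCombinedConstraintExpression can be sketched as follows. This is a minimal illustration, not the library's implementation: `FakeColumn` and `combine_with_and` are hypothetical names, `FakeColumn` stands in for a PySpark Column so the sketch runs without Spark, and treating "valid" as "not None" is an assumption.

```python
from functools import reduce


class FakeColumn:
    """Hypothetical stand-in for a PySpark Column (assumption, not a real API)."""

    def __init__(self, sql):
        self.sql = sql

    def __and__(self, other):
        # PySpark Columns also overload `&` to build a logical AND
        return FakeColumn(f"({self.sql} AND {other.sql})")


def combine_with_and(expressions):
    """Hypothetical helper: AND together every expression that is not None."""
    valid = [expr for expr in expressions if expr is not None]
    if not valid:
        # Mirrors the documented "None if no valid expressions" behavior
        return None
    return reduce(lambda left, right: left & right, valid)


combined = combine_with_and([FakeColumn("id > 0"), None, FakeColumn("id < 100")])
print(combined.sql)
```

Because the helper relies only on the `&` operator, the same reduction would presumably work on real Column objects, though that has not been verified against the library here.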
- class NoFilterMixin
Bases: object
Mixin class to indicate that the constraint has no filter expression
Intended to be used in implementation of the concrete constraint classes.
Use of the mixin class is optional but when used with the Constraint class and multiple inheritance, it will provide a default implementation of the _generateFilterExpression method that satisfies the abstract method requirement of the Constraint class.
When using mixins, place the mixin class first in the list of base classes.
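The reason the mixin must come first can be shown with plain Python. The sketch below uses hypothetical stand-ins (`BaseConstraint` for Constraint, `NoFilter` for NoFilterMixin) rather than the real classes, and assumes only what is stated above: the base class declares `_generateFilterExpression` as abstract and the mixin supplies a default.

```python
from abc import ABC, abstractmethod


class BaseConstraint(ABC):
    """Hypothetical stand-in for Constraint."""

    @abstractmethod
    def _generateFilterExpression(self):
        ...


class NoFilter:
    """Hypothetical stand-in for NoFilterMixin."""

    def _generateFilterExpression(self):
        return None  # no filter expression


class MixinFirst(NoFilter, BaseConstraint):
    """Mixin listed first: its method is found before the abstract one."""


class MixinLast(BaseConstraint, NoFilter):
    """Mixin listed last: the abstract method still wins in the MRO."""


ok = MixinFirst()  # instantiates, the abstract requirement is satisfied

try:
    MixinLast()
    mixin_last_error = None
except TypeError as exc:
    # The class is still considered abstract, so instantiation fails
    mixin_last_error = exc

print([cls.__name__ for cls in MixinFirst.__mro__])
print(type(mixin_last_error).__name__)
```

With the mixin first, attribute lookup reaches the mixin's concrete method before the abstract declaration, so the subclass is no longer abstract. With the order reversed, the abstract method is found first and instantiation raises TypeError.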
- class NoPrepareTransformMixin
Bases: object
Mixin class to indicate that the constraint performs no data generator preparation and no dataframe transformation
Intended to be used in implementation of the concrete constraint classes.
Use of the mixin class is optional but when used with the Constraint class and multiple inheritance, it will provide default implementations of the prepareDataGenerator and transformDataframe methods that satisfy the abstract method requirements of the Constraint class.
When using mixins, place the mixin class first in the list of base classes.
- prepareDataGenerator(dataGenerator)
Prepare the data generator to generate data that matches the constraint
This method may modify the data generation rules to meet the constraint
- Parameters:
dataGenerator – Data generation object that will generate the dataframe
- Returns:
modified or unmodified data generator
- transformDataframe(dataGenerator, dataFrame)
Transform the dataframe to make data conform to constraint if possible
This method should not modify the dataGenerator - but may modify the dataframe
- Parameters:
dataGenerator – Data generation object that generated the dataframe
dataFrame – generated dataframe
- Returns:
modified or unmodified Spark dataframe
The default transformation returns the dataframe unmodified