databricks.labs.dqx.geo.check_funcs
is_latitude
@register_rule("row")
def is_latitude(column: str | Column) -> Column
Checks whether the values in the input column are valid latitudes.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are valid latitudes
is_longitude
@register_rule("row")
def is_longitude(column: str | Column) -> Column
Checks whether the values in the input column are valid longitudes.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are valid longitudes
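As a rough illustration of what "valid" means for these two checks, here is a plain-Python sketch of the usual coordinate ranges: latitudes in [-90, 90] and longitudes in [-180, 180]. The helper names are hypothetical and this is not the library's implementation. In particular, the sketch treats unparseable values (including None) as invalid, which may differ from how the library handles nulls.

```python
def to_float(value):
    """Return value as a float, or None if it cannot be parsed as a number."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def looks_like_latitude(value) -> bool:
    """True if value parses as a number in the closed range [-90, 90]."""
    number = to_float(value)
    return number is not None and -90.0 <= number <= 90.0


def looks_like_longitude(value) -> bool:
    """True if value parses as a number in the closed range [-180, 180]."""
    number = to_float(value)
    return number is not None and -180.0 <= number <= 180.0
```

NaN and infinite values fail both range comparisons, so the sketch rejects them as well.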
is_geometry
@register_rule("row")
def is_geometry(column: str | Column) -> Column
Checks whether the values in the input column are valid geometries.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are valid geometries
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_geography
@register_rule("row")
def is_geography(column: str | Column) -> Column
Checks whether the values in the input column are valid geographies.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are valid geographies
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_point
@register_rule("row")
def is_point(column: str | Column) -> Column
Checks whether the values in the input column are point geometries.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are point geometries
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_linestring
@register_rule("row")
def is_linestring(column: str | Column) -> Column
Checks whether the values in the input column are linestring geometries.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are linestring geometries
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_polygon
@register_rule("row")
def is_polygon(column: str | Column) -> Column
Checks whether the values in the input column are polygon geometries.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are polygon geometries
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_multipoint
@register_rule("row")
def is_multipoint(column: str | Column) -> Column
Checks whether the values in the input column are multipoint geometries.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are multipoint geometries
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_multilinestring
@register_rule("row")
def is_multilinestring(column: str | Column) -> Column
Checks whether the values in the input column are multilinestring geometries.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are multilinestring geometries
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_multipolygon
@register_rule("row")
def is_multipolygon(column: str | Column) -> Column
Checks whether the values in the input column are multipolygon geometries.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are multipolygon geometries
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_geometrycollection
@register_rule("row")
def is_geometrycollection(column: str | Column) -> Column
Checks whether the values in the input column are geometrycollection geometries.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are geometrycollection geometries
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_ogc_valid
@register_rule("row")
def is_ogc_valid(column: str | Column) -> Column
Checks whether the values in the input column are valid geometries in the OGC sense.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are valid geometries
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_non_empty_geometry
@register_rule("row")
def is_non_empty_geometry(column: str | Column) -> Column
Checks whether the values in the input column are non-empty geometries.
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are empty geometries
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_not_null_island
@register_rule("row")
def is_not_null_island(column: str | Column) -> Column
Checks whether the values in the input column are not NULL island geometries (e.g. POINT(0 0), POINTZ(0 0 0), or POINTZM(0 0 0 0)).
Arguments:
column- column to check; can be a string column name or a column expression
Returns:
Column object indicating whether the values in the input column are NULL island geometries
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
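The NULL island condition can be pictured with a small plain-Python sketch, assuming a point is represented as a tuple of its coordinates. The function name and the tuple representation are illustrative assumptions, not the library's implementation.

```python
def is_null_island(coords) -> bool:
    """True if a point's coordinates are all zero, e.g. (0, 0) or (0, 0, 0, 0)."""
    # A point needs at least x and y; every coordinate present must be zero.
    return len(coords) >= 2 and all(c == 0 for c in coords)
```

A point such as (0, 0) usually signals a failed geocoding step or a missing value that was defaulted to zero, which is why it is worth flagging.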
has_dimension
@register_rule("row")
def has_dimension(column: str | Column, dimension: int) -> Column
Checks whether the geometries/geographies in the input column have a given dimension.
Arguments:
column- column to check; can be a string column name or a column expression
dimension- required dimension of the geometries/geographies
Returns:
Column object indicating whether the geometries/geographies in the input column have a given dimension
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
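The sketch below assumes "dimension" here means the topological dimension in the OGC sense (0 for points, 1 for lines, 2 for polygons). The source does not say so explicitly. The mapping and the `wkt_dimension` helper are hypothetical and only read the type name of a WKT string.

```python
# Topological dimension per geometry type (assumption: OGC-style definition).
TOPOLOGICAL_DIMENSION = {
    "MULTIPOINT": 0,
    "MULTILINESTRING": 1,
    "MULTIPOLYGON": 2,
    "POINT": 0,
    "LINESTRING": 1,
    "POLYGON": 2,
}


def wkt_dimension(wkt: str) -> int:
    """Return the topological dimension implied by the WKT type name."""
    text = wkt.strip().upper()
    for type_name, dimension in TOPOLOGICAL_DIMENSION.items():
        if text.startswith(type_name):
            return dimension
    raise ValueError(f"unsupported geometry type: {wkt!r}")
```

Under this assumption, `has_dimension(column, 2)` would be a check that a column holds only polygonal values.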
has_x_coordinate_between
@register_rule("row")
def has_x_coordinate_between(column: str | Column, min_value: float,
max_value: float) -> Column
Checks whether the x coordinates of the geometries in the input column are within a given range.
Arguments:
column- column to check; can be a string column name or a column expression
min_value- minimum value of the x coordinates
max_value- maximum value of the x coordinates
Returns:
Column object indicating whether the x coordinates of the geometries in the input column are within a given range
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
has_y_coordinate_between
@register_rule("row")
def has_y_coordinate_between(column: str | Column, min_value: float,
max_value: float) -> Column
Checks whether the y coordinates of the geometries in the input column are within a given range.
Arguments:
column- column to check; can be a string column name or a column expression
min_value- minimum value of the y coordinates
max_value- maximum value of the y coordinates
Returns:
Column object indicating whether the y coordinates of the geometries in the input column are within a given range
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
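The two coordinate range checks can be sketched in plain Python for a geometry given as a list of (x, y) tuples. The helper names are hypothetical, and the sketch assumes the bounds are inclusive, which the source does not state.

```python
def coordinates_between(values, min_value: float, max_value: float) -> bool:
    """True if every value lies in the closed range [min_value, max_value]."""
    return all(min_value <= v <= max_value for v in values)


def x_coordinates_between(points, min_value: float, max_value: float) -> bool:
    """Apply the range test to the x coordinate of every (x, y) point."""
    return coordinates_between((x for x, _ in points), min_value, max_value)


def y_coordinates_between(points, min_value: float, max_value: float) -> bool:
    """Apply the range test to the y coordinate of every (x, y) point."""
    return coordinates_between((y for _, y in points), min_value, max_value)
```

Because every vertex must pass, testing the geometry's minimum and maximum coordinate against the bounds gives the same result.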
is_area_equal_to
@register_rule("row")
def is_area_equal_to(column: str | Column,
value: int | float | str | Column,
srid: int | None = 3857,
geodesic: bool = False) -> Column
Checks if the areas of values in a geometry or geography column are equal to a specified value. By default, the 2D Cartesian area is computed in WGS 84 / Pseudo-Mercator (SRID 3857), in square meters. A different SRID can be specified to transform the input values and compute areas in that coordinate reference system's units of measure.
Arguments:
column- Column to check; can be a string column name or a column expression
value- Value to use in the condition as a number, column name, or SQL expression
srid- Optional integer SRID to use for computing the area of the geometry or geography value (default 3857). If an SRID is provided, the input value is transformed and the area is calculated using the units of measure of the specified coordinate reference system (e.g. meters squared for srid=3857).
geodesic- Whether to use the 2D geodesic area (default False).
Returns:
Column object indicating whether the areas of the geometries in the input column are equal to the provided value
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
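To make "2D Cartesian area" concrete, here is a plain-Python sketch using the shoelace formula on a single ring of (x, y) vertices in a projected coordinate system. It stands in for whatever the runtime does internally. The function names and the tolerance parameter are assumptions for illustration.

```python
import math


def ring_area(ring) -> float:
    """Unsigned planar area of a ring given as a list of (x, y) tuples."""
    twice_area = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def area_equals(ring, value: float, rel_tol: float = 1e-9) -> bool:
    """Compare a computed area to a target with a small relative tolerance."""
    return math.isclose(ring_area(ring), value, rel_tol=rel_tol)
```

Computed areas are floating-point numbers, so exact equality is fragile on real data. For most use cases a bound check such as `is_area_not_greater_than` or `is_area_not_less_than` is likely the more robust choice.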
is_area_not_equal_to
@register_rule("row")
def is_area_not_equal_to(column: str | Column,
value: int | float | str | Column,
srid: int | None = 3857,
geodesic: bool = False) -> Column
Checks if the areas of values in a geometry column are not equal to a specified value. By default, the 2D Cartesian area is computed in WGS 84 / Pseudo-Mercator (SRID 3857), in square meters. A different SRID can be specified to transform the input values and compute areas in that coordinate reference system's units of measure.
Arguments:
column- Column to check; can be a string column name or a column expression
value- Value to use in the condition as a number, column name, or SQL expression
srid- Optional integer SRID to use for computing the area of the geometry or geography value (default 3857). If an SRID is provided, the input value is transformed and the area is calculated using the units of measure of the specified coordinate reference system (e.g. meters squared for srid=3857).
geodesic- Whether to use the 2D geodesic area (default False).
Returns:
Column object indicating whether the areas of the geometries in the input column are not equal to the provided value
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_area_not_greater_than
@register_rule("row")
def is_area_not_greater_than(column: str | Column,
value: int | float | str | Column,
srid: int | None = 3857,
geodesic: bool = False) -> Column
Checks if the areas of values in a geometry column are not greater than a specified limit. By default, the 2D Cartesian area is computed in WGS 84 / Pseudo-Mercator (SRID 3857), in square meters. A different SRID can be specified to transform the input values and compute areas in that coordinate reference system's units of measure.
Arguments:
column- Column to check; can be a string column name or a column expression
value- Value to use in the condition as a number, column name, or SQL expression
srid- Optional integer SRID to use for computing the area of the geometry or geography value (default 3857). If an SRID is provided, the input value is transformed and the area is calculated using the units of measure of the specified coordinate reference system (e.g. meters squared for srid=3857).
geodesic- Whether to use the 2D geodesic area (default False).
Returns:
Column object indicating whether the areas of the geometries in the input column are greater than the provided value
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_area_not_less_than
@register_rule("row")
def is_area_not_less_than(column: str | Column,
value: int | float | str | Column,
srid: int | None = 3857,
geodesic: bool = False) -> Column
Checks if the areas of values in a geometry column are not less than a specified limit. By default, the 2D Cartesian area is computed in WGS 84 / Pseudo-Mercator (SRID 3857), in square meters. A different SRID can be specified to transform the input values and compute areas in that coordinate reference system's units of measure.
Arguments:
column- Column to check; can be a string column name or a column expression
value- Value to use in the condition as a number, column name, or SQL expression
srid- Optional integer SRID to use for computing the area of the geometry or geography value (default 3857). If an SRID is provided, the input value is transformed and the area is calculated using the units of measure of the specified coordinate reference system (e.g. meters squared for srid=3857).
geodesic- Whether to use the 2D geodesic area (default False).
Returns:
Column object indicating whether the areas of the geometries in the input column are less than the provided value
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_num_points_equal_to
@register_rule("row")
def is_num_points_equal_to(column: str | Column,
value: int | float | str | Column) -> Column
Checks if the number of coordinate pairs in values of a geometry column is equal to a specified value.
Arguments:
column- Column to check; can be a string column name or a column expression
value- Value to use in the condition as a number, column name, or SQL expression
Returns:
Column object indicating whether the number of coordinate pairs in the geometries of the input column is equal to the provided value
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
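As a rough picture of what "number of coordinate pairs" counts, the sketch below tallies the coordinate tuples in a simple WKT string. The `count_wkt_points` helper is hypothetical and handles only simple, non-nested geometries. It assumes that the closing vertex of a polygon ring is counted like any other, which the source does not confirm.

```python
def count_wkt_points(wkt: str) -> int:
    """Count comma-separated coordinate tuples in a simple WKT geometry."""
    start = wkt.find("(")
    if start == -1:
        # No coordinate list at all, e.g. "POINT EMPTY".
        return 0
    # Drop the parentheses so that rings and parts flatten into one list.
    body = wkt[start:].replace("(", " ").replace(")", " ")
    return sum(1 for part in body.split(",") if part.strip())
```

A check such as `is_num_points_not_less_than(column, 4)` could then be read as requiring at least four coordinate tuples, the minimum for a closed triangular ring.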
is_num_points_not_equal_to
@register_rule("row")
def is_num_points_not_equal_to(column: str | Column,
value: int | float | str | Column) -> Column
Checks if the number of coordinate pairs in values of a geometry column is not equal to a specified value.
Arguments:
column- Column to check; can be a string column name or a column expression
value- Value to use in the condition as a number, column name, or SQL expression
Returns:
Column object indicating whether the number of coordinate pairs in the geometries of the input column is not equal to the provided value
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_num_points_not_greater_than
@register_rule("row")
def is_num_points_not_greater_than(
column: str | Column, value: int | float | str | Column) -> Column
Checks if the number of coordinate pairs in the values of a geometry column is not greater than a specified limit.
Arguments:
column- Column to check; can be a string column name or a column expression
value- Value to use in the condition as a number, column name, or SQL expression
Returns:
Column object indicating whether the number of coordinate pairs in the geometries of the input column is greater than the provided value
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
is_num_points_not_less_than
@register_rule("row")
def is_num_points_not_less_than(column: str | Column,
value: int | float | str | Column) -> Column
Checks if the number of coordinate pairs in values of a geometry column is not less than a specified limit.
Arguments:
column- Column to check; can be a string column name or a column expression
value- Value to use in the condition as a number, column name, or SQL expression
Returns:
Column object indicating whether the number of coordinate pairs in the geometries of the input column is less than the provided value
Notes:
This function requires Databricks serverless compute or runtime 17.1 or above.
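For orientation, here is a sketch of how a few of these functions might be referenced from a declarative check list, assuming DQX's dictionary-based metadata format (`criticality`, plus `check` with `function` and `arguments`). The column names `lat`, `lon` and `geom` and the chosen limits are hypothetical. Applying the checks needs a Spark session and a DQX engine, which are not shown.

```python
# Declarative check definitions (assumed metadata format; column names are made up).
checks = [
    {
        "criticality": "error",
        "check": {"function": "is_latitude", "arguments": {"column": "lat"}},
    },
    {
        "criticality": "error",
        "check": {"function": "is_longitude", "arguments": {"column": "lon"}},
    },
    {
        "criticality": "warn",
        "check": {
            "function": "is_area_not_greater_than",
            # srid and geodesic are left at their signature defaults here.
            "arguments": {"column": "geom", "value": 1_000_000},
        },
    },
    {
        "criticality": "warn",
        "check": {
            "function": "is_num_points_not_less_than",
            "arguments": {"column": "geom", "value": 4},
        },
    },
]

# Collect the referenced function names, e.g. to review which checks are configured.
function_names = [c["check"]["function"] for c in checks]
```

The keys under `arguments` mirror the parameter names in the signatures above, so any parameter with a default can be omitted.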