databricks.labs.dqx.geo.check_funcs

is_latitude

@register_rule("row")
def is_latitude(column: str | Column) -> Column

Checks whether the values in the input column are valid latitudes.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are valid latitudes

is_longitude

@register_rule("row")
def is_longitude(column: str | Column) -> Column

Checks whether the values in the input column are valid longitudes.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are valid longitudes
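
The range rules behind is_latitude and is_longitude can be illustrated in plain Python. This is a minimal sketch, not the DQX implementation (which returns a Spark Column). The helper names valid_latitude and valid_longitude are hypothetical, and the sketch assumes the conventional inclusive bounds of -90 to 90 degrees for latitude and -180 to 180 degrees for longitude.

```python
def valid_latitude(value: float) -> bool:
    """Return True if value lies in the conventional latitude range [-90, 90]."""
    return -90.0 <= value <= 90.0


def valid_longitude(value: float) -> bool:
    """Return True if value lies in the conventional longitude range [-180, 180]."""
    return -180.0 <= value <= 180.0


# Hypothetical sample rows of (latitude, longitude); the second row is out of range.
rows = [(51.5, -0.12), (123.0, 181.0)]
flags = [(valid_latitude(lat), valid_longitude(lon)) for lat, lon in rows]
```

Each entry in flags pairs the latitude result with the longitude result for one row, mirroring how a row-level check evaluates every record independently.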

is_geometry

@register_rule("row")
def is_geometry(column: str | Column) -> Column

Checks whether the values in the input column are valid geometries.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are valid geometries

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_geography

@register_rule("row")
def is_geography(column: str | Column) -> Column

Checks whether the values in the input column are valid geographies.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are valid geographies

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_point

@register_rule("row")
def is_point(column: str | Column) -> Column

Checks whether the values in the input column are point geometries.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are point geometries

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_linestring

@register_rule("row")
def is_linestring(column: str | Column) -> Column

Checks whether the values in the input column are linestring geometries.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are linestring geometries

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_polygon

@register_rule("row")
def is_polygon(column: str | Column) -> Column

Checks whether the values in the input column are polygon geometries.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are polygon geometries

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_multipoint

@register_rule("row")
def is_multipoint(column: str | Column) -> Column

Checks whether the values in the input column are multipoint geometries.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are multipoint geometries

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_multilinestring

@register_rule("row")
def is_multilinestring(column: str | Column) -> Column

Checks whether the values in the input column are multilinestring geometries.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are multilinestring geometries

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_multipolygon

@register_rule("row")
def is_multipolygon(column: str | Column) -> Column

Checks whether the values in the input column are multipolygon geometries.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are multipolygon geometries

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_geometrycollection

@register_rule("row")
def is_geometrycollection(column: str | Column) -> Column

Checks whether the values in the input column are geometrycollection geometries.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are geometrycollection geometries

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.
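
The geometry-type checks above (is_point through is_geometrycollection) each test for one geometry type. As a rough illustration only, the sketch below classifies a WKT string by its leading keyword. It assumes WKT text input, which the reference above does not specify; the helper name wkt_type is hypothetical, and the sketch inspects only the keyword without validating the rest of the string.

```python
import re
from typing import Optional

# Known geometry-type keywords, anchored at the start of the string so that
# "MULTIPOINT" is never mistaken for "POINT".
_WKT_TYPE = re.compile(
    r"^\s*(GEOMETRYCOLLECTION|MULTILINESTRING|MULTIPOLYGON|MULTIPOINT"
    r"|LINESTRING|POLYGON|POINT)",
    re.IGNORECASE,
)


def wkt_type(wkt: str) -> Optional[str]:
    """Return the upper-cased geometry type keyword of a WKT string, or None."""
    match = _WKT_TYPE.match(wkt)
    return match.group(1).upper() if match else None


def is_wkt_point(wkt: str) -> bool:
    """Plain-Python analogue of a point-type check (illustrative only)."""
    return wkt_type(wkt) == "POINT"
```

Dimension suffixes such as Z or M after the keyword are ignored here, so both POINT(1 2) and POINTZ(0 0 0) classify as POINT.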

is_ogc_valid

@register_rule("row")
def is_ogc_valid(column: str | Column) -> Column

Checks whether the values in the input column are valid geometries in the OGC sense.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are valid geometries in the OGC sense

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_non_empty_geometry

@register_rule("row")
def is_non_empty_geometry(column: str | Column) -> Column

Checks whether the values in the input column are non-empty geometries.

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are non-empty geometries

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_not_null_island

@register_rule("row")
def is_not_null_island(column: str | Column) -> Column

Checks whether the values in the input column are not null island geometries, that is, points whose coordinates are all zero (e.g. POINT(0 0), POINTZ(0 0 0), or POINTZM(0 0 0 0)).

Arguments:

  • column - column to check; can be a string column name or a column expression

Returns:

Column object indicating whether the values in the input column are not null island geometries

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.
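
The null island rule can be sketched in plain Python on a point's coordinate tuple. This is an illustration under stated assumptions, not the DQX implementation: the helper names are hypothetical, and the point is assumed to be already parsed into a sequence of numbers (x, y, and optionally z and m).

```python
from typing import Sequence


def is_null_island(coords: Sequence[float]) -> bool:
    """Return True if the point has at least x and y and every coordinate is zero."""
    return len(coords) >= 2 and all(c == 0 for c in coords)


def passes_not_null_island(coords: Sequence[float]) -> bool:
    """Analogue of the check: the point passes when it is not at null island."""
    return not is_null_island(coords)
```

A point such as (0, 0) or (0, 0, 0) fails the check, while (0, 51.5) passes because only one of its coordinates is zero.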

has_dimension

@register_rule("row")
def has_dimension(column: str | Column, dimension: int) -> Column

Checks whether the geometries/geographies in the input column have a given dimension.

Arguments:

  • column - column to check; can be a string column name or a column expression
  • dimension - required dimension of the geometries/geographies

Returns:

Column object indicating whether the geometries/geographies in the input column have a given dimension

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.
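
The reference above does not define "dimension". The sketch below assumes it means topological dimension (0 for points, 1 for lines, 2 for polygons), which is an assumption rather than something the source confirms. The lookup table and the helper name has_expected_dimension are hypothetical, and geometry collections are left out because their dimension depends on their members.

```python
# Assumed interpretation: topological dimension per geometry type.
TOPOLOGICAL_DIMENSION = {
    "POINT": 0,
    "MULTIPOINT": 0,
    "LINESTRING": 1,
    "MULTILINESTRING": 1,
    "POLYGON": 2,
    "MULTIPOLYGON": 2,
}


def has_expected_dimension(geometry_type: str, dimension: int) -> bool:
    """Return True if the geometry type has the required topological dimension.

    Unknown geometry types have no table entry and therefore never match.
    """
    return TOPOLOGICAL_DIMENSION.get(geometry_type.upper()) == dimension
```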

has_x_coordinate_between

@register_rule("row")
def has_x_coordinate_between(column: str | Column, min_value: float,
                             max_value: float) -> Column

Checks whether the x coordinates of the geometries in the input column are within a given range.

Arguments:

  • column - column to check; can be a string column name or a column expression
  • min_value - minimum value of the x coordinates
  • max_value - maximum value of the x coordinates

Returns:

Column object indicating whether the x coordinates of the geometries in the input column are within a given range

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

has_y_coordinate_between

@register_rule("row")
def has_y_coordinate_between(column: str | Column, min_value: float,
                             max_value: float) -> Column

Checks whether the y coordinates of the geometries in the input column are within a given range.

Arguments:

  • column - column to check; can be a string column name or a column expression
  • min_value - minimum value of the y coordinates
  • max_value - maximum value of the y coordinates

Returns:

Column object indicating whether the y coordinates of the geometries in the input column are within a given range

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.
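
The coordinate range checks can be pictured as a bounds test over every vertex of a geometry. The sketch below is plain Python, not the DQX implementation. The helper name coordinate_between is hypothetical, the bounds are assumed to be inclusive (the reference above does not say), and an empty coordinate list passes vacuously, which is also an assumption.

```python
from typing import Iterable, Tuple

Coord = Tuple[float, float]


def coordinate_between(coords: Iterable[Coord], axis: int,
                       min_value: float, max_value: float) -> bool:
    """Return True if every coordinate on the given axis lies in [min_value, max_value].

    axis=0 selects x coordinates and axis=1 selects y coordinates.
    """
    return all(min_value <= c[axis] <= max_value for c in coords)


# Hypothetical linestring with two vertices.
line = [(1.0, 5.0), (3.0, 9.0)]
x_ok = coordinate_between(line, 0, 0.0, 4.0)  # both x values are within [0, 4]
y_ok = coordinate_between(line, 1, 0.0, 8.0)  # y = 9.0 exceeds the upper bound
```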

is_area_equal_to

@register_rule("row")
def is_area_equal_to(column: str | Column,
                     value: int | float | str | Column,
                     srid: int | None = 3857,
                     geodesic: bool = False) -> Column

Checks if the areas of values in a geometry or geography column are equal to a specified value. By default, the 2D Cartesian area is computed in WGS 84 / Pseudo-Mercator (SRID 3857), in square meters. A different SRID can be specified to transform the input values and compute areas in that coordinate reference system's units of measure.

Arguments:

  • column - Column to check; can be a string column name or a column expression
  • value - Value to use in the condition as a number, column name, or SQL expression
  • srid - Optional integer SRID to use for computing the area of the geometry or geography value (default 3857). If an SRID is provided, the input value is transformed to it and the area is calculated in the units of measure of the specified coordinate reference system (e.g. square meters for srid=3857).
  • geodesic - Whether to use the 2D geodesic area (default False).

Returns:

Column object indicating whether the areas of the geometries in the input column are equal to the provided value

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.
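
To make "2D Cartesian area" concrete, the sketch below computes the planar area of a simple polygon with the shoelace formula and compares it to a target value. It is an illustration in plain Python, not how DQX computes areas. The helper names polygon_area and area_equal_to are hypothetical, and the tolerance-based comparison is an assumption added because exact equality on floating-point areas is fragile; the reference above does not state whether the check applies a tolerance.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float]


def polygon_area(ring: Sequence[Coord]) -> float:
    """Planar area of a simple polygon from its exterior ring (shoelace formula).

    Works whether or not the ring repeats its first vertex at the end,
    because a zero-length closing edge contributes nothing to the sum.
    """
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def area_equal_to(ring: Sequence[Coord], value: float, rel_tol: float = 1e-9) -> bool:
    """Compare the polygon's area to value, allowing a small relative tolerance."""
    return math.isclose(polygon_area(ring), value, rel_tol=rel_tol)


# A 10 x 10 square in projected units (e.g. meters), closed ring.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0), (0.0, 0.0)]
```

The result is in the square of whatever unit the coordinates use, which is why the choice of SRID determines the unit of the area.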

is_area_not_equal_to

@register_rule("row")
def is_area_not_equal_to(column: str | Column,
                         value: int | float | str | Column,
                         srid: int | None = 3857,
                         geodesic: bool = False) -> Column

Checks if the areas of values in a geometry column are not equal to a specified value. By default, the 2D Cartesian area is computed in WGS 84 / Pseudo-Mercator (SRID 3857), in square meters. A different SRID can be specified to transform the input values and compute areas in that coordinate reference system's units of measure.

Arguments:

  • column - Column to check; can be a string column name or a column expression
  • value - Value to use in the condition as a number, column name, or SQL expression
  • srid - Optional integer SRID to use for computing the area of the geometry or geography value (default 3857). If an SRID is provided, the input value is transformed to it and the area is calculated in the units of measure of the specified coordinate reference system (e.g. square meters for srid=3857).
  • geodesic - Whether to use the 2D geodesic area (default False).

Returns:

Column object indicating whether the areas of the geometries in the input column are not equal to the provided value

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_area_not_greater_than

@register_rule("row")
def is_area_not_greater_than(column: str | Column,
                             value: int | float | str | Column,
                             srid: int | None = 3857,
                             geodesic: bool = False) -> Column

Checks if the areas of values in a geometry column are not greater than a specified limit. By default, the 2D Cartesian area is computed in WGS 84 / Pseudo-Mercator (SRID 3857), in square meters. A different SRID can be specified to transform the input values and compute areas in that coordinate reference system's units of measure.

Arguments:

  • column - Column to check; can be a string column name or a column expression
  • value - Value to use in the condition as a number, column name, or SQL expression
  • srid - Optional integer SRID to use for computing the area of the geometry or geography value (default 3857). If an SRID is provided, the input value is transformed to it and the area is calculated in the units of measure of the specified coordinate reference system (e.g. square meters for srid=3857).
  • geodesic - Whether to use the 2D geodesic area (default False).

Returns:

Column object indicating whether the areas of the geometries in the input column are not greater than the provided value

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_area_not_less_than

@register_rule("row")
def is_area_not_less_than(column: str | Column,
                          value: int | float | str | Column,
                          srid: int | None = 3857,
                          geodesic: bool = False) -> Column

Checks if the areas of values in a geometry column are not less than a specified limit. By default, the 2D Cartesian area is computed in WGS 84 / Pseudo-Mercator (SRID 3857), in square meters. A different SRID can be specified to transform the input values and compute areas in that coordinate reference system's units of measure.

Arguments:

  • column - Column to check; can be a string column name or a column expression
  • value - Value to use in the condition as a number, column name, or SQL expression
  • srid - Optional integer SRID to use for computing the area of the geometry or geography value (default 3857). If an SRID is provided, the input value is transformed to it and the area is calculated in the units of measure of the specified coordinate reference system (e.g. square meters for srid=3857).
  • geodesic - Whether to use the 2D geodesic area (default False).

Returns:

Column object indicating whether the areas of the geometries in the input column are not less than the provided value

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_num_points_equal_to

@register_rule("row")
def is_num_points_equal_to(column: str | Column,
                           value: int | float | str | Column) -> Column

Checks if the number of coordinate pairs in the values of a geometry column is equal to a specified value.

Arguments:

  • column - Column to check; can be a string column name or a column expression
  • value - Value to use in the condition as a number, column name, or SQL expression

Returns:

Column object indicating whether the number of coordinate pairs in the geometries of the input column is equal to the provided value

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_num_points_not_equal_to

@register_rule("row")
def is_num_points_not_equal_to(column: str | Column,
                               value: int | float | str | Column) -> Column

Checks if the number of coordinate pairs in the values of a geometry column is not equal to a specified value.

Arguments:

  • column - Column to check; can be a string column name or a column expression
  • value - Value to use in the condition as a number, column name, or SQL expression

Returns:

Column object indicating whether the number of coordinate pairs in the geometries of the input column is not equal to the provided value

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_num_points_not_greater_than

@register_rule("row")
def is_num_points_not_greater_than(
        column: str | Column, value: int | float | str | Column) -> Column

Checks if the number of coordinate pairs in the values of a geometry column is not greater than a specified limit.

Arguments:

  • column - Column to check; can be a string column name or a column expression
  • value - Value to use in the condition as a number, column name, or SQL expression

Returns:

Column object indicating whether the number of coordinate pairs in the geometries of the input column is not greater than the provided value

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.

is_num_points_not_less_than

@register_rule("row")
def is_num_points_not_less_than(column: str | Column,
                                value: int | float | str | Column) -> Column

Checks if the number of coordinate pairs in the values of a geometry column is not less than a specified limit.

Arguments:

  • column - Column to check; can be a string column name or a column expression
  • value - Value to use in the condition as a number, column name, or SQL expression

Returns:

Column object indicating whether the number of coordinate pairs in the geometries of the input column is not less than the provided value

Notes:

This function requires Databricks serverless compute or runtime 17.1 or above.
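
The four point-count checks differ only in the comparison they apply: "not greater than" reads as less than or equal, and "not less than" reads as greater than or equal. The sketch below shows that mapping in plain Python. It is not the DQX implementation; the COMPARATORS table and the helper name check_num_points are hypothetical, and the geometry is assumed to be already reduced to a list of coordinate pairs.

```python
import operator
from typing import Sequence, Tuple

Coord = Tuple[float, float]

# Hypothetical mapping from rule suffix to the comparison it implies.
COMPARATORS = {
    "equal_to": operator.eq,
    "not_equal_to": operator.ne,
    "not_greater_than": operator.le,  # count <= value
    "not_less_than": operator.ge,     # count >= value
}


def check_num_points(coords: Sequence[Coord], rule: str, value: int) -> bool:
    """Return True if the number of coordinate pairs satisfies the named rule."""
    return COMPARATORS[rule](len(coords), value)


# A closed triangle ring: three vertices plus the repeated first vertex.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (0.0, 0.0)]
```

In this sketch the closing vertex of a ring counts as its own coordinate pair, so the triangle above has four points rather than three.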