Kepler visualizations
You can use the %%mosaic_kepler magic function to visualise data using Kepler.gl.
The mosaic_kepler magic function accepts four parameters:
dataset: Can be a Spark dataset or a string representing a table/view name
column_name: The column that needs to be plotted, can be either a geometry column (WKT, WKB or Mosaic internal format) or a column containing a spatial grid index ID
feature_type: The type of data to be plotted. Valid values are geometry (if SRID=4326), geometry(<SRID>) (where <SRID> is the SRID used by the geometry column), geometry(bng), geometry(osgb36), bng and h3
limit: The maximum number of objects to plot. The default limit is 1000
Usage:
%%mosaic_kepler
dataset column_name feature_type [limit]
This magic function is only available in Python. To use it from a notebook whose default language is not Python, store the intermediate result in a temporary view, then add a Python cell that calls mosaic_kepler on that temporary view.
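To make the argument layout in the usage line concrete, here is a minimal sketch of how such a line could be split and checked. The helper `parse_kepler_args` and the `FEATURE_TYPE` pattern are hypothetical names introduced for illustration; they are not part of Mosaic, and Mosaic's own parsing may differ. The sketch only encodes what is stated above: three required arguments, an optional limit defaulting to 1000, and the listed feature_type values.

```python
import re
import shlex

# Hypothetical pattern covering the feature_type values listed above:
# geometry, geometry(<SRID>), geometry(bng), geometry(osgb36), bng, h3.
FEATURE_TYPE = re.compile(r"^(geometry(\((\d+|bng|osgb36)\))?|bng|h3)$")


def parse_kepler_args(line, default_limit=1000):
    """Split 'dataset column_name feature_type [limit]' into a dict.

    Illustration only; this is not Mosaic's own parser.
    """
    # shlex.split handles the optional double quotes around each argument.
    parts = shlex.split(line)
    if len(parts) not in (3, 4):
        raise ValueError("expected: dataset column_name feature_type [limit]")

    dataset, column_name, feature_type = parts[:3]
    if not FEATURE_TYPE.match(feature_type):
        raise ValueError(f"unsupported feature_type: {feature_type}")

    # The limit is optional and falls back to the documented default.
    limit = int(parts[3]) if len(parts) == 4 else default_limit
    return {
        "dataset": dataset,
        "column_name": column_name,
        "feature_type": feature_type,
        "limit": limit,
    }


args = parse_kepler_args('neighbourhoods "geom_wkt" "geometry"')
print(args)
```

Note that this sketch discards the quotes, so it does not distinguish a dataset variable from a quoted table/view name the way the examples below do.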
Examples
[ ]:
%pip install databricks-mosaic --quiet
[ ]:
from pyspark.sql.functions import *
import mosaic as mos
mos.enable_mosaic(spark, dbutils)
Download example shapes
[ ]:
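import requests
req = requests.get('https://data.cityofnewyork.us/api/geospatial/d3c5-ddgc?method=export&format=GeoJSON')
# Stop early if the download failed, so an error page is not saved as GeoJSON
req.raise_for_status()
with open('/dbfs/tmp/nyc_taxi_zones.geojson', 'wb') as f:
    f.write(req.content)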
[ ]:
neighbourhoods = (
    spark.read
    .option("multiline", "true")
    .format("json")
    .load("dbfs:/tmp/nyc_taxi_zones.geojson")
    # Extract geoJSON values for shapes
    .select("type", explode(col("features")).alias("feature"))
    .select("type", col("feature.properties").alias("properties"), to_json(col("feature.geometry")).alias("geom_json"))
    # Mosaic internal representation
    .withColumn("geom_internal", mos.st_geomfromgeojson("geom_json"))
    # WKT representation
    .withColumn("geom_wkt", mos.st_aswkt(col("geom_internal")))
    # WKB representation
    .withColumn("geom_wkb", mos.st_aswkb(col("geom_internal")))
    # Limit to only 1 shape
    .limit(1)
)
neighbourhoods.show()
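The cell above relies on the GeoJSON FeatureCollection layout: a top-level "type" and a "features" array whose entries each carry "properties" and "geometry". As a rough illustration of what the explode and select steps produce, the following plain-Python sketch flattens a tiny hand-written sample the same way. The sample values (zone name, coordinates) are made up and are not taken from the NYC taxi zones file.

```python
import json

# Tiny hand-written FeatureCollection; the values are made up for illustration.
sample = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"zone": "Example Zone"},
      "geometry": {"type": "Point", "coordinates": [-73.98, 40.75]}
    }
  ]
}
""")

# One output row per feature, mirroring explode("features") followed by
# selecting the top-level type, the properties, and the geometry as JSON text.
rows = [
    {
        "type": sample["type"],
        "properties": feature["properties"],
        "geom_json": json.dumps(feature["geometry"]),
    }
    for feature in sample["features"]
]

for row in rows:
    print(row)
```

The "geom_json" string in each row corresponds to the column that is then passed to mos.st_geomfromgeojson in the Spark version.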
Plot geometries from Spark dataset
Internal geometry type
[ ]:
%%mosaic_kepler
neighbourhoods "geom_internal" "geometry"
WKT geometry type
[ ]:
%%mosaic_kepler
neighbourhoods "geom_wkt" "geometry"
WKB geometry type
[ ]:
%%mosaic_kepler
neighbourhoods "geom_wkb" "geometry"
Plot geometries from table/view
[ ]:
neighbourhoods.createOrReplaceTempView("temp_view_neighbourhoods")
[ ]:
%%mosaic_kepler
"temp_view_neighbourhoods" "geom_wkt" "geometry"
Plot H3 indexes
[ ]:
neighbourhood_chips = (
    neighbourhoods
    .limit(1)
    .select(mos.grid_tessellateexplode("geom_internal", lit(9)))
    .select("index.*")
)
neighbourhood_chips.show()
[ ]:
%%mosaic_kepler
neighbourhood_chips "index_id" "h3"
Plot H3 chips
[ ]:
%%mosaic_kepler
neighbourhood_chips "wkb" "geometry"