Skip to main content

Linter message codes

code compatibility problems

Here's the detailed explanation of the linter message codes:

cannot-autofix-table-reference

This indicates that the linter has found a table reference that cannot be automatically fixed. The user must manually update the table reference to point to the correct table in Unity Catalog. This mostly occurs, when table name is computed dynamically, and it's too complex for our static code analysis to detect it. We detect this problem anywhere where table name could be used: spark.sql, spark.catalog.*, spark.table, df.write.* and many more. Code examples that trigger this problem:

spark.table(f"foo_{some_table_name}")
# ..
df = spark.range(10)
df.write.saveAsTable(f"foo_{some_table_name}")
# .. or even
df.write.insertInto(f"foo_{some_table_name}")

Here the some_table_name variable is not defined anywhere in the visible scope. Though, the analyser would successfully detect table name if it is defined:

some_table_name = 'bar'
spark.table(f"foo_{some_table_name}")

We even detect string constants when coming either from dbutils.widgets.get (via job named parameters) or through loop variables. If old.things table is migrated to brand.new.stuff in Unity Catalog, the following code will trigger two messages: table-migrated-to-uc-{sql,python,python-sql} for the first query, as the contents are clearly analysable, and cannot-autofix-table-reference for the second query.

# ucx[table-migrated-to-uc-python-sql:+4:4:+4:20] Table old.things is migrated to brand.new.stuff in Unity Catalog
# ucx[cannot-autofix-table-reference:+3:4:+3:20] Can't migrate table_name argument in 'spark.sql(query)' because its value cannot be computed
table_name = f"table_{index}"
for query in ["SELECT * FROM old.things", f"SELECT * FROM {table_name}"]:
spark.sql(query).collect()

catalog-api-in-shared-clusters

spark.catalog.* functions require Databricks Runtime 14.3 LTS or above on Unity Catalog clusters in Shared access mode, so of your code has spark.catalog.tableExists("table") or spark.catalog.listDatabases(), you need to ensure that your cluster is running the correct runtime version and data security mode.

changed-result-format-in-uc

Calls to these functions would return a list of <catalog>.<database>.<table> instead of <database>.<table>. So if you have code like this:

for table in spark.catalog.listTables():
do_stuff_with_table(table)

you need to make sure that do_stuff_with_table can handle the new format.

direct-filesystem-access-in-sql-query

Direct filesystem access is deprecated in Unity Catalog. DBFS is no longer supported, so if you have code like this:

df = spark.sql("SELECT * FROM parquet.`/mnt/foo/path/to/parquet.file`")

you need to change it to use UC tables.

direct-filesystem-access

Direct filesystem access is deprecated in Unity Catalog. DBFS is no longer supported, so if you have code like this:

display(spark.read.csv('/mnt/things/data.csv'))

or this:

display(spark.read.csv('s3://bucket/folder/data.csv'))

You need to change it to use UC tables or UC volumes.

dependency-not-found

This message indicates that the linter has found a dependency, like Python source file or a notebook, that is not available in the workspace. The user must ensure that the dependency is available in the workspace. This usually means an error in the user code.

jvm-access-in-shared-clusters

You cannot access Spark Driver JVM on Unity Catalog clusters in Shared Access mode. If you have code like this:

spark._jspark._jvm.com.my.custom.Name()

or like this:

log4jLogger = sc._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger(__name__)

you need to change it to use Python equivalents.

legacy-context-in-shared-clusters

SparkContext (sc) is not supported on Unity Catalog clusters in Shared access mode. Rewrite it using SparkSession (spark). Example code that triggers this message:

df = spark.createDataFrame(sc.emptyRDD(), schema)

or this:

sc.parallelize([1, 2, 3])

not-supported

Installing eggs is no longer supported on Databricks 14.0 or higher.

notebook-run-cannot-compute-value

Path for dbutils.notebook.run cannot be computed and requires adjusting the notebook path. It is not clear for automated code analysis where the notebook is located, so you need to simplify the code like:

b = some_function()
dbutils.notebook.run(b)

to something like this:

a = "./leaf1.py"
dbutils.notebook.run(a)

python-parse-error

This is a generic message indicating that the Python code could not be parsed. The user must manually check the Python code.

python-udf-in-shared-clusters

applyInPandas requires DBR 14.3 LTS or above on Unity Catalog clusters in Shared access mode. Example:

df.groupby("id").applyInPandas(subtract_mean, schema="id long, v double").show()

Arrow UDFs require DBR 14.3 LTS or above on Unity Catalog clusters in Shared access mode.

@udf(returnType='int', useArrow=True)
def arrow_slen(s):
return len(s)

It is not possible to register Java UDF from Python code on Unity Catalog clusters in Shared access mode. Use a %scala cell to register the Scala UDF using spark.udf.register. Example code that triggers this message:

spark.udf.registerJavaFunction("func", "org.example.func", IntegerType())

rdd-in-shared-clusters

RDD APIs are not supported on Unity Catalog clusters in Shared access mode. Use mapInArrow() or Pandas UDFs instead.

df.rdd.mapPartitions(myUdf)

spark-logging-in-shared-clusters

Cannot set Spark log level directly from code on Unity Catalog clusters in Shared access mode. Remove the call and set the cluster spark conf spark.log.level instead:

sc.setLogLevel("INFO")
setLogLevel("WARN")

Another example could be:

log4jLogger = sc._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger(__name__)

or

sc._jvm.org.apache.log4j.LogManager.getLogger(__name__).info("test")

sql-parse-error

This is a generic message indicating that the SQL query could not be parsed. The user must manually check the SQL query.

sys-path-cannot-compute-value

Path for sys.path.append cannot be computed and requires adjusting the path. It is not clear for automated code analysis where the path is located.

table-migrated-to-uc-{sql,python,python-sql}

This message indicates that the linter has found a table that has been migrated to Unity Catalog. The user must ensure that the table is available in Unity Catalog.

PostfixExplanation
sqlTable reference in SparkSQL
pythonTable reference in PySpark
python-sqlTable reference in SparkSQL called from PySpark

to-json-in-shared-clusters

toJson() is not available on Unity Catalog clusters in Shared access mode. Use toSafeJson() on DBR 13.3 LTS or above to get a subset of command context information. Example code that triggers this message:

dbutils.notebook.entry_point.getDbutils().notebook().getContext().toSafeJson()

unsupported-magic-line

This message indicates the code that could not be analysed by UCX. User must check the code manually.