I was trying to run the code below in PySpark:
dbutils.widgets.text('config', '', 'config')
It threw the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'dbutils' is not defined
So, is there any way I can run it in PySpark by including the Databricks package, e.g. via an import?
Your help is appreciated.
As explained in https://docs.azuredatabricks.net/user-guide/dev-tools/db-connect.html#access-dbutils, the initialization depends on where you are executing your code: directly on a Databricks cluster (e.g. using a Databricks notebook to invoke your project egg file), or from your IDE via databricks-connect. Either way, you can initialize dbutils as below (where spark is your SparkSession):
def get_dbutils(spark):
    try:
        # Available when running via databricks-connect
        from pyspark.dbutils import DBUtils
        dbutils = DBUtils(spark)
    except ImportError:
        # Running inside a Databricks notebook: dbutils is already
        # injected into the IPython user namespace
        import IPython
        dbutils = IPython.get_ipython().user_ns["dbutils"]
    return dbutils

dbutils = get_dbutils(spark)
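The try/except ImportError dispatch above degrades gracefully wherever it runs. The same environment check can be sketched standalone, without needing a cluster (the helper name here is illustrative, not a Databricks API):

```python
def detect_pyspark_dbutils():
    """Return True if pyspark.dbutils is importable (i.e. databricks-connect
    or a Databricks runtime is installed), False otherwise."""
    try:
        from pyspark.dbutils import DBUtils  # noqa: F401
        return True
    except ImportError:
        return False

print(detect_pyspark_dbutils())
```

On a plain local machine without databricks-connect installed this prints False, which is exactly why the except branch in get_dbutils falls back to the notebook-injected dbutils.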
As of Databricks Runtime v3.0, the answer provided by pprasad009 above no longer works. Now use the following:
def get_db_utils(spark):
    dbutils = None
    # The flag is only set when connecting via databricks-connect;
    # pass a default of "false" so the lookup does not raise when it is unset.
    if spark.conf.get("spark.databricks.service.client.enabled", "false") == "true":
        from pyspark.dbutils import DBUtils
        dbutils = DBUtils(spark)
    else:
        # Inside a Databricks notebook, dbutils is already in the
        # IPython user namespace
        import IPython
        dbutils = IPython.get_ipython().user_ns["dbutils"]
    return dbutils
See: https://docs.microsoft.com/en-gb/azure/databricks/dev-tools/databricks-connect#access-dbutils
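The config-flag dispatch in get_db_utils can be exercised locally without a cluster by stubbing the conf lookup. A minimal sketch (FakeConf and FakeSpark are made-up stand-ins for testing, not Databricks APIs):

```python
class FakeConf:
    """Stand-in for spark.conf offering a get(key, default) lookup."""
    def __init__(self, settings):
        self._settings = settings

    def get(self, key, default=None):
        return self._settings.get(key, default)

class FakeSpark:
    """Stand-in for a SparkSession exposing only .conf."""
    def __init__(self, settings):
        self.conf = FakeConf(settings)

def is_databricks_connect(spark):
    # Same check used by get_db_utils: databricks-connect sets this flag.
    return spark.conf.get("spark.databricks.service.client.enabled", "false") == "true"

# Flag present -> databricks-connect branch would run
print(is_databricks_connect(FakeSpark({"spark.databricks.service.client.enabled": "true"})))  # True
# Flag absent -> notebook branch would run
print(is_databricks_connect(FakeSpark({})))  # False
```

The default argument to get is the key detail: on a real notebook cluster the flag may simply be unset, and without a default spark.conf.get raises instead of falling through to the else branch.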