
How to load databricks package dbutils in pyspark

I was trying to run the code below in pyspark.

dbutils.widgets.text('config', '', 'config')

It threw the following error:

 Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 NameError: name 'dbutils' is not defined

So, is there any way I can run it in pyspark by including the Databricks package, like an import?

Your help is appreciated.

asked Aug 16 '18 by Babu
2 Answers

As explained in https://docs.azuredatabricks.net/user-guide/dev-tools/db-connect.html#access-dbutils, depending on whether you are executing your code directly on the Databricks cluster (e.g. using a Databricks notebook to invoke your project's egg file) or from your IDE via databricks-connect, you should initialize dbutils as below (where spark is your SparkSession):

def get_dbutils(spark):
    try:
        # With databricks-connect, pyspark.dbutils is importable and DBUtils
        # can be constructed from the SparkSession
        from pyspark.dbutils import DBUtils
        dbutils = DBUtils(spark)
    except ImportError:
        # In a Databricks notebook, fall back to the dbutils object that
        # Databricks injects into the IPython user namespace
        import IPython
        dbutils = IPython.get_ipython().user_ns["dbutils"]
    return dbutils

dbutils = get_dbutils(spark)
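
With the dbutils obtained above, the call from the question should then work. A minimal sketch (note: over databricks-connect only a subset of dbutils, such as fs and secrets, is documented as available, so the widgets call may still need to run in a notebook):

dbutils.widgets.text('config', '', 'config')   # the call from the question
config_value = dbutils.widgets.get('config')   # read the widget's current value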
answered Oct 15 '22 by pprasad009

As of Databricks runtime v3.0, the answer provided by pprasad009 above no longer works. Use the following instead:

def get_db_utils(spark):
    dbutils = None
    if spark.conf.get("spark.databricks.service.client.enabled") == "true":
        # Running via databricks-connect: build DBUtils from the SparkSession
        from pyspark.dbutils import DBUtils
        dbutils = DBUtils(spark)
    else:
        # Running on the cluster (e.g. in a notebook): reuse the injected dbutils
        import IPython
        dbutils = IPython.get_ipython().user_ns["dbutils"]
    return dbutils

See: https://docs.microsoft.com/en-gb/azure/databricks/dev-tools/databricks-connect#access-dbutils
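
A minimal usage sketch, assuming a SparkSession can be obtained in the current environment (e.g. via databricks-connect or on the cluster itself):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()   # reuses the existing session on Databricks
dbutils = get_db_utils(spark)
print(dbutils.fs.ls("/"))                    # e.g. list the DBFS root to verify access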

answered Oct 15 '22 by Chris