Python Version in Azure Databricks

I am trying to find out which Python version I am using in Databricks.

To find out, I tried:

import sys
print(sys.version)

and the output was 3.7.3.

However, when I go to Cluster --> SparkUI --> Environment, I see that the cluster Python version is 2.

Which version does this refer to?

When I tried running

%sh python --version

I still got Python 3.7.3.

Can the driver node and each worker node have different Python versions?

Note: My setup has 1 worker node and 1 driver node (2 nodes in total, with the same spec), and the Databricks Runtime Version is 6.5 ML.

learner asked Jan 23 '26 13:01



2 Answers

This works in all notebooks, whether Google Colab or MS Azure Databricks:

!python --version
Mario answered Jan 26 '26 02:01



Update: This issue has been fixed.

For new clusters: If you create a new cluster, its Python environment variable will point to Python 3.

For existing clusters: Add the following line in the Environment Variables tab under Cluster Configuration > Advanced. This updates the environment variable.

PYSPARK_PYTHON=/databricks/python3/bin/python3
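To confirm that the setting took effect after the cluster restarts, something like the following could be run in a notebook cell. This is only a sketch: the helper names `describe_python` and `interpreter_version` are made up for illustration, it inspects the driver process only, and it assumes that cluster environment variables are visible to the notebook's Python process.

```python
import os
import subprocess
import sys


def describe_python():
    """Collect interpreter details for the process running this cell."""
    return {
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "executable": sys.executable,
        # None if the variable is not set in this process (assumption:
        # cluster environment variables are passed through to the notebook).
        "pyspark_python": os.environ.get("PYSPARK_PYTHON"),
    }


def interpreter_version(path):
    """Ask the interpreter at `path` for its version string.

    Python 2 prints the version to stderr, so stderr is merged into stdout.
    """
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout.strip()


info = describe_python()
print(info)

# Only query the configured interpreter if the variable is set and the path exists.
configured = info["pyspark_python"]
if configured and os.path.exists(configured):
    print(configured, "->", interpreter_version(configured))
```

If the printed `pyspark_python` value is missing or still points at a Python 2 path, the environment variable was probably not picked up by the cluster.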



Thanks for bringing this to our attention. This is a product bug, and I'm currently working with the product team to fix the issue as soon as possible.

The default Python version for clusters created using the UI is Python 3.

To reproduce the issue, I created a cluster with Databricks Runtime Version 6.5 ML and observed the same behaviour.

Cluster --> SparkUI --> Environment shows an incorrect version.
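Since the Spark UI value is not reliable here, one way to see what the workers actually run is to ask the Spark tasks themselves. The following is a minimal sketch, not an official Databricks method: the function name `worker_python_versions` and the `num_tasks` parameter are made up for illustration, and `sc` is assumed to be the SparkContext that Databricks notebooks predefine.

```python
import sys


def _python_version(_):
    # Executed inside a Python worker process on an executor.
    import sys
    return "{}.{}.{}".format(*sys.version_info[:3])


def worker_python_versions(sc, num_tasks=16):
    """Return the sorted, distinct Python versions reported by Spark tasks.

    `sc` is a SparkContext. Several small tasks are used so that the work
    is likely (not guaranteed) to be spread across all worker nodes.
    """
    rdd = sc.parallelize(range(num_tasks), num_tasks)
    return sorted(set(rdd.map(_python_version).collect()))


def driver_python_version():
    """Return the Python version of the driver process running this cell."""
    return "{}.{}.{}".format(*sys.version_info[:3])
```

In a notebook this could be called as `worker_python_versions(sc)` and compared with `driver_python_version()`. As far as I know, PySpark refuses to run tasks when the worker's major.minor version differs from the driver's, so a real mismatch would more likely show up as an error than as two different values.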


CHEEKATLAPRADEEP-MSFT answered Jan 26 '26 02:01



