I am trying to find out which Python version I am using in Databricks.
To find out, I ran:
import sys
print(sys.version)
and got the output 3.7.3.
However, when I went to Cluster --> Spark UI --> Environment,
I saw that the cluster Python version is 2.
Which version does this refer to?
When I tried running
%sh python --version
I still get Python 3.7.3.
Can there be a different Python version on each worker / driver node?
Note: I am using a setup with 1 worker node and 1 driver node (2 nodes in total, with the same spec), and the Databricks Runtime version is 6.5 ML.
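To check whether the workers report something different from the driver, a small sketch like the following could be run from a notebook. It uses only the standard PySpark API; on Databricks a SparkContext already exists, so getOrCreate() simply returns it.

import sys
from pyspark.sql import SparkSession

sc = SparkSession.builder.getOrCreate().sparkContext

# Python version on the driver
print("driver :", sys.version.split()[0])

# Python version on the workers: run a tiny job and collect what each task sees
def task_python_version(_):
    import sys
    return ".".join(str(p) for p in sys.version_info[:3])

print("workers:", set(sc.parallelize(range(4), 4).map(task_python_version).collect()))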
This works in all notebooks, whether Google Colab or MS Azure Databricks:
!python --version
Update: This issue has been fixed.
For new clusters: if you create a new cluster, its Python environment variable will point to Python 3.
For existing clusters: you need to add the following environment variable under Cluster Configuration > Advanced Options > Environment Variables:
PYSPARK_PYTHON=/databricks/python3/bin/python3
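After restarting the cluster, a quick way to confirm the variable took effect is to read it back from a notebook (a minimal check using only the standard library):

import os

# Expect /databricks/python3/bin/python3 once the cluster has restarted
print(os.environ.get("PYSPARK_PYTHON"))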

Thanks for bringing this to our attention. This is a product bug; I'm currently working with the product team to fix the issue as soon as possible.
The default Python version for clusters created using the UI is Python 3.
As part of the repro, I created a cluster with Databricks Runtime version 6.5 ML and observed the same behaviour.
Cluster --> Spark UI --> Environment shows the incorrect version.
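In the meantime, the Python version PySpark was actually launched with can be read directly from the SparkContext, independent of the UI. A minimal sketch, assuming only the standard PySpark API (pythonVer is an existing SparkContext attribute holding the major.minor version):

import sys
from pyspark import SparkContext

sc = SparkContext.getOrCreate()

# What PySpark was started with vs. what the notebook kernel reports
print("PySpark pythonVer:", sc.pythonVer)            # e.g. 3.7
print("Notebook Python  :", sys.version.split()[0])  # e.g. 3.7.3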

