
Databricks: difference between mount and direct access of Data Lake Storage Gen2


What is the difference between mounting an Azure Data Lake Storage Gen2 account on Databricks using a service principal and accessing it directly using a SAS key?

I want to know the difference in terms of data transfer and access security.

Thanks

I.Chorfi asked May 15 '19 10:05


1 Answer

If you mount the storage, all users on all clusters in the workspace get access to it through the mount point.
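As a rough sketch of what a service principal mount looks like, the helper below builds the OAuth settings and passes them to `dbutils.fs.mount`. The function names, the container and account names, the `/mnt/lake` mount point, and the secret scope in the usage comment are placeholders I made up for illustration; only the `fs.azure.*` keys and the `dbutils` calls are the real ones.

```python
def oauth_mount_configs(client_id, client_secret, tenant_id):
    """Build the ABFS OAuth settings for a service principal (hypothetical helper)."""
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


def mount_lake(dbutils, container, account, mount_point, configs):
    """Mount an ADLS Gen2 container; the mount is visible workspace-wide."""
    source = f"abfss://{container}@{account}.dfs.core.windows.net/"
    dbutils.fs.mount(source=source, mount_point=mount_point, extra_configs=configs)
    return source


# In a notebook (all names below are placeholders, not from the question):
# secret = dbutils.secrets.get(scope="my-scope", key="sp-secret")
# mount_lake(dbutils, "mycontainer", "myaccount", "/mnt/lake",
#            oauth_mount_configs("<application-id>", secret, "<tenant-id>"))
```

Once this has run, anyone on any cluster can read `/mnt/lake/...` without ever seeing the secret, which is exactly why a mount is the less restrictive option.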

If you do not mount and instead connect directly in the session, using either a service principal or a SAS (I don't think a SAS key is officially supported, BTW), the user in that session must have access to the credentials to create the connection.
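For comparison, here is a minimal sketch of the direct (session-scoped) route with a service principal. The helper names and the account, container, and path values are assumptions for illustration; the `fs.azure.*` keys suffixed with the storage account host and `spark.conf.set` are the real pieces.

```python
def session_oauth_settings(account, client_id, client_secret, tenant_id):
    """Build per-account ABFS OAuth settings for one Spark session (hypothetical helper)."""
    host = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": client_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


def apply_to_session(spark, settings):
    """Set each key on the current Spark session only; other sessions are unaffected."""
    for key, value in settings.items():
        spark.conf.set(key, value)


# In a notebook (placeholder names):
# apply_to_session(spark, session_oauth_settings(
#     "myaccount", "<application-id>", secret, "<tenant-id>"))
# df = spark.read.csv("abfss://mycontainer@myaccount.dfs.core.windows.net/some/path")
```

Nothing is registered globally here, so a user who cannot read the secret cannot reach the lake.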

Service principals can also have low-level permissions applied within the lake, such as restricting access to certain folders.

Note that with ADLS Gen2 you now also have the option of passing through the user credentials: https://docs.azuredatabricks.net/spark/latest/data-sources/azure/adls-passthrough.html

I do not know of any performance differences.

simon_dmorias answered Nov 14 '22 13:11