Definitions of ABFS[S] and WASB[S] are easy to find, but there is no clear guidance on when to use which. What are the most appropriate use cases for each?
ABFS stands for Azure Blob File System. Microsoft recommends it for big data workloads because it is optimized for them, as mentioned here.
WASB stands for Windows Azure Storage Blob. The trailing S in WASBS means access is TLS encrypted, which is why Microsoft recommends WASBS over plain WASB, as mentioned here.
The differences and use cases are as follows:
ABFS[S] is used for Azure Data Lake Storage Gen2, which is built on a normal Azure storage account (when creating the storage account, enable Hierarchical namespace and you get an Azure Data Lake Storage Gen2 account). An example is here.
WASB[S] is used for a normal Azure storage account, that is, one without Hierarchical namespace. An example is here.
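To make the distinction concrete, here is a minimal sketch of how the two URI forms differ. The function name `storage_uri` and the sample container and account names are made up for illustration; only the URI layout (`abfs[s]` with the `dfs` endpoint, `wasb[s]` with the `blob` endpoint) reflects the drivers themselves.

```python
def storage_uri(container, account, path="",
                hierarchical_namespace=True, secure=True):
    """Build a Hadoop-style URI for an Azure storage container.

    Hypothetical helper: ABFS[S] targets the `dfs` endpoint (ADLS Gen2),
    WASB[S] targets the `blob` endpoint (normal blob storage).
    """
    if hierarchical_namespace:
        scheme, endpoint = "abfs", "dfs"
    else:
        scheme, endpoint = "wasb", "blob"
    if secure:
        scheme += "s"  # trailing "s" selects the TLS variant
    host = f"{account}.{endpoint}.core.windows.net"
    return f"{scheme}://{container}@{host}/{path.lstrip('/')}"


# Sample names below are placeholders, not real resources.
print(storage_uri("data", "myaccount", "raw/events.csv"))
print(storage_uri("data", "myaccount", "raw/events.csv",
                  hierarchical_namespace=False))
```

The resulting string is what you would pass to a Hadoop-compatible reader, for example as the path argument of a Spark read call.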
ADLS Gen2 supports both ABFSS and WASBS. The key difference is this:
WASBS goes through the classic Blob storage API, whereas ABFSS is a Hadoop-compatible driver that is optimized for analytics workloads. Solutions like Hortonworks, HDInsight, and Azure Databricks can connect very easily using the ABFSS driver.
A Python application that just reads and writes files may use WASBS, while big data tools like Databricks, Dremio, and HDInsight can use ABFSS.
You will also notice that some tools, such as Power BI, support both WASBS and ABFSS.
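If you receive a path and want to know which driver it targets, the scheme and host tell you. A rough sketch using only the standard library follows; `describe_uri` and the `DRIVERS` table are hypothetical names, and the sample URIs are placeholders.

```python
from urllib.parse import urlparse

# Scheme -> (driver family, TLS enabled). Hypothetical lookup table.
DRIVERS = {
    "abfs": ("ABFS", False),
    "abfss": ("ABFS", True),
    "wasb": ("WASB", False),
    "wasbs": ("WASB", True),
}


def describe_uri(uri):
    """Split an abfs[s]/wasb[s] URI into driver, TLS flag, and location."""
    parts = urlparse(uri)
    if parts.scheme not in DRIVERS:
        raise ValueError(f"not an ABFS/WASB URI: {uri!r}")
    driver, tls = DRIVERS[parts.scheme]
    return {
        "driver": driver,
        "tls": tls,
        # The container sits before "@", the account is the first host label.
        "container": parts.username,
        "account": parts.hostname.split(".")[0],
        "path": parts.path.lstrip("/"),
    }


print(describe_uri("abfss://data@myaccount.dfs.core.windows.net/raw/events.csv"))
print(describe_uri("wasbs://data@myaccount.blob.core.windows.net/raw/events.csv"))
```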