I recently started working with spark and was eager to know if I have to perform queries which would be better spark sql or databricks sql and why?
We need to distinguish two things here:
As was mentioned in this answer, Databricks SQL as language is primarily based on Spark SQL with some additions specific to Delta Lake tables (like CREATE TABLE CLONE, ...). ANSI compatibility in Databricks SQL is controlled with ANSI_MODE setting, and will be enabled by default in the future.
But when it comes to the execution, Databricks SQL is different from Spark SQL engine because it uses Photon engine heavily optimized for modern hardware and BI/DW workloads. With Photon you can get significant speedup (2-3x) compared to standard Spark SQL engine on the complex queries that process a lot of data.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With