
Spark SQL vs Databricks SQL

I recently started working with Spark and would like to know: if I need to run queries, which is the better choice, Spark SQL or Databricks SQL, and why?

khÜs h asked Mar 30 '26 03:03



1 Answer

We need to distinguish two things here:

  • Spark SQL as a dialect of the SQL language. It originated in the Shark and Hive on Spark projects (blog) and is now moving closer to ANSI SQL.
  • Spark SQL as the execution engine inside Spark.

As mentioned in this answer, Databricks SQL as a language is primarily based on Spark SQL, with some additions specific to Delta Lake tables (like CREATE TABLE CLONE, ...). ANSI compatibility in Databricks SQL is controlled with the ANSI_MODE setting and will be enabled by default in the future.
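To make the ANSI point concrete, here is a minimal sketch of the statements that toggle ANSI behaviour in each environment. The helper name `ansi_mode_statement` and the target labels are hypothetical, chosen only for illustration. `ANSI_MODE` is the Databricks SQL setting named above, and `spark.sql.ansi.enabled` is, as far as I know, the corresponding open-source Spark configuration key.

```python
def ansi_mode_statement(target: str, enabled: bool) -> str:
    """Build a SET statement that switches ANSI behaviour on or off.

    target is a hypothetical label used only in this sketch:
      "databricks_sql" -> uses the ANSI_MODE setting
      "spark"          -> uses the spark.sql.ansi.enabled config key
    """
    flag = "true" if enabled else "false"
    if target == "databricks_sql":
        return f"SET ANSI_MODE = {flag}"
    if target == "spark":
        return f"SET spark.sql.ansi.enabled = {flag}"
    raise ValueError(f"unknown target: {target!r}")


# A probe query whose result depends on the mode: with ANSI enabled an
# invalid cast raises an error, otherwise it yields NULL.
PROBE_QUERY = "SELECT CAST('abc' AS INT) AS value"

if __name__ == "__main__":
    for target in ("databricks_sql", "spark"):
        print(ansi_mode_statement(target, True))
    print(PROBE_QUERY)
```

On a live cluster you would pass these strings to your session, for example `spark.sql(ansi_mode_statement("spark", True))`, assuming a `SparkSession` named `spark` is available, and then run the probe query to see which behaviour is active.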

But when it comes to execution, Databricks SQL differs from the Spark SQL engine because it uses the Photon engine, which is heavily optimized for modern hardware and BI/DW workloads. With Photon you can get a significant speedup (2-3x) over the standard Spark SQL engine on complex queries that process a lot of data.

Alex Ott answered Apr 02 '26 07:04



