In my experience, even though there is a SQL standard, it is quite difficult to write SQL that works, unmodified, over a large number of RDBMS.
Thus, I'd like to know if there is a subset of SQL (including DDL, schemas, etc) that is known to work on all major RDBMS, including PostgreSQL, MySQL, SQL Server and, last but not least, Oracle. What kind of pitfalls should be avoided when writing portable SQL?
By the way, is there a project whose goal is to translate a valid subset of SQL into the specific dialects used by all these vendors? I know that Hibernate and other ORM systems have to do this, but I don't want ORM, I want to write straight-to-database SQL.
Thanks!
The problem is that some DBMS ignore even the simplest parts of the standard (e.g. quoting characters or string concatenation).
So the following (100% ANSI SQL) does not run on every DBMS:
UPDATE some_table
SET some_column = some_column || '_more_data';
And I'm not even thinking about more advanced parts of the SQL standard like recursive common table expressions (even the DBMS that support them don't always comply) or window functions (some implement only a very narrow subset, others do not support all options).
Regarding DDL, there is the problem with data types. DATE is not the same everywhere, and neither is TIMESTAMP. Not every DBMS has a BOOLEAN type or a TIME type.
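To make the BOOLEAN point concrete, here is a minimal Python sketch of what you end up doing if one schema has to target several DBMS. The mapping table and the function name are hypothetical, and the chosen types are only common stand-ins (an assumption on my part, especially the Oracle entry), not the only valid choices.

```python
# Hypothetical mapping from a logical "boolean" column to a concrete type.
# These are common stand-ins, not an authoritative list.
BOOLEAN_TYPE = {
    "postgresql": "BOOLEAN",
    "mysql": "TINYINT(1)",   # MySQL accepts BOOLEAN, but only as an alias for TINYINT(1)
    "sqlserver": "BIT",
    "oracle": "NUMBER(1)",   # assumption: stand-in for versions without a SQL BOOLEAN column type
}


def boolean_column(dialect: str, name: str) -> str:
    """Return a column definition fragment for a boolean flag."""
    if dialect not in BOOLEAN_TYPE:
        raise ValueError(f"unsupported dialect: {dialect}")
    return f"{name} {BOOLEAN_TYPE[dialect]} NOT NULL"


if __name__ == "__main__":
    for dialect in BOOLEAN_TYPE:
        print(dialect, "->", boolean_column(dialect, "is_active"))
```

Even this tiny case needs a lookup per DBMS, which is why fully portable DDL is hard to get.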
When it comes to constraints or domains you get even more differences.
So in a nutshell: unless you really, really need to be DBMS independent, don't bother with it.
Having said all that: if you do have the choice between a proprietary and a standard syntax, do choose the standard syntax (OUTER JOIN vs (+) or *=, CASE vs decode, coalesce vs nvl, and so on).
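As a small illustration of the standard forms, the following sketch runs COALESCE and CASE against an in-memory SQLite database. SQLite is used only because it ships with Python; it is not one of the DBMS from the question, and the table t and its columns are made up for the example.

```python
import sqlite3

# In-memory database, used only to have something that executes the query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, nickname VARCHAR(20), status CHAR(1))"
)
conn.executemany(
    "INSERT INTO t (id, nickname, status) VALUES (?, ?, ?)",
    [(1, "bob", "A"), (2, None, "X")],
)

# COALESCE instead of nvl, CASE instead of decode.
rows = conn.execute(
    """
    SELECT id,
           COALESCE(nickname, 'n/a') AS nickname,
           CASE status WHEN 'A' THEN 'active' ELSE 'unknown' END AS status_label
    FROM t
    ORDER BY id
    """
).fetchall()

print(rows)  # [(1, 'bob', 'active'), (2, 'n/a', 'unknown')]
conn.close()
```

The SELECT itself uses nothing vendor-specific, so the same statement should be accepted by the DBMS mentioned in the question as well, although I have only run it here against SQLite.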
Within each RDBMS, whatever is documented as ANSI-compliant should behave the same across all of them, as that is the true standard. However, by sticking with only ANSI (i.e. portable) features, you lose out on the optimized, vendor-specific functionality. Also, just because PostgreSQL implements an ANSI function doesn't mean that it is available in any other RDBMS (but if it is available, it should work the same).
Personally, I see no value in truly portable SQL code, or in a project that normalizes down to a lowest-common-denominator set, as each RDBMS is optimized differently. There is no common application language either: if you are using C#, you wouldn't want to use features that can only be found in PHP or Java. So just embrace the platform you are on :).
Edit: If you are writing an application that can connect to several different RDBMSs, then you will likely need to find the appropriate SQL for each particular platform, just like the authors of each of the ORMs had to do.
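A rough sketch of what "the appropriate SQL for each platform" can look like without an ORM, using string concatenation as the example: the standard || operator works in PostgreSQL and Oracle, MySQL treats || as logical OR in its default mode and offers CONCAT() instead, and SQL Server uses +. The function names and dialect keys below are hypothetical, and NULL handling still differs between these forms, so treat it as a starting point only.

```python
def concat(dialect: str, *parts: str) -> str:
    """Render a string-concatenation expression for the given dialect.

    `parts` are already-valid SQL fragments (column names or quoted literals).
    """
    if len(parts) < 2:
        raise ValueError("need at least two parts to concatenate")
    if dialect in ("postgresql", "oracle"):
        return " || ".join(parts)                   # standard operator
    if dialect == "mysql":
        return "CONCAT(" + ", ".join(parts) + ")"   # || is logical OR by default
    if dialect == "sqlserver":
        return " + ".join(parts)
    raise ValueError(f"unsupported dialect: {dialect}")


def append_suffix_update(dialect: str) -> str:
    """Build an UPDATE that appends '_more_data' to some_column."""
    expr = concat(dialect, "some_column", "'_more_data'")
    return f"UPDATE some_table SET some_column = {expr}"


if __name__ == "__main__":
    for dialect in ("postgresql", "oracle", "mysql", "sqlserver"):
        print(f"{dialect}: {append_suffix_update(dialect)}")
```

Keeping such differences behind a few small functions at least confines the vendor-specific parts to one place.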
Simple queries are almost always portable. Unfortunately, the vendors that you listed vary greatly in their standards compliance. MS SQL Server is at the top of the ones that you listed in terms of complying with ANSI SQL standards, and both MySQL and Oracle are notoriously bad when it comes to standards compliance. That, of course, is not to say that they're bad RDBMS engines or that you can't write powerful queries with them, but adherence to standards is not exactly what they're known for.
Note that you've omitted some big RDBMS players in that list, namely Sybase and IBM's DB2. Those two are generally more standards-compliant than the others, for what that's worth.