I'm getting increasingly frustrated with the limitations and verbosity required to actually commit some business logic to stored procedures, using languages such as Transact-SQL or PL/SQL. I would love to convert some current databases to Oracle and take advantage of its support for Java stored procedures, but that option is not available at the moment.
What alternatives would you recommend in the way of databases that support stored procedures in other languages?
You can reduce the use of stored procedure by using parameterized queries or Object Relational Mapping, but still the your code stays on the server side and multiple database calls could be required.
Python and SQL are a match made in heaven, so we've given them the honor of first place on our list. Python is one of the world's most popular scripting languages, one of the easiest to learn, and one of the best for data analysis and visualization.
Stored procedures have been falling out of favour with some organizations for several years now. The preferred approach of these businesses for accessing their database(s) is to employ an Object-relational Mapper (ORM) such as NHibernate or Entity Framework.
Stored procedures are faster as compared to LINQ query since they have a predictable execution plan and can take the full advantage of SQL features. Hence, when a stored procedure is being executed next time, the database used the cached execution plan to execute that stored procedure.
There are some architectural obstacles to having more clever query languages in a database manager. The principal one is the query optimiser. One of the design constraints on SQL is that it can only use constructs that are accessible to the query optimiser. This means that the language and its capabilities are quite tightly coupled to the capabilities of the query execution engine and query plan optimiser.
The other major design constraint is the mechanical nature of the database system - database programming is almost unique in that it has a mechanical component. Query performance is limited by the mechanical constraints of disk head seeks and rotational latency (the wait time before the data you want arrives under the heads).
This effectively precludes many clever abstractions that might make SQL more powerful or easier to work with. Many database management systems supplement SQL with procedural alternatives that can be used for scripting. However, they interact with the DBMS by executing a series of SQL queries that are processed by the optimiser individually. Some languages of this type that are shipped with various DBMS platforms are:
Oracle's PL/SQL and embedded Java. PL/SQL is actually based on Ada - it is quite 'old school' by modern standards and has a legacy code base with which it must remain backwardly compatible. It's not necessarily the most pleasant programming environment but it does have constructs for facilities such as parallelism and a reasonably flexible type system. One of the major criticisms of Java stored procedures on Oracle is that you are paying for Oracle's capacity based licensing on the CPU you are running the JVM's on.
SQL Server CLR Integration. Somewhat similar to Oracle's Java Stored Procedures, this allows CLR modules compiled from C# (or any .net language) to be loaded into a SQL Server instance and executed in much the same way as stored procedures. SQL Server also has PostgreSQL-style API's for making custom aggregate functions through CLR integration and other hooks for mixed SQL/CLR code bases.
PostgreSQL is actually the
system where back-end language
integration was originally
developed. The system exports a native C API with facilities for
custom aggregate functions, storage
engines, procedural extensions and other
functionality.
The language interfaces
are based on this API and include:
PL/pgSQL (a bespoke language similar
to PL/SQL), Python, Perl
and Tcl.
This made it into the
mainstream through Illustra, a
commercialised version of Postgres,
which was then bought out by
Informix (which was subsequently
bought out by IBM). The key
features were incorporated into
Informix On-Line, which is
still sold by IBM.
One key limitation of these languages is their limited interaction with the query optimiser (although the C API for PostgreSQL does have support for this). Participation in a query plan as first-class citizen requires that the query optimiser can work out a sensible view of the resources your action will take. In practice, this type of interaction with the query optimiser is mainly useful for implementing storage engines.
This level of digging into the storage engine is (a) somewhat esoteric if the functionality is available at all (so most people won't have the skill to do this) and (b) probably considrably more trouble than just writing the query in SQL. The limitations of the query optimiser mean that you will probably never get the level of abstration out of SQL that you might get from (say) Python or even C# or Java.
The path of least resistance for efficient queries is likely to be writing the query in SQL with some procedural glue in one of the other languages. In some cases a computation really does lend itself to a procedural approach.
This can become a hassle and lead to large bodies of boilerplate SQL code. The only real options for this are hand coded SQL or code generation systems. A trivial example of code generation is the CRUD functionality provided by frameworks where this SQL is generated from metadata. A more complex example can be seen in ETL tools such as Oracle Warehouse Builder or Wherescape Red which work by generating great screeds of stored procedure code from the model.
I find myself building code generation systems of one sort or another on a semi-regular basis for precisely this reason. Any templating system will do for this - I've had fairly good mileage from CherryTemplate but there are many such items around. Code Generation in Action is quite a good book on this subject - the author uses a ruby-based system whose name escapes me.
Edit: If you look at a 'Show Estimated Execution Plan' for a block of procedural code you will notice that each statement has its own query plan. The query optimisation algorithm can only work on a single SQL statement, so a procedure will have a forest of query plans. Because procedural code can have 'side-effects' you cannot use the type of algorithms used in query optimisation to reason about the code. This means that a query optimiser cannot globally optimise a block of procedural code. It can only optimise individual SQL statements.
PostgreSQL has support for many scripting languages procedures. Officially Perl, Python, and Tcl. As addons, PHP, Ruby, Java and probably many others (just Google for pl<languagename>) which may or may not be in working condition as of now.
Oh, and also SQL Server 2005 onwards has support for CLR stored procedures, where you can use .NET languages.
Oracle, HSQLDB and Derby allow to write stored procedures in Java.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With