With the recent prevelance of NoSQL databases why would I use a SQL database?

Tags:

After developing software for about 5 years now, I have spent probably atleast 20% and perhaps up to 40% of that time simply making a RDBMS able to save and retrieve complex object graphs. Many times this resulted in less than optimal coding solutions in order to make something easier to do from the database side. This eventually ended after a very significant amount of time spent in learning NHibernate and the session management patterns that are part of it. With NHibernate I was able to finally eschew the large majority of 100% wasted time of writing CRUD for the 1000th time and use forward generation of my database from my domain model.

Yet all of this work still results in a flawed model where my database is merely the best attempt by SQL to imitate my actual object. With document databases this is no longer the case as the object becomes the document itself instead of merely emulating the object through tables and columns.

At this point I'm really starting to question why would I ever need SQL again?

What can really be done substantially better with SQL than a document database?

I know this is somewhat of leading into a apples to oranges comparison especially when you factor in the various types of NoSQL databases having widely different feature-sets but for the sake of this argument base it on the notion of NoSQL databases can inherently query objects correctly and not on the limitations of a key value store. Also leave out the reporting aspect as that should generally be handled in a OLAP database unless your answer includes a specific reason you would not use a OLAP database for it.

901

asked Jul 20 '10 22:07

Chris Marisic

3 Answers

Relational data modeling is a formal, mathematical solution for representing complex data without redundancy and without allowing anomalies. You can design an optimal database design from the data relationships themselves. This is the process of relational database normalization.

Non-relational data modeling has no formal way to define the best database structure from the data. You can design a database based on your anticipated usage; that is, your queries determine the best data organization, not the data itself.

In non-relational databases, you can never be sure that data conforms a certain document structure. You could have documents left over in the database from an earlier revision. So your application code had better be able to "discover" the structure of each document, perform conversions if necessary, and hope that references between data collections are satisfied.

In relational databases, you can depend on data integrity being an integral part of the model. If you design for normalization and you set up constraints properly, you know you'll never have orphans or data anomalies.

Non-relational databases give you one type of efficiency, as you're designing the database. Relational databases give you another type of efficiency, as you're using the database.

That said, the specific type of problem you've been working with -- object graphs -- is tricky to accomplish efficiently with plain SQL. But I think you'll find it's not much easier with NoSQL databases.

Re your comment: Granted, consistency is not a priority for every app. That doesn't make the value of consistency "insubstantial" for the apps where it is important.

You asked about why you would use relational databases -- you'd use them when the benefits of relational databases fit the priorities of your project.

Don't drive a nail with a screwdriver, and don't turn a screw with a hammer. There's an appropriate tool to solve each type of problem.

answered Oct 20 '22 03:10

Bill Karwin

At Amazon I worked with a lot of code. Most of the code I worked one was code nobody really understood anymore. It was riddled with special case handling that wasn't well understood because it was an accretion of quick patches over a long period of time. If you wanted to fully understand the effect of a change you were making you were out-of-luck. In essence, you were forced to add to the accretion.

I also worked with a lot of data. The structure of the tables in SQL made excellent long-term documentation for the data. The database was relatively easy to work with directly, and the structure of the data made sense. There were people who's job it was to manage the structure and integrity of the data.

I fear that a NoSQL database, with its lack of well-documented structure, would slowly acquire all the evil qualities of the code I worked on. It would end up filled with data from old structures that nobody really understood anymore, and become a vast patchwork of mostly useless garbage.

I see the main benefits of SQL databases as the forced documentation that maintaining the database structure and consistency rules requires. Those benefits do not have an easy short-term measure like speed of a query or transactional consistency. They are long-term benefits that affect the usefulness of your data over an extended period of time.

As a second, related point, I find it more useful, when using ORMs and the like, to map out my data and then decide how that will translate into objects in the application I'm writing. The data and its relationships represent a long-term archival structure that may be used for a variety of purposes.

The structure of the object relationships in the application are there for the purposes of that application. A given set of data represented in SQL tables and relationship constraints will have many possible object models that represent it in an application, and each of those object models will reflect the goals of that particular application. But the data and its structure exist independently of any given ephemeral use that might be made of them.

I see the arguments people make about 'reporting' as being arguments that different applications can usefully view the same set of data in very different ways.

Personally, I think SQL is a good model to use directly for archival data, infrequently modified data, or data with extremely high consistency requirements. And I think that I will continue to use relational algebra to define the overall structure of my data even if I'm storing it in a NoSQL database. And I will not change the structure of the data in the NoSQL database without first modifying the relational structure describing it. This will allow me to map my NoSQL databases back to SQL so I can still use SQL for long-term storage and warehousing and force me to maintain the data structures in a well documented form.

Doing things this way will also assist me when I have to pull data out of the NoSQL database for use in applications that were not envisioned when the database was created.

Of course, there is some data who's structure naturally fits NoSQL and where generating a relational schema for it would be pointless. For example, storage of actual documents, storage of pictures or other media, or other large blobs of data that has no structure that might be useful to represent. This distinction is very tricky though. Pictures and movies do have structure to them, just not generally structure you need to store in a database. A blog post may have structure as well if you have a system designed to try to read and understand it, and that may well be structure you want to maintain a record of.

136

answered Oct 20 '22 05:10

Omnifarious

it depends on what you are trying to do. when you need to do searching on different fields of your objects then SQL is good. if you don't need to do searching and you have very complex polymorphic tree like structures then SQL is horrible.

i've worked on app that allowed users to build web pages by joining little fragments together and the original serialization used key/value SQL tables. all the fragments had properties which were stored (fragment, property, value). so schemaless but still a lot of heavy lifting. probably the worst of both worlds because you don't really get much data validation from the database, it is very difficult to look at the tables and understand what is going on and there is still a lot of work to write it to the db and read it back.

we've also done a similar app but we learnt our lesson and we just take plain java classes and encode them using JSON. the user just edits their page in the front in a rich ui. clicks save and the whole page is sent back to the server as a json object. the server then does validation on the object to make sure all the constraints are correct which should always be true unless a user has been tampering or there is a bug in the code. then the object is written to a row by encoding to back to json.

this works well for us because we never want to deal with part of the object. we always deal with the whole of the object so JSON is not only easier but it is faster than doing the 40+ queries on each read we would have to do if it was properly normalized.

answered Oct 20 '22 04:10

benmmurphy

Related questions
                            
                                How to use a value from one stored procedure in another?
                            
                                MySQL: How to SUM() a TIMEDIFF() on a group?
                            
                                Insert distinct values from one table into another table
                            
                                Hibernate Criteria API: get n random rows
                            
                                "Merge" style operation with literal values?
                            
                                Get size of large object in PostgreSQL query?
                            
                                How to replace null values with a text?
                            
                                Convert SQL Server DateTime object to BIGINT (.Net ticks)
                            
                                Finding node order in XML document in SQL Server
                            
                                How to select the relative complement of a table B in a table A (A \ B) in an SQL-Query?
                            
                                pyodbc - How to perform a select statement using a variable for a parameter
                            
                                How can I send email from PostgreSQL trigger?
                            
                                SQLite3 "forgets" to use foreign keys
                            
                                How to select a row based on its row number?
                            
                                Splitting the string in sql server
                            
                                Get the last dates from multiple columns
                            
                                SQL statement to get all customers with no orders
                            
                                SQL: Is there a possibility to convert numbers (1,2,3,4...) to letters (A,B,C,D...)
                            
                                SQLite foreign key examples
                            
                                How can I determine if a Sequence exist in SQL Server 2012?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

With the recent prevelance of NoSQL databases why would I use a SQL database?

Tags:

sql

nosql

rdbms