Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Data object storage - Can table JOIN's do what single table SELECT's cannot?

Now that "NOSQL" or "object only" storage systems like MongoDB or memcached are really picking up steam in the world. I was wondering if there are any requests that cannot be performed on them that can be performed using multiple object joins (in SQL that is JOIN "table"). In other words, are there any multi-table queries that cannot be handled by several single table queries in a row?

Basically, is there a use-case were a multi-table join cannot be replicated by accessing one table at a time in object based storage systems?

Here are some examples of normal 3NF queries using has_man and has_many_through relations. These aren't the most complex queries - but they should give you a starting point for the concept. Note that any value in {} means a value of the result of the last query.


Company Has Many Users

SELECT user.*, company.name as company_name FROM user 
LEFT JOIN company ON company.id = user.company_id
WHERE user.id = 4

vs

SELECT * FROM user WHERE id = 4
SELECT * FROM company WHERE id = {user.comany_id}

Club Has Many Students Through Memberships

SELECT student.* FROM student LEFT JOIN membership on
membership.student_id = sudent.id WHERE membership.club_id = 5

vs

SELECT * FROM membership WHERE club.id = 5
SELECT * FROM student WHERE id = {membership.student_id}

The reason I'm wondering is because I want to know if Object-based systems (that rely on accessing single table objects at a time) can do what RDBMS databases like PostgreSQL or MySQL can do.

So far the only thing wrong seems to be that more queries are necessary.

like image 908
Xeoncross Avatar asked Aug 12 '10 16:08

Xeoncross


1 Answers

Just because you can, doesn't mean you should.

The multiple SELECT statement alternative cons:

  • the less trips to the database, the better. TCP overhead can not be recouped, and it looks like Network Neutrality is officially dead so we could expect to see a movement away from multi-select/nosql because you might have to pay for that bandwidth...
  • because of delay between initial and subsequent statements, risk of supporting data not reflecting what's in the system when the first query was run
  • less scalable--the larger the data set, the more work the application is doing to deal with business rules and association that can scale far better in a database
  • more complexity in the application, which also makes the business less portable (IE: migrate from Java to .NET or vice versa - you're looking at building from scratch when business logic in the DB would minimize that)
like image 168
OMG Ponies Avatar answered Oct 03 '22 19:10

OMG Ponies