This question is intended to be software / platform agnostic. I am just looking for generic SQL code.
Consider the following (very simple for example's sake) tables:
Table: Authors id | name 1 | Tyson 2 | Gordon 3 | Tony etc Table: Books id | author | title 1 | 1 | Tyson's First Book 2 | 2 | Gordon's Book 3 | 1 | Tyson's Second Book 4 | 3 | Tony's Book etc Table: Stores id | name 1 | Books Overflow 2 | Books Exchange etc Table: Stores_Books id | store | book 1 | 1 | 1 2 | 2 | 4 3 | 1 | 3 4 | 2 | 2
As you can see, there is a one-to-many relationship between Book
s and Author
s, and a many-to-many relationship between Book
s and Store
s.
Question one: What is the best query to eager load one author and their books (and where the books are sold) into an object-oriented program where each row is representative of an object instance?
Question two: What is the best query to eager load the entire object tree into an object-oriented program where each row is representative of an object instance?
Both of these situations are easy to imagine with lazy loading. In either situation you would fetch the author with one query and then as soon as you need their books (and what stores the books are sold at) you would use another query to get that information.
Is lazy loading the best way to do this or should I use a join and parse the result when creating the object tree (in an attempt to eager load the data)? In this situation what would be the optimal join / target output from the database in order to make parsing as simple as possible?
As far as I can tell, with eager loading, I would need to manage a dictionary or index of some sort of all the objects while I am parsing the data. Is this actually the case or is there a better way?
That's a tough question to answer. I've done this before by writing a query that returns everything as a flat table and then looping through the results, creating objects or structures as the most-significant columns change. I think that works better than multiple database calls because there's a lot of overhead involved in each call, though depending on how many smaller entities there are to each big entity that might not be best.
The following might apply to both your questions 1 and 2.
SELECT a.id, a.name, b.id, b.name FROM authors a LEFT JOIN books b ON a.id=b.author
(pseudocode, in your program that makes the db call)
while (%row=fetchrow) {
if ($row{a.id} != currentauthor.id) {
currentauthor.id=$row{a.id};
currentauthor.name=$row{a.name};
}
currentbook=new book($row{b.id, b.name});
push currentauthor.booklist, currentbook;
}
[edit] I just realized I didn't answer the second part of your question. Depending on the size of the data for stores and what I intended doing with it, I would either
Before looping through books/authors as above, slurp the whole stores table into a structure in my program, much like the book/author structure above but indexed by the storeid, and then do a lookup in that structure every time I read a book record and store a reference to the store table
or, if there are many stores,
Join the stores onto the books and have an additional nested loop to add stores objects within the part of the code that adds a book.
Here's a relevant Wikipedia article: http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch
I hope that helps!
In an OO program you don't use SQL, instead you let that be done invisibly by your Persistence mechanism. To explain:
If you have an object-oriented program then you want an object model that natuarally represents the concepts of Author, Book and Store. You then have an "Object/Relational mapping" problem. Somehow you want to get data from the database using SQL and yet work naturally with your objects.
In the Java world we do that with the Java Persistence API (JPA). You don't actually write the SQL instead you just "annotate" the Java Class to say "This class corresponds to that Table, this attribute to that column", and then do some interesting things with the JOINs and can in fact choose either Lazy or Eager loading as it makes sense.
So you might end up with an Author class (I'm making attributes public here for brevity, in real life we have private attributes and getters and setters.
@Entity
public Class Author {
public int id;
public String name;
// more in a minute
That class is annotated as an entity and so JPA with match up the atrributes in the objects with their columns in the corresponding table. The annotations have more capabilities so that you can specify mappings between names of attributes and columns that don't exactly match; mappings such as
PUBLISHED_AUTHOR => Author,
FULL_NAME => name
Now what about JOINS and relationships? The author class has a collection of Books
@Entity
public Class Author {
public int id;
public String name;
public List<Book> books;
and the Book class has an attribute that is it's author
@Entity
public Class Book {
public int id;
public String title
public Author author
The JPA Entity Manager class fetches an instance of Book using a find method (I'll not go into detail here)
int primaryKey = 1;
Book aBook = em.find( primaryKey); // approximately
Now your code can just go
aBook.author.name
You never see the fact that SQL was used to fetch the data for Book, and by the time you ask for the author attribute has also fetched the author data. A SQL JOIN may well have been used, you don't need to know. You can control whether the fetch is Eager or Lazy by more annotations.
Similarly
int primaryKey = 2
Author author = em.find( primaryKey );
author.books.size() ; // how many books did the author write?
we get a list of all the books as well as the authors other data, SQL happened, we didn't see it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With