I am designing a database for a real estate application. It is proving to be more involved than I had anticipated (maybe I am overcomplicating things).
The problems essentially are due to the presence of:
I have ended up with a rather (unintentionally) elaborate classification tree for different property types. The tree nodes are the actual instances of property types.
I want to create a database so that I can query using not only any of the synonyms, but also the attributes.
So for example, the query (in pseudo SQL):
SELECT * from properties where synonym="flat" and attribute IN ('ground floor', 'garden');
should return a list of apartments|flats that are either ground floor AND have a garden.
Can someone help me with how to design the database schema so as to allow the kind of querying described above?
Last but not the least, I will be using either MySQl or PostgreSQL as the backend database, but would prefer the approach to be db agnostic - if possible.
I would take a different approach to your attribution scheme. Rather than treating different attributions as synonyms, I would treat them as overlapping, or more specifically, nested descriptions of a property. This would handle your business case while at the same time acknowledging the astute observation made by Mike Sherrill.
Here is a quick ERD sketch:
By way of a very fast data dictionary:
PROPERTY
is a piece of real estate.
CATEGORY
is a collection of descriptive attributes. The point of this table is more as an organizer of attributes than anything else. It could include things like "type of property", "ownership structure", "number of bathrooms", and whatever else might be of interest.
ATTRIBUTE
is a specific quality of interest. Note the involuted relationship on this entity type. I'll deal more with that later. The main point is that attributes can be more general or more specific and some attributes can be seen as refinements of other attributes.
DESCRIPTOR
is the intersection of a PROPERTY and the ATTRIBUTEs that have been associated with that particular piece of real estate.
So how is this supposed to help?
The key is how attributes work. If you use a nested set model, then you can address more or less specific attribution and searching criteria. Consider the following diagram of one potential CATEGORY with its associated ATTRIBUTEs:
In this example the CATEGORY is "type of property". You can see from the diagram that there is a hierarchical breakdown of attributes in this category. Each box in the diagram is a record in ATTRIBUTE. Boxes that contain other boxes have child attributes. Boxes that are inside another box have an FK to their containing box and so forth.
In this way, you could say "I want to find a property that is a Penthouse". You can then find PROPERTY records with a related DESCRIPTOR that points at the "Penthouse" ATTRIBUTE. That is pretty easy. But what if your search comes up empty?
The advantage of this approach is that you can then loosen your criteria by saying, "let's go up the attribution hierarchy to the next less-specific thing than penthouse". In my example, that would be "Highrise". Now you try your search again and you might have better luck.
A system like this gives you the ability to be as specific as you want in each category of attribution while relaxing the others far enough to start getting search hits. This is really what a real estate agent's work is about is it not? Helping the client to make the necessary compromises to find the best fit to their most important criteria?
Handling Nested Sets
The only tricky part of this approach is how to handle the nested sets. There are lots of ways to do this, many of which have been thoroughly documented elsewhere. I myself like the visitation number technique, especially for relatively static data sets. This makes it very easy to find matches for some given ATTRIBUTE or any of its children without having to do anything exotic in your SQL.
EDIT: So How Does This Work?
OP asked how do you handle things like number of bedrooms and what do the queries look like? Let's take another example for illustration:
The above shows the nested sets for the CATEGORY "Number of Bedrooms". I've also added the visitation numbers to the diagram. Note the way the visitation numbers work, in particular, note that the left (green) and right (red) numbers for any given attribute value contain the left and right visitation numbers for any subordinate attributes. For example, "2+ Bedrooms" has left and right numbers 6 and 15 respectively. Every attribute that falls under "2+ Bedrooms" has left and right numbers that fall within this range.
So how would you query for a properties with a given descriptor? Let's say we want to find all properties with two or more bedrooms. The SQL for such a query might look something like this:
select P.*
from PROPERTY P
inner join DESCRIPTOR D
on P.id = D.property_id
inner join ATTRIBUTE A
on D.attribute_id = A.id
where A.left >= (select X.left from ATTRIBUTE X
where X.name = '2+ Bedrooms')
and A.right <= (select Y.right from ATTRIBUTE Y
where Y.name = '2+ Bedrooms')
Note that the above query is a little different that what you might actually use. For example, you'd probably look up the filtering attribute using its int identity key instead of its string name. However, I thought I'd leave it as shown for clarity around the main point, which is you filter by looking not for a specific related attribute, but for any related attributes that fall within your filter range.
If you wanted to filter on multiple attributes, then just add more sub-clauses to your where clause.
To handle synonyms you could have many-to-many lookup between a table containing the static list of your property types, and a table containing the synonym. This way one synonym could be mapped to more than one property type.
For example:
Table:Property Type
1 House
2 Appartment
3 Large House
4 Cave
Table:Synonym
1 house
2 flat
3 dwelling
4 condo
5 mansion
Table:PropertyType-Synonym
1 1 (House is a house
1 3 (House is a dwelling)
2 2 (Appartment is a flat)
2 3 (Appartment is a dwelling)
2 4 (Appartment is a condo)
3 1 (Large House is a house)
3 3 (Large House is a dwelling)
3 5 (Large House is a mansion)
4 3 (Cave is a dwelling)
For properties, you could utilize a kind of open attribute structure.
For example:
Table:Property
1 Apartment F, Field House Gardens
2 123 Alphabet Street, NumberTown
Table:Attribute
1 Is ground floor?
2 Number of bedrooms
3 Has garden?
Table:Property-Attribute-Values
1 1 No
1 2 2
1 3 Yes
2 2 5
2 3 Yes
Hope this helps
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With