I know this subject has been discussed at length (I have read lots of posts here and all over the web) and I don't like to flog a dead horse but I have a question about a more specific aspect of Integer ID vs GUID:
I am writing out a schema which is going to have a tagging ability similar in use to the stackoverflow tags but it will have the same tags used against 5+ different tables.
The basic tables I will be linking are as follows:
Tag Table
Tag ID Tag Name Tag Description
-------------------------------------------------------------
1 Hats Tag for hats
2 Coats Tag for coats
3 Gloves Gloves tag
4 Ladies Ladies item
Items Table 1
Item ID Item Name Cost
------------------------------------------------------------
1 Deerstalker £20.00
2 Fedora £50.00
3 Scarf £15.00
The bit I'm having trouble with is the tag_item
table.
I will have 5 tables with completely different structures that I want the users to be able to apply tags to so I think I'll need to do one of the following:
table name
/table number
as well as the integer key of the row the tag relates toGUID
of the row, this will work independent of the table and make it much easier to get all tags for a certain row.What I'm unsure of is how this will affect performance for:
Is there any clearly better option in this case or anywhere I could read up on the advantages in this particular scenario?
Edit:
For anyone interested, below is the schema I have decided upon:
Table for Tags
CREATE TABLE [dbo].[Sys_TagList](
[Sys_Tag_Primary] [int] IDENTITY(1,1) NOT NULL,
[Sys_Tag_Name] [varchar](50) NOT NULL,
[Sys_Tag_Description] [varchar](1000) NULL
)
Table for Tag_Items
CREATE TABLE [dbo].[Meta_Tags](
[Met_Tag_Primary] [int] IDENTITY(1,1) NOT NULL,
[Met_Tag_Link_FK] [int] NOT NULL, -- Link to [Sys_Tag_Primary] field
[Met_Tag_Name] [varchar](25) NOT NULL, -- To enable tag searches to use a single table
[Met_Tag_Table] [varchar](25) NOT NULL, -- Table name, used with [Met_Tag_RowID] to create link to that item
[Met_Tag_RowID] [int] NOT NULL -- Primary key of item[n] table
)
I'm going to (perhaps unhelpfully) suggest that GUID vs ID is the wrong question to be focusing on. The assumption that you need a single table linking tags to items may be worth examining first.
If your five item categories are different enough to require different tables then you should probably consider five tag_item link tables - one for each item table. Although the code to query them all at once (using UNION ALL
to concatenate the results) may be a bit more verbose and repetetive than the code for a single link table, the queries are likely to be substantially simpler - good from both a maintenance and a query-plan perspective.
The other alternative would be to refactor the common attributes that the five item categories share into a single central table (to which the tags are linked) and store the additional non-common attributes in five child tables. It's difficult to know whether this is appropriate without some more information about how the existing data is currently used.
Obligatory possible tagging design duplicate
Depending on how you intend to use the tags and what performance characteristics are important, one possible alternative to the map/link table is a hashed combination table.
Calculate the hash of a given requested combination of tags and store that directly in your existing tables, and in the combination table.
Querying for anything that is a subset of tags might start to get kind of hacky though as you'll end up doing a lot of in memory transforms of all those hashes. Perhaps there is a cleaner way of querying hashed reference values out there, as I cant recall doing this, and didn't immediately locate anything in a search.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With