What is the most efficient way to store tags in a database?

I am implementing a tagging system on my website similar to the one Stack Overflow uses. My question is: what is the most effective way to store tags so that they can be searched and filtered?

My idea is this:

Table: Items
Columns: Item_ID, Title, Content

Table: Tags
Columns: Title, Item_ID

Is this too slow? Is there a better way?

One item is going to have many tags.

And one tag will belong to many items.

This implies to me that you'll quite possibly need an intermediary table to handle the many-to-many relationship.

Something like:

Table: Items
Columns: Item_ID, Item_Title, Content

Table: Tags
Columns: Tag_ID, Tag_Title

Table: Items_Tags
Columns: Item_ID, Tag_ID

It might be that your web app becomes insanely popular and needs denormalising down the road, but it's pointless muddying the waters too early.
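
For anyone who wants to try the three-table layout above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names come from the answer; the column types, the UNIQUE constraint on Tag_Title, the sample rows, and the helper `items_with_tag` are my own assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database; table and column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (
        Item_ID    INTEGER PRIMARY KEY,
        Item_Title TEXT NOT NULL,
        Content    TEXT
    );
    CREATE TABLE Tags (
        Tag_ID    INTEGER PRIMARY KEY,
        Tag_Title TEXT NOT NULL UNIQUE   -- assumption: one row per tag name
    );
    CREATE TABLE Items_Tags (
        Item_ID INTEGER NOT NULL REFERENCES Items(Item_ID),
        Tag_ID  INTEGER NOT NULL REFERENCES Tags(Tag_ID)
    );
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO Items VALUES (?, ?, ?)",
                 [(1, "Storing tags", "..."), (2, "Indexing basics", "...")])
conn.executemany("INSERT INTO Tags VALUES (?, ?)",
                 [(1, "sql"), (2, "database-design")])
conn.executemany("INSERT INTO Items_Tags VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])


def items_with_tag(tag_title):
    """Return the titles of all items carrying the given tag, sorted by title."""
    rows = conn.execute("""
        SELECT i.Item_Title
        FROM Items i
        JOIN Items_Tags link ON link.Item_ID = i.Item_ID
        JOIN Tags t          ON t.Tag_ID     = link.Tag_ID
        WHERE t.Tag_Title = ?
        ORDER BY i.Item_Title
    """, (tag_title,))
    return [title for (title,) in rows]
```

Filtering by a tag is then two joins through Items_Tags, e.g. `items_with_tag("sql")`.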


You should read Philipp Keller's blog posts about tagging database schemas.

He tries a few and reports his results, both in terms of ease of constructing common queries, and in terms of performance.

Number of tags, number of tagged items, and number of tags per item were all factors.

The posts are from 2005; I'm not aware of any updates since then.


Actually, I believe de-normalising the tags table might be a better way forward, depending on scale. This way, the tags table simply has tagid, itemid, tagname. You'll get duplicate tagnames, but it makes adding, removing, and editing tags for specific items MUCH simpler.

You don't have to create a new tag, remove the allocation of the old one and re-allocate a new one; you just edit the tagname. For displaying a list of tags, you simply use DISTINCT or GROUP BY, and of course you can easily count how many times a tag is used, too.
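
A rough sketch of what that looks like in practice, using Python's sqlite3 module. The column names tagid, itemid, and tagname are from this answer; the table name `tags`, the column types, and the sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single denormalised table: the tag name is repeated on every row.
conn.execute("""
    CREATE TABLE tags (
        tagid   INTEGER PRIMARY KEY,
        itemid  INTEGER NOT NULL,
        tagname TEXT NOT NULL
    )
""")

# Hypothetical sample data: three items, "sql" used twice.
conn.executemany("INSERT INTO tags (itemid, tagname) VALUES (?, ?)",
                 [(1, "sql"), (1, "mysql"), (2, "sql"), (3, "python")])

# Editing one item's tag is a single UPDATE on that row.
conn.execute("UPDATE tags SET tagname = 'postgresql' "
             "WHERE itemid = 1 AND tagname = 'mysql'")

# Listing tags: DISTINCT collapses the duplicate names.
tag_list = [name for (name,) in conn.execute(
    "SELECT DISTINCT tagname FROM tags ORDER BY tagname")]

# Usage counts: GROUP BY the name and count the rows.
tag_counts = dict(conn.execute(
    "SELECT tagname, COUNT(*) FROM tags GROUP BY tagname"))
```

The trade-off is that renaming a tag everywhere means updating every row that carries it, rather than one row in a separate tags table.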


I'd suggest using an intermediary third table for storing tags<=>items associations, since we have a many-to-many relation between tags and items, i.e. one item can be associated with multiple tags and one tag can be associated with multiple items.

HTH, Valve.


If space is going to be an issue, have a 3rd table Tags(Tag_Id, Title) to store the text for the tag and then change your Tags table to be (Tag_Id, Item_Id).

Those two values should provide a unique composite primary key as well.
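
To illustrate the composite key, here is a small sketch with Python's sqlite3 module. I've called the link table `Item_Tags` to keep it apart from the Tags(Tag_Id, Title) table; that name, the column types, and the helper `tag_item` are assumptions, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Tag text is stored once per tag.
    CREATE TABLE Tags (
        Tag_Id INTEGER PRIMARY KEY,
        Title  TEXT NOT NULL
    );
    -- Link table (hypothetical name): the pair of ids is the primary key.
    CREATE TABLE Item_Tags (
        Tag_Id  INTEGER NOT NULL,
        Item_Id INTEGER NOT NULL,
        PRIMARY KEY (Tag_Id, Item_Id)
    );
""")


def tag_item(tag_id, item_id):
    """Attach a tag to an item; return False if the pair already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Item_Tags (Tag_Id, Item_Id) VALUES (?, ?)",
                (tag_id, item_id))
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejected a duplicate (Tag_Id, Item_Id).
        return False
```

With the key in place, the database itself refuses to attach the same tag to the same item twice, so the application doesn't need to check first.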


You can't really talk about slowness based on the data you provided in the question.

And I don't think you should worry too much about performance at this stage of development.

It's called premature optimization. However, I'd suggest that you include a Tag_ID column in the Tags table.

It's usually good practice for every table to have an ID column.


Items should have an "ID" field, and Tags should have an "ID" field (Primary Key, Clustered). Then make an intermediate table of ItemID/TagID and put the "Perfect Index" on there.
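
The answer doesn't spell out what the "Perfect Index" is. One possible reading, and it is only my assumption, is a clustered key on (ItemID, TagID) plus a reverse index on (TagID, ItemID), so lookups in either direction can seek instead of scan. A sketch in Python with SQLite, where the table name `ItemTags`, the index name, and the helper `query_plan` are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- WITHOUT ROWID stores the rows in primary-key order,
    -- which is SQLite's closest equivalent to a clustered index.
    CREATE TABLE ItemTags (
        ItemID INTEGER NOT NULL,
        TagID  INTEGER NOT NULL,
        PRIMARY KEY (ItemID, TagID)
    ) WITHOUT ROWID;
    -- Reverse index so "all items for a tag" can seek too.
    CREATE INDEX idx_itemtags_tag_item ON ItemTags (TagID, ItemID);
""")


def query_plan(sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


# Plans for the two common lookups: tags of an item, items of a tag.
by_item = query_plan("SELECT TagID FROM ItemTags WHERE ItemID = ?", (1,))
by_tag = query_plan("SELECT ItemID FROM ItemTags WHERE TagID = ?", (1,))
```

Printing `by_item` and `by_tag` shows which index SQLite picks for each direction; other engines such as SQL Server have their own syntax for clustered indexes.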


If you don't mind using a bit of non-standard stuff, Postgres version 9.4 and up has an option of storing a record of type JSON text array. Your schema would be:

Table: Items
Columns: Item_ID:int, Title:text, Content:text

Table: Tags
Columns: Item_ID:int, Tag_Title:text[]

For more info, see this excellent post by Josh Berkus: http://www.databasesoup.com/2015/01/tag-all-things.html. Several more options are compared thoroughly for performance there, and the one suggested above is the best overall.