Questions about FriendFeed's schemaless MySQL design

Tags: mysql, nosql

Bret Taylor described the schemaless design in this blog post: http://bret.appspot.com/entry/how-friendfeed-uses-mysql

It looks like they store objects of different classes in a single table, then build separate index tables on top of it.

My question is how to build an index for just one class.

For example, a user's blog is {id, userid, title, body}, and a user's tweet is {id, userid, tweet}.

If I want to build an index on users' blogs only, how can I do that?

user404017 asked Jul 28 '10 01:07


1 Answer

It's very simple -- perhaps simpler than you expect.

When you store a blog entity, you insert it into the main entities table, of course. For a blog, that looks like this:

CREATE TABLE entities (
  id INT AUTO_INCREMENT PRIMARY KEY,
  entity_json TEXT NOT NULL
);

INSERT INTO entities (id, entity_json) VALUES (DEFAULT,
    '{"userid": 8675309,
      "post_date": "2010-07-27",
      "title": "MySQL is NoSQL",
      "body": ... }'
);

You also insert into a separate index table for each logical type of attribute. Using your example, the userid for a blog is not the same as a userid for a tweet. Since you just inserted a blog, you then insert into index table(s) for blog attribute(s):

CREATE TABLE blog_userid (
  id INT NOT NULL PRIMARY KEY,
  userid BIGINT UNSIGNED,
  KEY (userid, id)
);

INSERT INTO blog_userid (id, userid) VALUES (LAST_INSERT_ID(), 8675309);

CREATE TABLE blog_date (
  id INT NOT NULL PRIMARY KEY,
  post_date DATETIME,
  KEY (post_date, id)
);

INSERT INTO blog_date (id, post_date) VALUES (LAST_INSERT_ID(), '2010-07-27');

Don't insert into any tweet index tables, because you just created a blog, not a tweet.
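
The insert sequence above can be sketched as one unit of work. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs without a server, `lastrowid` plays the role of LAST_INSERT_ID(), and the helper name `store_blog` is hypothetical. Wrapping the three inserts in a single transaction is a choice made for this sketch, not something the steps above require.

```python
import json
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        entity_json TEXT NOT NULL
    );
    CREATE TABLE blog_userid (
        id INTEGER NOT NULL PRIMARY KEY,
        userid INTEGER
    );
    CREATE INDEX blog_userid_idx ON blog_userid (userid, id);
    CREATE TABLE blog_date (
        id INTEGER NOT NULL PRIMARY KEY,
        post_date TEXT
    );
    CREATE INDEX blog_date_idx ON blog_date (post_date, id);
""")


def store_blog(conn, blog):
    """Hypothetical helper: insert a blog entity and its index rows together."""
    with conn:  # commits on success, rolls back if any insert fails
        cur = conn.execute(
            "INSERT INTO entities (entity_json) VALUES (?)",
            (json.dumps(blog),),
        )
        entity_id = cur.lastrowid  # plays the role of LAST_INSERT_ID()
        conn.execute(
            "INSERT INTO blog_userid (id, userid) VALUES (?, ?)",
            (entity_id, blog["userid"]),
        )
        conn.execute(
            "INSERT INTO blog_date (id, post_date) VALUES (?, ?)",
            (entity_id, blog["post_date"]),
        )
    return entity_id


blog_id = store_blog(conn, {
    "userid": 8675309,
    "post_date": "2010-07-27",
    "title": "MySQL is NoSQL",
    "body": "placeholder body",  # illustrative value only
})
```

A matching `store_tweet` helper would write to the entities table in the same way but touch only the tweet index tables.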

You know all rows in blog_userid reference blogs, because that's how you inserted them. So you can search for blogs of a given user:

SELECT e.*
FROM blog_userid u JOIN entities e ON u.id = e.id
WHERE u.userid = 8675309;
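
To check that this lookup returns only blogs, here is a small runnable sketch. It again assumes SQLite (via Python's sqlite3 module) in place of MySQL, and the helpers `add_entity` and `blogs_for_user` are hypothetical names. A tweet by the same user is stored in the entities table but never indexed in blog_userid, so the join skips it.

```python
import json
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        entity_json TEXT NOT NULL
    );
    CREATE TABLE blog_userid (
        id INTEGER NOT NULL PRIMARY KEY,
        userid INTEGER
    );
    CREATE INDEX blog_userid_idx ON blog_userid (userid, id);
""")


def add_entity(conn, entity, is_blog):
    """Hypothetical helper: store an entity; index it only if it is a blog."""
    with conn:
        entity_id = conn.execute(
            "INSERT INTO entities (entity_json) VALUES (?)",
            (json.dumps(entity),),
        ).lastrowid
        if is_blog:
            conn.execute(
                "INSERT INTO blog_userid (id, userid) VALUES (?, ?)",
                (entity_id, entity["userid"]),
            )
    return entity_id


def blogs_for_user(conn, userid):
    """Return the decoded blog entities for one user via the index table."""
    rows = conn.execute(
        "SELECT e.entity_json "
        "FROM blog_userid u JOIN entities e ON u.id = e.id "
        "WHERE u.userid = ? ORDER BY e.id",
        (userid,),
    ).fetchall()
    return [json.loads(entity_json) for (entity_json,) in rows]


add_entity(conn, {"userid": 8675309, "title": "MySQL is NoSQL"}, is_blog=True)
add_entity(conn, {"userid": 8675309, "tweet": "hello"}, is_blog=False)
add_entity(conn, {"userid": 42, "title": "Another blog"}, is_blog=True)

titles = [blog["title"] for blog in blogs_for_user(conn, 8675309)]
print(titles)  # ['MySQL is NoSQL']
```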

Re your comment:

Yes, you could add real columns to the entities table for any attributes that you know apply to all content types. For example:

CREATE TABLE entities (
  id INT AUTO_INCREMENT PRIMARY KEY,
  entity_type INT NOT NULL,
  creation_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  entity_json TEXT NOT NULL
);

The columns for entity_type and creation_date would allow you to crawl the entities in chronological order (or reverse chronological order) and know which set of index tables matches the entity type of a given row.
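
As a rough sketch of that crawl, the code below walks the entities in chronological order and uses entity_type to pick the index tables to fill. Everything beyond the table layout above is an assumption: SQLite (Python's sqlite3 module) stands in for MySQL, the type codes `BLOG` and `TWEET`, the `tweet_userid` table, and the helper `rebuild_indexes` are all hypothetical.

```python
import json
import sqlite3

BLOG, TWEET = 1, 2  # hypothetical entity_type codes

# Which index tables apply to each entity type, as (table, attribute) pairs.
# These are trusted constants, so interpolating them into SQL below is safe.
INDEX_TABLES = {
    BLOG: [("blog_userid", "userid")],
    TWEET: [("tweet_userid", "userid")],
}

# Assumption: SQLite stands in for MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        entity_type INTEGER NOT NULL,
        creation_date TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        entity_json TEXT NOT NULL
    );
    CREATE TABLE blog_userid (id INTEGER NOT NULL PRIMARY KEY, userid INTEGER);
    CREATE TABLE tweet_userid (id INTEGER NOT NULL PRIMARY KEY, userid INTEGER);
""")

sample = [
    (BLOG, "2010-07-27 09:00:00", {"userid": 8675309, "title": "MySQL is NoSQL"}),
    (TWEET, "2010-07-27 10:00:00", {"userid": 8675309, "tweet": "hello"}),
    (BLOG, "2010-07-26 08:00:00", {"userid": 42, "title": "Older post"}),
]
with conn:
    conn.executemany(
        "INSERT INTO entities (entity_type, creation_date, entity_json) "
        "VALUES (?, ?, ?)",
        [(etype, created, json.dumps(entity)) for etype, created, entity in sample],
    )


def rebuild_indexes(conn):
    """Hypothetical helper: crawl entities oldest first and refill index tables."""
    crawled = []
    with conn:
        entity_rows = conn.execute(
            "SELECT id, entity_type, entity_json FROM entities "
            "ORDER BY creation_date, id"
        ).fetchall()
        for entity_id, entity_type, entity_json in entity_rows:
            entity = json.loads(entity_json)
            for table, attribute in INDEX_TABLES[entity_type]:
                # INSERT OR REPLACE is SQLite syntax; MySQL offers REPLACE INTO.
                conn.execute(
                    f"INSERT OR REPLACE INTO {table} (id, {attribute}) "
                    "VALUES (?, ?)",
                    (entity_id, entity[attribute]),
                )
            crawled.append(entity_id)
    return crawled


crawl_order = rebuild_indexes(conn)
```

Reversing the ORDER BY gives the reverse chronological crawl; the dispatch on entity_type stays the same.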

Bill Karwin answered Oct 14 '22 14:10