I've long been perplexed by the speed of Stack Overflow and how quickly the questions and comments load on the page. It seems like the backend db that stores all of this info would be humongous. How is it possible for a question and all of its associated answers to get loaded so quickly?
I've never worked in a large-scale db environment before (my background is small-business dbs like Access, plus some MySQL), but I'd imagine the backend db for Stack Overflow (simplified) is something like two tables linked by an indexed key, right? Something akin to:
Question Table: Question_PrimaryKey | QuestionText
Answer Table: Answer_PrimaryKey | Question_ForeignKey | AnswerText
(linked at Question_PrimaryKey & Question_ForeignKey).
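For concreteness, the two-table layout described above could be sketched roughly as follows. This is only an illustration using Python's built-in `sqlite3` module with an in-memory database; the table, column, and index names (`question`, `answer`, `idx_answer_question`) and the sample rows are made up for the example and are not Stack Overflow's real schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with the hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE question (
            question_id   INTEGER PRIMARY KEY,
            question_text TEXT NOT NULL
        );
        CREATE TABLE answer (
            answer_id   INTEGER PRIMARY KEY,
            question_id INTEGER NOT NULL REFERENCES question(question_id),
            answer_text TEXT NOT NULL
        );
        -- Index on the foreign key so answers for one question can be
        -- found without scanning the whole answer table.
        CREATE INDEX idx_answer_question ON answer(question_id);
        """
    )
    conn.execute("INSERT INTO question VALUES (1, 'Why is this site fast?')")
    conn.executemany(
        "INSERT INTO answer (question_id, answer_text) VALUES (?, ?)",
        [(1, "Caching."), (1, "Good indexes.")],
    )
    conn.commit()
    return conn


def fetch_answers(conn, question_id):
    """Return the answer texts for one question, oldest first."""
    rows = conn.execute(
        "SELECT answer_text FROM answer WHERE question_id = ? ORDER BY answer_id",
        (question_id,),
    ).fetchall()
    return [row[0] for row in rows]


def query_plan(conn, question_id):
    """Return SQLite's plan for the lookup, to check that the index is used."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT answer_text FROM answer WHERE question_id = ?",
        (question_id,),
    ).fetchall()
    return " ".join(str(row[-1]) for row in rows)
```

Calling `query_plan(build_demo_db(), 1)` should mention `idx_answer_question`, meaning the lookup goes through the index instead of reading every row in the answer table.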
Am I way off in thinking this is how a site like Stack Overflow is set up? And if I'm not, how on earth are the answers to these questions fetched so quickly and sent to the browser? (It blows my mind, because when I build small intranet sites that use Access as a backend, the performance really starts to deteriorate as the db grows.)
Any input would be greatly appreciated. Thanks for your time!
Good web performance certainly depends on a streamlined, well-tuned database, but it has more to do with caching: storing frequently accessed data in memory rather than pulling it from the database on every request.
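As a rough illustration of the caching idea, here is a minimal in-memory cache with a time-to-live, written in Python. It is a sketch under assumptions, not how Stack Overflow actually does it: the `TTLCache` class, the `get_or_load` method, and the `loader` callback standing in for the database query are all hypothetical names made up for this example.

```python
import time


class TTLCache:
    """Tiny in-memory cache: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so expiry can be tested without sleeping
        self._store = {}      # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value for key, calling loader(key) only on a miss."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]   # cache hit: no database round trip
        value = loader(key)   # cache miss or expired: go to the "database"
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry, e.g. after a new answer is posted to that question."""
        self._store.pop(key, None)
```

A page handler would then call something like `cache.get_or_load(question_id, load_question_from_db)`, where `load_question_from_db` is whatever function runs the real query. Repeated views of a popular question within the TTL are served from memory, and only the first view (or the first one after expiry or invalidation) touches the database.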
This blog post talks about SO's architecture.
Optimizing Your Website with Jeff Atwood and Stackoverflow
It's the return of Jeff Atwood. He and the team have been making lots of great speed optimizations to Stackoverflow lately. What tools are they using? What kinds of speed improvements are they seeing, and what can you do to exploit their experience?
And, from a hardware architecture standpoint: High Scalability - Stack Overflow Architecture