Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Facebook Architecture [closed]

Well Facebook has undergone MANY many changes and it wasn't originally designed to be efficient. It was designed to do it's job. I have absolutely no idea what the code looks like and you probably won't find much info about it (for obvious security and copyright reasons), but just take a look at the API. Look at how often it changes and how much of it doesn't work properly, anymore, or at all.

I think the biggest ace up their sleeve is the Hiphop. http://developers.facebook.com/blog/post/358 You can use HipHop yourself: https://github.com/facebook/hiphop-php/wiki

But if you ask me it's a very ambitious and probably time wasting task. Hiphop only supports so much, it can't simply convert everything to C++. So what does this tell us? Well, it tells us that Facebook is NOT fully taking advantage of the PHP language. It's not using the latest 5.3 and I'm willing to bet there's still a lot that is PHP 4 compatible. Otherwise, they couldn't use HipHop. HipHop IS A GOOD IDEA and needs to grow and expand, but in it's current state it's not really useful for that many people who are building NEW PHP apps.

There's also PHP to JAVA via things like Resin/Quercus. Again, it doesn't support everything...

Another thing to note is that if you use any non-standard PHP module, you aren't going to be able to convert that code to C++ or Java either. However...Let's take a look at PHP modules. They are ARE compiled in C++. So if you can build PHP modules that do things (like parse XML, etc.) then you are basically (minus some interaction) working at the same speed. Of course you can't just make a PHP module for every possible need and your entire app because you would have to recompile and it would be much more difficult to code, etc.

However...There are some handy PHP modules that can help with speed concerns. Though at the end of the day, we have this awesome thing known as "the cloud" and with it, we can scale our applications (PHP included) so it doesn't matter as much anymore. Hardware is becoming cheaper and cheaper. Amazon just lowered it's prices (again) speaking of.

So as long as you code your PHP app around the idea that it will need to one day scale...Then I think you're fine and I'm not really sure I'd even look at Facebook and what they did because when they did it, it was a completely different world and now trying to hold up that infrastructure and maintain it...Well, you get things like HipHop.

Now how is HipHop going to help you? It won't. It can't. You're starting fresh, you can use PHP 5.3. I'd highly recommend looking into PHP 5.3 frameworks and all the new benefits that PHP 5.3 brings to the table along with the SPL libraries and also think about your database too. You're most likely serving up content from a database, so check out MongoDB and other types of databases that are schema-less and document-oriented. They are much much faster and better for the most "common" type of web site/app.

Look at NEW companies like Foursquare and Smugmug and some other companies that are utilizing NEW technology and HOW they are using it. For as successful as Facebook is, I honestly would not look at them for "how" to build an efficient web site/app. I'm not saying they don't have very (very) talented people that work there that are solving (their) problems creatively...I'm also not saying that Facebook isn't a great idea in general and that it's not successful and that you shouldn't get ideas from it....I'm just saying that if you could view their entire source code, you probably wouldn't benefit from it.


Facebook is using LAMP structure. Facebook’s back-end services are written in a variety of different programming languages including C++, Java, Python, and Erlang and they are used according to requirement. With LAMP Facebook uses some technologies ,to support large number of requests, like

  1. Memcache - It is a memory caching system that is used to speed up dynamic database-driven websites (like Facebook) by caching data and objects in RAM to reduce reading time. Memcache is Facebook’s primary form of caching and helps alleviate the database load. Having a caching system allows Facebook to be as fast as it is at recalling your data.

  2. Thrift (protocol) - It is a lightweight remote procedure call framework for scalable cross-language services development. Thrift supports C++, PHP, Python, Perl, Java, Ruby, Erlang, and others.

  3. Cassandra (database) - It is a database management system designed to handle large amounts of data spread out across many servers.

  4. HipHop for PHP - It is a source code transformer for PHP script code and was created to save server resources. HipHop transforms PHP source code into optimized C++. After doing this, it uses g++ to compile it to machine code.

If we go into more detail, then answer to this question go longer. We can understand more from following posts:

  1. How Does Facebook Work?
  2. Data Management, Facebook-style
  3. Facebook database design?
  4. Facebook wall's database structure
  5. Facebook "like" data structure

"Knowing about sites which handles such massive traffic gives lots of pointers for architects etc. to keep in mind certain stuff while designing new sites"

I think you can probably learn a lot from the design of Facebook, just as you can from the design of any successful large software system. However, it seems to me that you should not keep the current design of Facebook in mind when designing new systems.

Why do you want to be able to handle the traffic that Facebook has to handle? Odds are that you will never have to, no matter how talented a programmer you may be. Facebook itself was not designed from the start for such massive scalability, which is perhaps the most important lesson to learn from it.

If you want to learn about a non-trivial software system I can recommend the book "Dissecting a C# Application" about the development of the SharpDevelop IDE. It is out of print, but it is available for free online. The book gives you a glimpse into a real application and provides insights about IDEs which are useful for a programmer.