I am planning to build an application that will get a large amount of traffic. (Please don't say I won't get traffic. This is for an internal network, so the traffic will be there. I'm just trying to avoid the "You won't get that much traffic, don't worry about it" answers.)
As for what type of traffic I'm expecting, users will browse various dynamically created pages (generated based on user account details). On those pages the user may submit text inputs. Both loading the pages and handling user input will hit the database. Page loads will obviously be reads, but handling input will require both reads and writes. Inputs may also affect other users' views. If this happens, I will need to notify the other users to refresh the page.
What sorts of things do I need to do so that it doesn't simply crash under the load of a large number of users?
What become the limiting factors? Database stuff? I/O with the front end?
I've never really developed a serious web app before and am looking for some help.
EDIT: I was considering using Erlang for the backend since I've used it a little bit and really like all the concurrency stuff. Would this be a viable choice or should I try for something more traditional?
A scalable web application is a website that can handle an increase in users and load, whether the increase is gradual or an abrupt surge, without disrupting end users' activities.
Website scaling is a way to handle additional workloads by adjusting your infrastructure. The increased workload could be anything from an influx of users to a large volume of simultaneous transactions or anything else that pushes the software beyond its designed capacity.
This is a very big topic, and you'll probably want to do as much research as time allows. There are several major areas to consider.
Session state storage. Obviously, session storage takes up memory or disk space. You need to have a strategy to store session information properly and in a way that can be used by a web farm.
Caching. A robust caching strategy can reduce loads dramatically. Do lots of research as to when, what and where you should be caching.
Scalability and load testing. Extra thought has to go into each resource-fetching operation to make sure that it's done no more often than necessary. Load testing and code profiling can help identify bottlenecks here if you use good tools.
Database optimization. Make sure you understand how to properly optimize your database for thousands (millions?) of operations per minute. If your application is write-heavy, you may need to look at warehousing older data that doesn't need to be included in indexes anymore to speed up your write operations.
Upgrade path. Is your traffic going to ramp up over time? Be sure to understand how you would plug in more servers and memory to your application if/when it's needed, and what would be required.
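To make the caching point a bit more concrete, here is a minimal sketch of the cache-aside pattern with a time-to-live, written in Python purely for illustration. Everything named here (`TTLCache`, `load_profile_from_db`, the 30-second TTL) is a made-up assumption, not something from your app or from any particular library. A real deployment on a web farm would more likely use a shared cache server than an in-process dictionary.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-process cache-aside helper with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry timestamp, cached value)
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[Hashable], Any]) -> Any:
        """Return the cached value, calling loader(key) only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Drop an entry, e.g. right after a write that changes what it holds."""
        self._entries.pop(key, None)


# Hypothetical stand-in for a real database read; it just counts calls.
db_reads = 0


def load_profile_from_db(user_id: int) -> dict:
    global db_reads
    db_reads += 1
    return {"user_id": user_id, "name": f"user-{user_id}"}


cache = TTLCache(ttl_seconds=30.0)
first = cache.get_or_load(42, load_profile_from_db)   # miss: reads the "database"
second = cache.get_or_load(42, load_profile_from_db)  # hit: no database read
cache.invalidate(42)                                  # a write changed user 42
third = cache.get_or_load(42, load_profile_from_db)   # miss again after invalidation
```

The `invalidate` call is the part that matters for your case where one user's input changes what other users see: the write path drops the stale entry so the next read goes back to the database instead of serving old data.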
There are lots of books around that you could invest in that would probably pay big dividends. Do a search for "building scalable web applications" at Amazon or Chapters and you'll probably find lots of texts to go on, both technology-specific and technology-agnostic.