As a programmer I make revolutionary findings every few years. I'm either ahead of the curve, or behind it by about π in the phase. One hard lesson I learned was that scaling OUT is not always better, quite often the biggest performance gains are when we regrouped and scaled up.
What reasons to you have for scaling out vs. up? Price, performance, vision, projected usage? If so, how did this work for you?
We once scaled out to several hundred nodes that would serialize and cache necessary data out to each node and run maths processes on the records. Many, many billions of records needed to be (cross-)analyzed. It was the perfect business and technical case to employ scale-out. We kept optimizing until we processed about 24 hours of data in 26 hours wallclock. Really long story short, we leased a gigantic (for the time) IBM pSeries, put Oracle Enterprise on it, indexed our data and ended up processing the same 24 hours of data in about 6 hours. Revolution for me.
So many enterprise systems are OLTP and the data are not shard'd, but the desire by many is to cluster or scale-out. Is this a reaction to new techniques or perceived performance?
Do applications in general today or our programming matras lend themselves better for scale-out? Do we/should we take this trend always into account in the future?
Premature scaling is a common problem, with some analysts claiming that it accounts for 70% of startup failures. Companies scale too quickly when they bring on new people, spend money, and try to acquire more customers before they've really nailed down the product and business model.
We cannot rely on historic data to support present growth. Identifying, verifying and making available the current scale-up status of a business is one of the most important tools for boosting productivity, so access to better, faster data about scale-ups is crucial.
Most tech startups scale when they start gaining good traction for their products. This is the key to knowing when you should scale—when your existing processes start feeling small, when your customers expect more from you, and when you discover new potentials on the horizon.
Because scaling up
And also to some extent, because that's what Google do.
Scaling out is best for embarrassingly parallel problems. It takes some work, but a number of web services fit that category (thus the current popularity). Otherwise you run into Amdahl's law, which then means to gain speed you have to scale up not out. I suspect you ran into that problem. Also IO bound operations also tend to do well with scaling out largely because waiting for IO increases the % that is parallelizable.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With