
What is the most ridiculous pessimization you've seen? [closed]

I think the phrase "premature optimization is the root of all evil" is way, way overused. For many projects, it has become an excuse not to take performance into account until late in a project.

The phrase is often a crutch for avoiding work. I see it used when people should really say, "Gee, we really didn't think of that up front and don't have time to deal with it now".

I've seen many more "ridiculous" examples of dumb performance problems than examples of problems introduced by "pessimization":

  • Reading the same registry key thousands (or tens of thousands) of times during program launch (see the caching sketch after this list).
  • Loading the same DLL hundreds or thousands of times.
  • Wasting megabytes of memory by keeping full paths to files needlessly.
  • Not organizing data structures, so they take up far more memory than they need to.
  • Sizing all strings that store file names or paths to MAX_PATH.
  • Gratuitous polling for things that have events, callbacks, or other notification mechanisms.
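
For the first item, the usual fix is trivial once someone bothers to look: read the value once and cache it, instead of hitting the registry (or any other slow configuration source) on every call. A minimal sketch, using an environment variable as a stand-in for the registry read (the variable and function names are made up for illustration):

    #include <stdbool.h>
    #include <stdlib.h>
    #include <string.h>

    /* Stand-in for "reading the same registry key thousands of times":
     * read the setting once, cache the result, and reuse it.
     * MYAPP_VERBOSE is a hypothetical setting; single-threaded sketch. */
    static bool verbose_logging_enabled(void)
    {
        static int cached = -1;                       /* -1 = not read yet */
        if (cached < 0) {
            const char *v = getenv("MYAPP_VERBOSE");  /* imagine the registry read here */
            cached = (v != NULL && strcmp(v, "1") == 0);
        }
        return cached != 0;
    }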

A better statement, I think, is this: "optimization without measuring and understanding isn't optimization at all - it's just random change".

Good performance work is time-consuming - often more so than the development of the feature or component itself.


Databases are pessimization playland.

Favorites include:

  • Split a table into multiple tables (by date range, alphabetic range, etc.) because it's "too big".
  • Create an archive table for retired records, but continue to UNION it with the production table.
  • Duplicate entire databases by (division/customer/product/etc.)
  • Resist adding columns to an index because they would make it too big.
  • Create lots of summary tables because recalculating from raw data is too slow.
  • Create columns with subfields to save space.
  • Denormalize into fields-as-an-array.

That's off the top of my head.


I think there is no absolute rule: some things are best optimized upfront, and some are not.

For example, I worked in a company where we received data packets from satellites. Each packet cost a lot of money, so all the data was highly optimized (i.e. packed). For instance, latitude/longitude was not sent as absolute values (floats), but as offsets relative to the "north-west" corner of a "current" zone. We had to unpack all the data before it could be used. But I think this is not pessimization; it is intelligent optimization to reduce communication costs.
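
As a rough sketch of that kind of packing (the real packet format was obviously different; the zone corner, step size, and names below are made up for illustration):

    #include <stdint.h>
    #include <math.h>

    /* Hypothetical offset packing: instead of sending latitude/longitude
     * as absolute floats, send 16-bit offsets from the north-west corner
     * of the current zone. STEP_DEG and the structure are illustrative. */
    #define STEP_DEG 0.0001   /* roughly 11 m of latitude per step */

    typedef struct {
        uint16_t lat_off;     /* steps south of the zone's NW corner */
        uint16_t lon_off;     /* steps east of the zone's NW corner  */
    } packed_pos;

    static packed_pos pack_position(double lat, double lon,
                                    double nw_lat, double nw_lon)
    {
        packed_pos p;
        p.lat_off = (uint16_t)lround((nw_lat - lat) / STEP_DEG);
        p.lon_off = (uint16_t)lround((lon - nw_lon) / STEP_DEG);
        return p;             /* 4 bytes instead of 8 or 16 */
    }

    static void unpack_position(packed_pos p,
                                double nw_lat, double nw_lon,
                                double *lat, double *lon)
    {
        *lat = nw_lat - p.lat_off * STEP_DEG;
        *lon = nw_lon + p.lon_off * STEP_DEG;
    }

Four bytes per position instead of eight or sixteen is exactly the kind of saving that justifies the unpacking work on the receiving side.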

On the other hand, our software architects decided that the unpacked data should be formatted into a very readable XML document and stored in our database as such (as opposed to having each field stored in a corresponding column). Their idea was that "XML is the future", "disk space is cheap", and "processor is cheap", so there was no need to optimize anything. The result was that our 16-byte packets were turned into 2 kB documents stored in one column, and even for simple queries we had to load megabytes of XML documents into memory! We received over 50 packets per second, so you can imagine how horrible the performance became (BTW, the company went bankrupt).

So again, there is no absolute rule. Yes, sometimes optimization too early is a mistake. But sometimes the "cpu/disk space/memory is cheap" motto is the real root of all evil.


On an old project we inherited some (otherwise excellent) embedded systems programmers who had massive Z-8000 experience.

Our new environment was 32-bit Sparc Solaris.

One of the guys went and changed all ints to shorts to speed up our code, since grabbing 16 bits from RAM was quicker than grabbing 32 bits.

I had to write a demo program to show that grabbing 32-bit values on a 32-bit system was faster than grabbing 16-bit values, and explain that to grab a 16-bit value the CPU had to make a 32-bit wide memory access and then mask out or shift the bits not needed for the 16-bit value.
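
The demo was essentially of this shape (not the original code, just a sketch; timings depend heavily on the machine, and on today's hardware the results can go either way, since the short array has half the cache footprint):

    #include <stdio.h>
    #include <time.h>

    #define N 10000000

    static short s_data[N];   /* 16-bit elements */
    static int   i_data[N];   /* native 32-bit elements */

    int main(void)
    {
        /* Fill both arrays so the loops aren't folded away. */
        for (int i = 0; i < N; i++) { s_data[i] = (short)i; i_data[i] = i; }

        long sum = 0;
        clock_t t0 = clock();
        for (int i = 0; i < N; i++)
            sum += s_data[i];             /* 16-bit loads */
        printf("short: %ld ticks (sum=%ld)\n", (long)(clock() - t0), sum);

        sum = 0;
        t0 = clock();
        for (int i = 0; i < N; i++)
            sum += i_data[i];             /* 32-bit loads, the native width */
        printf("int:   %ld ticks (sum=%ld)\n", (long)(clock() - t0), sum);

        return 0;
    }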


Oh good Lord, I think I have seen them all. More often than not, it is an attempt to fix performance problems by someone who is too darn lazy to troubleshoot their way down to the CAUSE of those problems, or even to research whether there actually IS a performance problem. In many of these cases I wonder if it isn't just a case of that person wanting to try a particular technology and desperately looking for a nail that fits their shiny new hammer.

Here's a recent example:

Data architect comes to me with an elaborate proposal to vertically partition a key table in a fairly large and complex application. He wants to know what type of development effort would be necessary to adjust for the change. The conversation went like this:

Me: Why are you considering this? What is the problem you are trying to solve?

Him: Table X is too wide, we are partitioning it for performance reasons.

Me: What makes you think it is too wide?

Him: The consultant said that is way too many columns to have in one table.

Me: And this is affecting performance?

Him: Yes, users have reported intermittent slowdowns in the XYZ module of the application.

Me: How do you know the width of the table is the source of the problem?

Him: That is the key table used by the XYZ module, and it is like 200 columns. It must be the problem.

Me (Explaining): But module XYZ in particular uses most of the columns in that table, and the columns it uses are unpredictable because the user configures the app to show the data they want to display from that table. It is likely that 95% of the time we'd wind up joining all the tables back together anyway, which would hurt performance.

Him: The consultant said it is too wide and we need to change it.

Me: Who is this consultant? I didn't know we hired a consultant, nor did they talk to the development team at all.

Him: Well, we haven't hired them yet. This is part of a proposal they offered, but they insisted we needed to re-architect this database.

Me: Uh huh. So the consultant who sells database re-design services thinks we need a database re-design....

The conversation went on and on like this. Afterward, I took another look at the table in question and determined that it probably could be narrowed with some simple normalization, with no need for exotic partitioning strategies. This, of course, turned out to be a moot point once I investigated the performance problems (previously unreported) and tracked them down to two factors:

  1. Missing indexes on a few key columns.
  2. A few rogue data analysts who were periodically locking key tables (including the "too-wide" one) by querying the production database directly with MSAccess.

Of course, the architect is still pushing for vertical partitioning of the table, hanging on to the "too wide" meta-problem. He even bolstered his case by getting a proposal from another database consultant who was able to determine that we needed major design changes to the database without looking at the app or running any performance analysis.