I'm trying to micro-optimize my code at a very low level point in the application architecture. So here is my concrete scenario:
[Before describing my possible solutions, I want to explain why I'm doing micro-optimization here (you may skip this paragraph): The parser class has a lot of small methods, where "small" means that they don't do much. Most of them read only one or two bytes, or even a single bit, from a cached bit stream. So it should be possible to implement them very efficiently, so that a function call, when inlined, needs only a handful of machine instructions. The methods are called very often in the application, since they look up node attributes in a very big graph (the world-wide road network), which might happen about one million times per user request, and such a request should be as fast as possible.]
Which is the way to go here? I can see the following methods to solve the problem:
Are there better ways to solve this problem? Is there any idiom for this?
To clarify, I have a lot of functions which are version-independent (at least until now) and thus fit perfectly in some super class. I will use a standard sub-classing design for most functions, while this question only covers a solution for the version-dependent functions to be optimised. (Some of them aren't called very frequently and I can of course use virtual methods in these cases.) Besides this, I don't like the idea of making the parser class decide which methods need to be performant and which don't. (Although it would be possible to do so.)
One option that might work well is the following: have each parser class define methods with the same signatures, but do so completely independently of every other class. Then, introduce a secondary class hierarchy that declares all of these same functions as virtual and forwards each method call to a concrete parser object. That way, the implementation of the parser gets all the benefits of inlining, since from the perspective of the class all calls can be resolved statically, while the client gets the benefits of polymorphism, since any method call will dynamically resolve to the proper type.
The catch in doing this is that you use extra memory (the wrapper object takes up space), and you will also probably have at least one extra indirection involved when you call the parser functions, since the call goes
client → wrapper → implementation
Depending on how infrequently you call the methods from the client, this implementation might work very well.
Using templates, it's possible to implement the wrapper layer extremely succinctly. The idea is the following. Suppose that you have methods fA, fB, and fC. Start off by defining a base class like this:
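class WrapperBase {
public:
    // A pure virtual destructor would still need a definition to link,
    // so give the destructor an empty body instead.
    virtual ~WrapperBase() {}
    virtual void fA() = 0;
    virtual void fB() = 0;
    virtual void fC() = 0;
};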
Now, define the following template type as a subclass:
template <typename Implementation>
class WrapperDerived: public WrapperBase {
private:
    Implementation impl;
public:
    virtual void fA() {
        impl.fA();
    }
    virtual void fB() {
        impl.fB();
    }
    virtual void fC() {
        impl.fC();
    }
};
Now, you can do something like this:
WrapperBase* wrapper = new WrapperDerived<MyFirstImplementation>();
wrapper->fA();
delete wrapper;
wrapper = new WrapperDerived<MySecondImplementation>();
wrapper->fB();
delete wrapper;
In other words, all of the wrapper code can be generated for you by the compiler by just instantiating the WrapperDerived template.
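As a rough, self-contained sketch of the same pattern (not the only way to write it): the bodies of MyFirstImplementation and MySecondImplementation below are made up for illustration, fA returns an int here only so the result can be observed, and makeWrapper is a hypothetical factory. It assumes C++14 for std::make_unique.

```cpp
#include <memory>

// Hypothetical implementations: same signatures, no common base class,
// no virtual functions. Calls between their own methods resolve statically.
struct MyFirstImplementation {
    int readBit() const { return 1; }          // tiny helper, easy to inline
    int fA() const { return readBit() + 0; }   // non-virtual call to readBit
};

struct MySecondImplementation {
    int readBit() const { return 1; }
    int fA() const { return readBit() + 1; }
};

class WrapperBase {
public:
    virtual ~WrapperBase() = default;
    virtual int fA() = 0;
};

// The compiler generates one forwarding wrapper per implementation type.
template <typename Implementation>
class WrapperDerived : public WrapperBase {
    Implementation impl;
public:
    int fA() override { return impl.fA(); }    // one virtual hop, then static
};

// Hypothetical factory: the implementation is chosen once, at run time.
std::unique_ptr<WrapperBase> makeWrapper(int version) {
    if (version == 1)
        return std::make_unique<WrapperDerived<MyFirstImplementation>>();
    return std::make_unique<WrapperDerived<MySecondImplementation>>();
}
```

A client would hold the result of makeWrapper(version) and call fA() through it. Using std::unique_ptr instead of raw new/delete is a choice made for this sketch, so that the wrapper is released automatically.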
Hope this helps!
First, of course, you should profile your code to figure out how much the virtual calls actually hurt performance in your particular case (besides potentially weaker optimizations).
Putting the optimization subject aside, I'm almost sure you won't get any significant performance gain by replacing a virtual function call (or a call through a function pointer, which is almost the same) with a switch that calls compile-time-known functions in different cases.
If you really want a significant improvement, these are the most promising variants IMHO:
1. Try to redesign your interface to enable more complex functions. For instance, if you have a function that reads a single vertex, modify it to read (up to) N vertices at once. And so on.
2. You may make your whole parsing code (that uses your parser) a template class/function that uses a template parameter to instantiate the needed parser. Here you'll need neither an interface nor virtual functions. At the very beginning (where you identify the version), put a switch and, for every recognized version, call this function with the appropriate template parameter.
The latter will probably be superior from the performance point of view; OTOH it increases the code size.
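For the first variant (coarser interface functions), a minimal sketch might look like the following. All names here (VertexReader, VertexReaderV1, readVertices, sumAll) are hypothetical and the vertex data is just a vector of ints. The point is only that one virtual call fetches a whole block, so the per-vertex loop runs inside the concrete class where it can be inlined.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical interface: one virtual call reads up to n vertices.
class VertexReader {
public:
    virtual ~VertexReader() = default;
    // Writes up to n values into out, starting at index first.
    // Returns how many values were actually written.
    virtual std::size_t readVertices(std::size_t first, int* out,
                                     std::size_t n) = 0;
};

// Hypothetical version-specific reader backed by an in-memory vector.
class VertexReaderV1 : public VertexReader {
    std::vector<int> data_;
public:
    explicit VertexReaderV1(std::vector<int> data) : data_(std::move(data)) {}

    std::size_t readVertices(std::size_t first, int* out,
                             std::size_t n) override {
        std::size_t count = 0;
        // The per-vertex work happens here, with no virtual call per item.
        while (count < n && first + count < data_.size()) {
            out[count] = data_[first + count];
            ++count;
        }
        return count;
    }
};

// Client code: one virtual call per block of 4 vertices instead of per vertex.
long sumAll(VertexReader& reader) {
    int buffer[4];
    long total = 0;
    std::size_t pos = 0;
    for (;;) {
        std::size_t got = reader.readVertices(pos, buffer, 4);
        if (got == 0) break;               // nothing left to read
        for (std::size_t i = 0; i < got; ++i) total += buffer[i];
        pos += got;
    }
    return total;
}
```

The block size of 4 is arbitrary. Whether batching pays off depends on how the client consumes the data, which is something profiling would have to confirm.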
EDIT:
Here's an example of (2):
template <class Parser>
void MyApplication::HandleSomeRequest(Parser& p)
{
    // All calls on p resolve at compile time, so they can be inlined.
    int n = p.GetVertexCount();
    for (int iVertex = 0; iVertex < n; iVertex++)
    {
        // ...
        p.GetVertexEdges(iVertex, /* ... */);
        // ...
    }
}
void MyApplication::HandleSomeRequest(/* .. */)
{
    int iVersion = /* ... */;
    switch (iVersion)
    {
    case 1:
        {
            ParserV1 p(/* ... */);
            HandleSomeRequest(p);
        }
        break;
    case 2:
        {
            ParserV2 p(/* ... */);
            HandleSomeRequest(p);
        }
        break;
    // ...
    }
}
The classes ParserV1, ParserV2, and so on do not have virtual functions. They also don't inherit any interface. They just implement some functions, such as GetVertexCount.