Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Should I use static variables in my functions to prevent recomputing values?

I've been writing C++ code for a while now, but there's something I've been wondering for some time, without being to find a clear answer.

My point here is the following: let's say I have a function (could be a method, could be static, but not necessarily), and that function uses some "heavy" object(s) (such as a string that can't be determined easily at compile time, but that is constant throughout the execution). An example I've actually come across is the following:

/* Returns an endpoint for an API
 * Based on the main API URL (getApiUrl())
 */
virtual QString getEndPointUrl() const override
{
    QString baseUrl = getApiUrl();
    QString endpointUrl = QString("%1/%2").arg(baseUrl, "endpoint");
    return endpointUrl;
}

It's of course just an example (I know that QStrings have their own fancy Qt memory management features, but let's admit we're dealing with basic objects).

Is it a good idea to do the following?

virtual QString getEndPointUrl() const override
{
    /* We determine baseUrl only once */
    static const QString baseUrl = getApiUrl();
    /* We compute endpointUrl only once */
    static const QString endpointUrl = QString("%1/%2").arg(baseUrl, "endpoint");
    return endpointUrl;
}

As you may have guessed, the idea here is to not determine the URL at every execution of getEndPointUrl.

The only drawback I've found is the higher memory usage (since the objects are built the first time the function is called and destroyed only when the program ends).

Another thing is that it's considered a "better" practice to have stateless functions, but I don't really think this kind of behaviour can be qualified as a "state".

EDIT: I just wanted to point out that the values I compute are meaningless outside of the function, otherwise they could be a field of the enclosing class or whatever, but they're never used anywhere else.

What are your thoughts?

like image 443
Thomas Kowalski Avatar asked Jun 13 '19 11:06

Thomas Kowalski


4 Answers

Yes, absolutely!

As ever, there is a trade-off, which you've already identified.

But this is a completely normal and sensible thing to do. If you don't need to compute these values every time, then don't.

In this particular case, I'd probably make these items static members of the encapsulating class, unless you have a strong need to delay instantiation (perhaps the functions are not invoked on every run and you deem these initialisations too "heavy" to perform when you don't need them).

In fact, that would render the entire function getEndPointUrl() obsolete. Just have it be a public member constant! Your rationale that the constant "is not used anywhere else" is a bit of a circular argument; you use the data wherever getEndPointUrl() is used.

like image 179
Lightness Races in Orbit Avatar answered Sep 22 '22 23:09

Lightness Races in Orbit


That seems like a viable option. I have one suggested change for your sample, and that is to call getApiUrl() within the second static's initializer, making it the only initializer... thusly:

static const QString endpointUrl = QString("%1/%2").arg(getApiUrl(), "endpoint");

That's one less object to keep around for the lifetime of your program.

There are a number of concerns when caching with statics:

  1. You lose control over when that object's lifetime ends. This may or may not be a problem depending on whether or not that object needs a pointer/reference to something else in order to properly clean up after itself.
  2. I don't believe there are any guarantees about static object deconstruction order... It may be necessary to further wrap things in shared_ptr to ensure resources aren't yanked out from under your static objects.
  3. Caches in general often provide a way to flush out their stored values and be repopulated. No such mechanism exists for statics, particularly const statics.

EDIT: Rumor has it that thread safety is not a concern.

But if none of those concerns apply in your particular use case, then by all means, use static.

EDIT: Non-trivial comment reply:

I cannot advise strongly enough against depending on static object destruction order.

Imaging changing your program such that your resource loading system now starts before your logging system. You set a break point and step through the new code, whereupon you see that everything is fine. You end the debug session, run your unit tests, and they all pass. Check in the source, and the nightly integration tests fail... if you're lucky. If you're not, your program starts to crash on exit in front of customers.

It turns out your resource system tries to log something on shut down. Boom. But hey... everything still runs fine! Why bother fixing it, right?

Oh, and it turns out that thing your resource system was trying to log was an error condition that will only be a problem... for your biggest customer.

Ouch.

Don't be that guy. Don't depend on static destructor order.

(Okay, now imagine that the order of object construction/destruction isn't deterministic because some of those functions with static objects are called from different threads. Having nightmares yet?)

like image 45
Mark Storer Avatar answered Sep 18 '22 23:09

Mark Storer


For your specific example

In your example caching will make basically no performance difference, so using static is probably the only method that's worth it's effort.

If your calculation was actually expensive, here are some thoughts on this way of caching in general:

In General

Pros:

  • It's automatically thread-safe
  • It's very easy to use

Cons:

  • Maintenence
    1. Effectively a singleton with all of it's problems
      • Might become a surprising bug if suddenly getUrl() needs to return different values
      • Not great for unit tests
      • etc.
    2. A lot of people don't know what the static keyword does
    3. (If you link against a static lib multiple times, you will get multiple instances of the variable. Probably a ODR violation)
  • Function
    • Maybe it is undesired for the first call to the function to take longer
  • Structural
    • Likely use after free if the cached object references something on the stack.

^ (Note that a lot of these don't apply in a certain circumstances)

Alternatives

In order of preference:

  1. Cache the value whenever the value of getUrl() is set/changed.
  2. Cache the value in the object
  3. Use a cache (some map<string, shared_ptr<void>> or something) that you inject (Overkill for this)
like image 23
Benno Straub Avatar answered Sep 22 '22 23:09

Benno Straub


It depends on your context. As you rightly point out you're trading memory usage for speed. So what's more important (if either even) in your context? Are you on the hot path of a low latency program? Are you on a low memory device?

like image 41
Paul Evans Avatar answered Sep 19 '22 23:09

Paul Evans