
Dealing with char buffers

Tags: c++, c

As a C++ programmer I sometimes need to deal with memory buffers using techniques from C. For example:

char buffer[512];
sprintf(buffer, "Hello %s!", userName.c_str());

Or in Windows:

TCHAR buffer[MAX_PATH+1]; // edit: +1 added
::GetCurrentDirectory(sizeof(buffer)/sizeof(TCHAR), &buffer[0]);

The above sample is how I usually create local buffers (a local stack-allocated char array). However, there are many possible variations and so I'm very interested in your answers to the following questions:

  • Is passing the buffer as &buffer[0] better programming style than passing buffer? (I prefer &buffer[0].)
  • Is there a maximum size that is considered safe for stack allocated buffers?
    • Update: I mean, for example, the highest value that can be considered safe for cross-platform desktop applications on Mac, Windows, Linux desktops (not mobile!).
  • Is a static buffer (static char buffer[N];) faster? Are there any other arguments for or against it?
  • When using static buffers you can use return type const char *. Is this (generally) a good or a bad idea? (I do realize that the caller will need to make their own copy so that the next call does not overwrite the previous return value.)
  • What about using static char * buffer = new char[N]; , never deleting the buffer and reusing it on each call.
  • I understand that heap allocation should be used when (1) dealing with large buffers or (2) maximum buffer size is unknown at compile time. Are there any other factors that play in the stack/heap allocation decision?
  • Should you prefer the sprintf_s, memcpy_s, ... variants? (Visual Studio has been trying to convince me of this for a long time, but I want a second opinion :p )
StackedCrooked asked Jun 24 '10 19:06



2 Answers

  • It's up to you, just doing buffer is more terse but if it were a vector, you'd need to do &buffer[0] anyway.
  • Depends on your intended platform.
  • Does it matter? Have you determined it to be a problem? Write the code that's easiest to read and maintain before you go off worrying if you can obfuscate it into something faster. But for what it's worth, allocation on the stack is very fast (you just change the stack pointer value.)
  • You should be using std::string. If performance becomes a problem, you'd be able to reduce dynamic allocations by just returning the internal buffer. But the std::string return interface is way nicer and safer, and performance is your last concern.
  • That's a memory leak. Many will argue that's okay, since the OS frees it anyway, but I feel it is terrible practice to just leak things. Use a static std::vector; you should never be doing any raw allocation! If you're putting yourself into a position where you might leak (because the cleanup needs to be done explicitly), you're doing it wrong.
  • I think your (1) and (2) just about cover it. Dynamic allocation is almost always slower than stack allocation, but you should be more concerned about which makes sense in your situation.
  • You shouldn't be using those at all. Use std::string, std::stringstream, std::copy, etc.
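As a rough sketch of the std::string / std::stringstream route applied to the sprintf example from the question: the function name makeGreeting is made up for illustration, and this is only one of several reasonable ways to write it.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not from the question): builds the greeting
// without any fixed-size char buffer.
std::string makeGreeting(const std::string& userName)
{
    std::ostringstream out;
    out << "Hello " << userName << "!";
    return out.str();  // the returned string owns its storage; nothing to size or free
}
```

A call such as makeGreeting(userName) replaces both the buffer declaration and the sprintf call, and there is no length limit to get wrong.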
GManNickG answered Oct 30 '22 03:10



I assume your interest comes about primarily from a performance perspective, since solutions like vector, string, wstring, etc. will generally work even for interacting with C APIs. I recommend learning how to use those and how to use them efficiently. If you really need it, you can even write your own memory allocator to make them super fast. If you are sure they're not what you need, there's still no excuse for you to not write a simple wrapper to handle these string buffers with RAII for the dynamic cases.

With that out of the way:

Is passing the buffer as &buffer[0] better programming style than passing buffer? (I prefer &buffer[0].)

No. I would consider this style to be slightly less useful (admittedly being subjective here) as you cannot use it to pass a null buffer and therefore would have to make exceptions to your style to pass pointers to arrays that can be null. It is required if you pass in data from std::vector to a C API expecting a pointer, however.
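To illustrate the std::vector case, here is a minimal sketch. get_name is a stand-in I invented for "some C API that fills a caller-supplied buffer"; it is not a real library function, and readName is likewise just an illustrative name.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical C-style API (assumption for this example): copies a name into
// `out`, writing at most `size` bytes including the terminating NUL, and
// returns the number of characters written.
extern "C" std::size_t get_name(char* out, std::size_t size)
{
    static const char name[] = "example";
    if (out == nullptr || size == 0) return 0;
    const std::size_t n = std::min(sizeof(name) - 1, size - 1);
    std::memcpy(out, name, n);
    out[n] = '\0';
    return n;
}

std::string readName()
{
    std::vector<char> buffer(64);  // heap-backed, size can be chosen at run time
    // A vector does not decay to a pointer, so &buffer[0] (or buffer.data()
    // since C++11) is needed here. The vector must not be empty.
    const std::size_t n = get_name(&buffer[0], buffer.size());
    return std::string(buffer.begin(), buffer.begin() + n);
}
```

Note that with a plain array there is no way to spell "null buffer" using the &buffer[0] form, which is the inconsistency mentioned above.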

Is there a maximum size that is considered safe for stack allocated buffers?

This depends on your platform and compiler settings. Simple rule of thumb: if you're in doubt about whether your code will overflow the stack, write it in a way which can't.

Is a static buffer (static char buffer[N];) faster? Are there any other arguments for or against it?

Yes, there is a big argument against it, and that is that it makes your function no longer re-entrant. If your application becomes multithreaded, these functions will not be thread safe. Even in a single-threaded application, sharing the same buffer when these functions are recursively called can lead to problems.
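A small sketch of how the shared buffer bites even without threads. Both function names are hypothetical, and std::snprintf assumes a C++11 (or later) standard library.

```cpp
#include <cstdio>
#include <string>

// Hypothetical formatter that returns a pointer into a function-local
// static buffer.
const char* formatStatic(int value)
{
    static char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "value=%d", value);
    return buffer;  // every call hands back the same address
}

// Same job, but each call returns its own independent std::string.
std::string formatOwned(int value)
{
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "value=%d", value);
    return std::string(buffer);
}
```

If a caller keeps the pointer from formatStatic(1) and then calls formatStatic(2), the first pointer now reads "value=2", because both results alias one buffer. With two threads the same aliasing becomes a data race. formatOwned has neither problem.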

What about using static char * buffer = new char[N]; and never deleting the buffer? (Reusing the same buffer each call.)

We still have the same problems with re-entrancy.

I understand that heap allocation should be used when (1) dealing with large buffers or (2) maximum buffer size is unknown at compile time. Are there any other factors that play in the stack/heap allocation decision?

Stack unwinding destroys objects on the stack. This is especially important for exception safety. Thus even if you allocate memory on the heap within a function, it should generally be managed by an object on the stack (e.g. a smart pointer). See RAII.
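Here is a minimal sketch of such a wrapper, assuming C++11. CharBuffer, riskyWork and the g_liveBuffers counter are all invented for the demonstration; the counter exists only so the cleanup can be observed. In real code std::vector<char> or std::unique_ptr<char[]> would usually do the same job.

```cpp
#include <cstddef>
#include <stdexcept>

int g_liveBuffers = 0;  // demo-only counter of buffers currently allocated

// Hypothetical RAII wrapper: owns a heap-allocated char array and releases
// it in the destructor.
class CharBuffer {
public:
    explicit CharBuffer(std::size_t size)
        : data_(new char[size]()), size_(size) { ++g_liveBuffers; }
    ~CharBuffer() { delete[] data_; --g_liveBuffers; }

    // Non-copyable, so two objects can never delete the same array.
    CharBuffer(const CharBuffer&) = delete;
    CharBuffer& operator=(const CharBuffer&) = delete;

    char* data() { return data_; }
    std::size_t size() const { return size_; }

private:
    char* data_;
    std::size_t size_;
};

void riskyWork(bool fail)
{
    CharBuffer buffer(4096);  // heap memory, owned by a stack object
    if (fail)
        throw std::runtime_error("something went wrong");
    // On both the normal and the exceptional path, ~CharBuffer runs here
    // or during stack unwinding, so the array is always released.
}
```

After riskyWork(true) has thrown and the exception has been caught, g_liveBuffers is back to 0, which would not be the case with a bare new char[4096] and a delete[] placed at the end of the function.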

Should you prefer the sprintf_s, memcpy_s, ... variants? (Visual Studio has been trying to convince me of this for a long time, but I want a second opinion :p )

MS is right that these functions are safer alternatives: they take the size of the destination buffer and so guard against the overflows that the classic versions allow. But if you write such code as is (without providing variants for other platforms), your code will be married to Microsoft, since it will be non-portable.
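If you want bounded formatting without tying yourself to one vendor, one option is std::snprintf, sketched below. This assumes a C++11 standard library (as far as I know, Visual Studio only gained a conforming snprintf with VS2015), and formatGreeting is a made-up helper name.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats the greeting into a caller-supplied buffer.
// Returns true only if the complete text fit. On truncation the buffer
// still holds a NUL-terminated prefix (when size > 0).
bool formatGreeting(char* buffer, std::size_t size, const std::string& userName)
{
    // snprintf never writes more than `size` bytes and returns the length
    // the full output would have had, not counting the terminating NUL.
    const int needed = std::snprintf(buffer, size, "Hello %s!", userName.c_str());
    return needed >= 0 && static_cast<std::size_t>(needed) < size;
}
```

The caller passes sizeof(buffer) for a local array and checks the return value instead of hoping the name is short enough.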

When using static buffers you can use return type const char *. Is this (generally) a good or a bad idea? (I do realize that the caller will need to make their own copy so that the next call does not overwrite the previous return value.)

I'd say in almost every case, you want to use const char* for return types for a function returning a pointer to a character buffer. For a function to return a mutable char* is generally confusing and problematic. Either it's returning an address to global/static data which it shouldn't be using in the first place (see re-entrancy above), local data of a class (if it's a method) in which case returning it ruins the class's ability to maintain invariants by allowing clients to tamper with it however they like (ex: stored string must always be valid), or returning memory that was specified by a pointer passed in to the function (the only case where one might reasonably argue that mutable char* should be returned).

stinky472 answered Oct 30 '22 04:10
