Passing non-PODs to variable argument functions such as printf is undefined behaviour (1, 2), but I don't understand why the C++ standard was set this way. Is there anything inherent in variable arg functions that prevents them from accepting classes as arguments?
The variable-arg callee indeed knows nothing about their type - but nor does it know anything about built-in types or plain PODs it accepts.
Also, these are necessarily cdecl functions, so the caller can be responsible e.g. for copying them upon passing and destroying them on return.
Any insight would be appreciated.
EDIT: I still see no reason why the suggested variadic semantics won't work, but zneak's answer demonstrates well what it would take to adjust compilers to it - so I accepted it. Ultimately, it might be some historical glitch.
The calling convention does specify who does the low-level stack dance, but it doesn't say who's responsible for "high-level" C++ bookkeeping. At least on Windows, a function that accepts an object by value is responsible for calling its destructor, even though it is not responsible for the storage space. For instance, if you build this:
    #include <stdio.h>

    struct Foo {
        Foo() { puts("created"); }
        Foo(const Foo&) { puts("copied"); }
        ~Foo() { puts("destroyed"); }
    };

    void __cdecl x(Foo f) { }

    int main() {
        Foo f;
        x(f);
        return 0;
    }
you get:
    x:
        mov     qword ptr [rsp+8],rcx
        sub     rsp,28h
        mov     rcx,qword ptr [rsp+30h]
        call    module!Foo::~Foo (00000001`400027e0)
        add     rsp,28h
        ret

    main:
        sub     rsp,48h
        mov     qword ptr [rsp+38h],0FFFFFFFFFFFFFFFEh
        lea     rcx,[rsp+20h]
        call    module!Foo::Foo (00000001`400027b0) # default ctor
        nop
        lea     rax,[rsp+21h]
        mov     qword ptr [rsp+28h],rax
        lea     rdx,[rsp+20h]
        mov     rcx,qword ptr [rsp+28h]
        call    module!Foo::Foo (00000001`40002780) # copy ctor
        mov     qword ptr [rsp+30h],rax
        mov     rcx,qword ptr [rsp+30h]
        call    module!x (00000001`40002810)
        mov     dword ptr [rsp+24h],0
        lea     rcx,[rsp+20h]
        call    module!Foo::~Foo (00000001`400027e0)
        mov     eax,dword ptr [rsp+24h]
        add     rsp,48h
        ret
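Notice how main constructs two Foo objects but destroys only one; x takes care of the other one. That obviously wouldn't work if the object was passed as a vararg: a variadic callee does not know the static types of its arguments at compile time, so the compiler has no way to emit the destructor call on the callee's side.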
EDIT: Another problem with passing objects to functions with variadic parameters is that in its current form, regardless of the calling convention, the "right thing" requires two copies, whereas normal parameter passing requires just one. The caller needs to make one copy of the object, and va_arg only allows the callee to get a copy of that copy. The only way around this would be for C++ to extend C variadic functions so that they can pass and/or accept references to objects, which is extremely unlikely to ever happen, given that C++ already solves the same problem in a type-safe way using variadic templates.
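To illustrate the variadic-template alternative mentioned above, here is a minimal sketch, assuming a C++17 compiler. The names Tracked, join_names and copies_made_by_call are hypothetical and exist only for this illustration. Each argument keeps its static type and is bound by const reference, so the copy constructor never runs on the way in.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical type that counts how often it is copy-constructed.
struct Tracked {
    static int copies;
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n)) {}
    Tracked(const Tracked& other) : name(other.name) { ++copies; }
};
int Tracked::copies = 0;

// Variadic template: every argument is bound by const reference with its
// real type known to the compiler, so nothing is copied when it is passed.
template <typename... Args>
std::string join_names(const Args&... args) {
    std::ostringstream out;
    ((out << args.name << ' '), ...);  // C++17 fold expression
    return out.str();
}

// Returns how many copies were made while passing two objects.
int copies_made_by_call() {
    Tracked a("a"), b("b");
    const int before = Tracked::copies;
    const std::string joined = join_names(a, b);
    assert(joined == "a b ");
    return Tracked::copies - before;
}
```

Calling copies_made_by_call() from main should report zero copies, in contrast with the two copies a C-style variadic function would need to do the "right thing".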
Microsoft's CL tries to get away with one bitwise copy and one full copy construction of that bitwise copy at the va_arg site, but it can have nasty consequences. Consider this example:
    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct foo {
        char* ptr;
        foo(const char* ptr) { this->ptr = _strdup(ptr); }
        foo(const foo& that) { ptr = _strdup(that.ptr); }
        ~foo() { free(ptr); }
        void setPtr(const char* ptr) {
            free(this->ptr);
            this->ptr = _strdup(ptr);
        }
    };

    void variadic(foo& a, ...) {
        a.setPtr("bar");
        va_list list;
        va_start(list, a);
        foo b = va_arg(list, foo);  // copy constructor runs here, on the bitwise copy
        va_end(list);
        printf("%s %s\n", a.ptr, b.ptr);
    }

    int main() {
        foo f = "foo";
        variadic(f, f);
    }
On my machine, this prints "bar bar", even though it would print "bar foo" if I had a non-variadic function whose second parameter accepted another foo by copy. This is because a bitwise copy of f happens in main at the call site of variadic, but the copy constructor is only invoked when va_arg is called. Between the two, a.setPtr frees the original f.ptr value, which is however still present in the bitwise copy, and by pure coincidence _strdup returns that same pointer (albeit with a new string inside). Another outcome of the same code could be a crash in _strdup.
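For comparison, here is a minimal sketch of the non-variadic counterpart. It assumes a portable stand-in for _strdup (dup_cstr, a hypothetical helper) so it builds outside MSVC, and fixed_arity and run_fixed_arity are names invented for this illustration. The second parameter is copy-constructed at the call site, before the body touches a.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical portable stand-in for _strdup (assumption: malloc succeeds).
static char* dup_cstr(const char* s) {
    char* p = static_cast<char*>(std::malloc(std::strlen(s) + 1));
    std::strcpy(p, s);
    return p;
}

struct foo {
    char* ptr;
    foo(const char* s) : ptr(dup_cstr(s)) {}
    foo(const foo& that) : ptr(dup_cstr(that.ptr)) {}
    foo& operator=(const foo&) = delete;  // not needed for this sketch
    ~foo() { std::free(ptr); }
    void setPtr(const char* s) {
        std::free(ptr);
        ptr = dup_cstr(s);
    }
};

// Non-variadic: b is fully copy-constructed before the body runs, so it
// owns its own buffer and is unaffected by a.setPtr.
std::string fixed_arity(foo& a, foo b) {
    a.setPtr("bar");
    return std::string(a.ptr) + " " + b.ptr;
}

// Mirrors main() in the example above, returning the text instead of printing it.
std::string run_fixed_arity() {
    foo f = "foo";
    return fixed_arity(f, f);
}
```

Here run_fixed_arity() yields "bar foo": a shows the new string, while b keeps the value f had when the call was made.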
Note that this design works great for POD types; it only falls apart when constructors and destructors need side effects.
The original point that calling conventions and parameter passing mechanisms don't necessarily support non-trivial construction and destruction of objects still stands: this is exactly what happens here.
EDIT: answer originally said that the construction and destruction behavior was specific to cdecl; it is not. (Thanks Cody!)
I'm recording this, because it's too big to be a comment, and it was reasonably time consuming to hunt this down, so no one else wastes time looking down this route.
The text was first changed to something similar to the current wording in the draft standard in N2134 released 2006-11-03.
With some effort, I was able to trace back the wording to DR506.
Paper J16/04-0167=WG21 N1727 suggests that passing a non-POD object to ellipsis be ill-formed. In discussions at the Lillehammer meeting, however, the CWG felt that the newly-approved category of conditionally-supported behavior would be more appropriate.
The paper referenced (N1727), says very little on the subject:
The existing wording (5.2.2¶7) makes it undefined behavior to pass a non-POD object to an ellipsis in a function call:
{Snip}
Once again, the CWG saw no reason not to require implementations to issue a diagnostic in such cases.
However, this doesn't tell me much about why the rule was that way to begin with, which is what you want to know. I can't turn the clock back to when that language was first written: the oldest freely available draft standard is from 2005 and already has the wording you're wondering about, and everything earlier either requires authentication or has no content.