 

boost serialization NVP macro and non-XML-element characters

When using the BOOST_SERIALIZATION_NVP macro to create a name-value pair for XML serialization, the compiler happily accepts the following code, even though the resulting element name is not a valid XML name and an exception is thrown when actually serializing the object to XML:

BOOST_SERIALIZATION_NVP(_member[index])

An obvious fix is to use:

boost::serialization::make_nvp("ValidMemberName", _member[index])

But can anyone suggest a way to modify boost so that illegitimate element names would trigger a compilation error? (thus not relying on unit testing to catch the above subtle bug)
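To see why the first form only fails at run time: as far as I know, BOOST_SERIALIZATION_NVP builds the element name by stringizing its argument with the preprocessor, which copies the argument text verbatim and knows nothing about XML. A minimal sketch of that step, using a hypothetical stand-in macro DEMO_STRINGIZE (not Boost's own macro):

```cpp
#include <string>

// Hypothetical stand-in for the stringizing step; this is NOT Boost's macro.
#define DEMO_STRINGIZE(x) #x

// The preprocessor turns the whole argument text into the name,
// brackets included, without checking it against the XML name rules.
inline std::string bad_name()  { return DEMO_STRINGIZE(_member[index]); }
inline std::string good_name() { return DEMO_STRINGIZE(member); }
```

Here bad_name() yields the text "_member[index]", which is what would end up as the element name.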


Edit:

One idea is to somehow declare a dummy local variable with the name of the element passed to the macro, assuming the set of valid C++ identifiers is a subset of the valid XML element names. Not entirely sure this can be done, though.

Assaf Lavie asked Oct 14 '22 16:10


2 Answers

Quoting the XML syntax for names:

NameStartChar ::=   ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
NameChar      ::=   NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]

I see the following differences with C++ names: Leading _ can be reserved in C++, depending on what follows; colons, dots and minus signs are valid in XML names. Unicode characters may also cause some grief in C++, but that's mostly implementation-dependent.

The [#x0300-#x036F] range contains combining accents (diacriticals), which could be an additional concern with respect to name equality.

So, there's no solution that catches the non-alphabetical characters. <:::/> may look like a smiley, but it's well-formed XML. For the rest, your idea is pretty much OK. Eric Melski's idea is a nice attempt, but it accepts a lot of non-alphabetical characters. For instance, foo[1] , &foo, *foo, foo=0 or foo,bar. There's a better alternative: {using namespace name;}. This will accept foo::bar, but that's actually OK - foo::bar is allowed in XML.
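A different route, not proposed in the answers here, is to check the stringized name itself at compile time. The sketch below assumes C++11 constexpr and covers only the ASCII part of the grammar quoted above; every non-ASCII byte is rejected, which is stricter than XML. The function names are hypothetical.

```cpp
// ASCII-only approximation of the XML NameStartChar production.
constexpr bool is_name_start(char c) {
    return c == ':' || c == '_' ||
           (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

// ASCII-only approximation of the XML NameChar production.
constexpr bool is_name_char(char c) {
    return is_name_start(c) || c == '-' || c == '.' || (c >= '0' && c <= '9');
}

// True if every character up to the terminating NUL is a NameChar.
constexpr bool all_name_chars(const char* s) {
    return *s == '\0' || (is_name_char(*s) && all_name_chars(s + 1));
}

// True if s is a non-empty name made only of ASCII characters XML allows.
constexpr bool is_ascii_xml_name(const char* s) {
    return is_name_start(*s) && all_name_chars(s + 1);
}

static_assert(is_ascii_xml_name("foo"), "plain identifier is accepted");
static_assert(is_ascii_xml_name(":::"), "colons are legal in XML names");
static_assert(!is_ascii_xml_name("foo[0]"), "brackets are rejected");
```

A wrapper macro could pass the stringized argument to a static_assert of this kind, so that a name such as foo[0] fails to compile instead of throwing later.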

MSalters answered Oct 20 '22 19:10


I think your idea will probably work. Valid C++ identifiers are made up of A-Z, a-z, 0-9 and underscore, which is in fact a proper subset of XML identifiers (which add hyphen, period, and a bunch of Unicode characters to the set).

You could try a construct like this to get a compile time error:

#define SAFE_BOOST_SERIALIZATION_NVP(name) \
    { int name = 0; } ; BOOST_SERIALIZATION_NVP(name)

The braces limit the scope of the dummy variable to just that line, so you don't clutter your function with bogus variables. The compiler will most likely optimize the dummy variable away too, so there should be no runtime cost. When I use this macro in the following code, I get "error: invalid initializer":

#include "boost/serialization/nvp.hpp"
#define SAFE_BOOST_SERIALIZATION_NVP(name) \
    { int name = 0; } ; BOOST_SERIALIZATION_NVP(name)

int main(int argc, char *argv[])
{
    int foo[3] = { 10, 20, 30 };
    int bar = 10;
    SAFE_BOOST_SERIALIZATION_NVP(foo[0]);
    return 0;
}

If I replace foo[0] with bar in the call to SAFE_BOOST_SERIALIZATION_NVP, it compiles with no error.

Eric Melski answered Oct 20 '22 17:10