Is there a convenient way in Boost.Regex to switch between ASCII and Unicode?
The only way I see right now is to switch, for example, between boost::u32regex and boost::regex.
Is this the only way to switch between unicode and ascii?
I was hoping to be able to just pass a parameter to Boost specifying my character encoding, and thereby not have to duplicate a lot of code.
Is this the only way to switch between unicode and ascii?
Pretty much. What you think of as boost::regex is really a type alias:
namespace boost{
template <class charT, class traits = regex_traits<charT> >
class basic_regex;
typedef basic_regex<char> regex;
typedef basic_regex<wchar_t> wregex;
}
Note that the character type is a template parameter, not a runtime parameter. Since boost::regex is built on char, it works on one char at a time and cannot treat Unicode text as a sequence of code points.
boost::u32regex is the same way:
typedef basic_regex<UChar32,icu_regex_traits> u32regex;
In order to really generalize between them, you would have to write everything as a template too. Instead of taking a boost::regex, your functions take a boost::basic_regex<charT, traits>. That's one of the downsides of templates: they tend to permeate everything.
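As a rough illustration of what "write everything as a template" looks like, here is a minimal sketch. It uses std::basic_regex from the standard library as a stand-in, since it has the same charT/traits template shape and compiles without Boost or ICU. The helper name count_matches is hypothetical and not part of either library.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: counts non-overlapping matches of `re` in `text`.
// CharT and Traits are deduced from the regex type, so the same code
// serves std::regex (char) and std::wregex (wchar_t).
template <class CharT, class Traits>
std::size_t count_matches(const std::basic_string<CharT>& text,
                          const std::basic_regex<CharT, Traits>& re)
{
    using StrIter = typename std::basic_string<CharT>::const_iterator;
    using ReIter  = std::regex_iterator<StrIter, CharT, Traits>;

    std::size_t n = 0;
    // A default-constructed regex_iterator is the end-of-sequence marker.
    for (ReIter it(text.begin(), text.end(), re), end; it != end; ++it) {
        ++n;
    }
    return n;
}
```

A call such as count_matches(std::string("a1b22"), std::regex("[0-9]+")) and its std::wstring / std::wregex counterpart both go through the same function body. The same pattern should carry over to boost::regex and boost::wregex by swapping the namespace, though that is an assumption to verify against your Boost version. For boost::u32regex, note that Boost provides separate algorithms (u32regex_match, u32regex_search, u32regex_replace), so a generic helper would likely need an overload or a small dispatch layer for that case.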