I wrote a very simple template to tokenize a string as shown below.
However, I have a problem calling that function: I cannot use a C string for the delimiters or trim_string arguments. These have to be std::string (or whatever type of string StringT is, e.g. std::wstring).
So the following fails:
std::vector<std::string> tokens;
std::string str = "This string, it will be split, in 3.";
int count = tokenize_string(tokens, str, ",", true, " ");
To fix the problem I have to write:
std::vector<std::string> tokens;
std::string str = "This string, it will be split, in 3.";
int count = tokenize_string(tokens, str,
std::string(","), true, std::string(" "));
Is there a way to avoid having to use std::string() around the standard C strings in such a situation?
The errors I get with g++ look like this:
/home/snapwebsites/snapwebsites/snapmanagercgi/daemon/snapmanagerdaemon.cpp: In member function ‘void snap_manager::manager_daemon::init(int, char**)’:
/home/snapwebsites/snapwebsites/snapmanagercgi/daemon/snapmanagerdaemon.cpp:103:71: error: no matching function for call to ‘tokenize_string(std::vector<std::__cxx11::basic_string<char> >&, const string&, const char [2], bool, const char [2])’
snap::tokenize_string(f_bundle_uri, bundle_uri, ",", true, " ");
^
In file included from /home/snapwebsites/snapwebsites/snapmanagercgi/daemon/snapmanagerdaemon.cpp:35:0:
/home/snapwebsites/BUILD/dist/include/snapwebsites/tokenize_string.h:46:8: note: candidate: template<class StringT, class ContainerT> size_t snap::tokenize_string(ContainerT&, const StringT&, const StringT&, bool, const StringT&)
size_t tokenize_string(ContainerT & tokens
^
/home/snapwebsites/BUILD/dist/include/snapwebsites/tokenize_string.h:46:8: note: template argument deduction/substitution failed:
/home/snapwebsites/snapwebsites/snapmanagercgi/daemon/snapmanagerdaemon.cpp:103:71: note: deduced conflicting types for parameter ‘const StringT’ (‘std::__cxx11::basic_string<char>’ and ‘char [2]’)
snap::tokenize_string(f_bundle_uri, bundle_uri, ",", true, " ");
^
The template:
template < class StringT, class ContainerT >
size_t tokenize_string(ContainerT & tokens
, StringT const & str
, StringT const & delimiters
, bool const trim_empty = false
, StringT const & trim_string = StringT())
{
for(typename StringT::size_type pos(0), last_pos(0); last_pos < str.length(); last_pos = pos + 1)
{
pos = str.find_first_of(delimiters, last_pos);
// no more delimiters?
//
if(pos == StringT::npos)
{
pos = str.length();
}
typename StringT::value_type const * start(str.data() + last_pos);
typename StringT::value_type const * end(start + (pos - last_pos));
if(start != end // if not (already) empty
&& !trim_string.empty()) // and there are characters to trim
{
// find first character not in trim_string
//
start = std::find_if_not(
start
, end
, [&trim_string](auto const c)
{
return trim_string.find(c) != StringT::npos;
});
// find last character not in trim_string
//
if(start < end)
{
reverse_cstring<typename StringT::value_type const> const rstr(start, end);
auto p = std::find_if_not(
rstr.begin()
, rstr.end()
, [&trim_string](auto const c)
{
return trim_string.rfind(c) != StringT::npos;
});
end = p.get();
}
}
if(start != end // if not empty
|| !trim_empty) // or user accepts empty
{
tokens.push_back(typename ContainerT::value_type(start, end - start));
}
}
return tokens.size();
}
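The rule is that when you have three StringT const & parameters, StringT is deduced independently from each of the corresponding arguments, and the deduced types must all match. Here str deduces StringT as std::string, while the literal "," deduces it as char [2], hence the "deduced conflicting types" error.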
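To make the rule concrete, here is a minimal sketch. The helper same_length() and the function deduction_demo() are hypothetical names made up for illustration; they are not part of the code in the question. Both parameters share one template parameter, just like the StringT parameters of tokenize_string().

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: both parameters use the same T, so T is deduced
// once from each argument and the two results must agree.
template <class T>
bool same_length(T const & a, T const & b)
{
    return a.length() == b.length();
}

bool deduction_demo()
{
    std::string const s("x");

    // same_length(s, ",");   // would not compile: T is deduced as
    //                        // std::string from s and as char [2] from ","

    // OK: both arguments deduce T = std::string
    bool const same = same_length(s, std::string(","));
    assert(same);
    return same;
}
```

The implicit conversion from a C string to std::string is never considered here, because conversions are only applied after deduction has succeeded.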
You can:
Use typename ContainerT::value_type for all three, if the container's value type is expected to be the correct string type; or
Block deduction of StringT from two of the three StringT-taking parameters,
Either at the call site, by making the latter two arguments non-deduced contexts with braced-init-lists:
int count = tokenize_string(tokens, str, {","}, true, {" "});
Or in the function template itself, by wrapping the latter two StringT parameters into a non-deduced context:
template < class StringT, class ContainerT >
size_t tokenize_string(ContainerT & tokens
, StringT const & str
, typename std::decay<StringT>::type const & delimiters
, bool const trim_empty = false
, typename std::decay<StringT>::type const & trim_string = StringT())
Or take different type parameters for each and harmonize them later in the function template body.
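The first and last options in the list above have no code, so here is a rough sketch of both. It uses a simplified splitter without the trimming logic; split_a(), split_b() and harmonize_demo() are hypothetical names chosen for this example, and the sketch assumes the container's value_type is the string type you want.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Variant A: only ContainerT is deduced. The string parameters are spelled
// as ContainerT::value_type, which is a non-deduced context, so a C string
// is converted implicitly.
template <class ContainerT>
std::size_t split_a(ContainerT & tokens,
                    typename ContainerT::value_type const & str,
                    typename ContainerT::value_type const & delimiters)
{
    using StringT = typename ContainerT::value_type;
    for(typename StringT::size_type pos(0), last_pos(0);
        last_pos < str.length();
        last_pos = pos + 1)
    {
        pos = str.find_first_of(delimiters, last_pos);
        if(pos == StringT::npos)
        {
            pos = str.length();   // no more delimiters
        }
        tokens.push_back(str.substr(last_pos, pos - last_pos));
    }
    return tokens.size();
}

// Variant B: every argument gets its own type parameter, and the body
// harmonizes them by converting to the container's string type.
template <class ContainerT, class StrT, class DelimT>
std::size_t split_b(ContainerT & tokens, StrT const & str, DelimT const & delimiters)
{
    using StringT = typename ContainerT::value_type;
    StringT const s(str);
    StringT const d(delimiters);
    return split_a(tokens, s, d);
}

std::size_t harmonize_demo()
{
    std::vector<std::string> a, b;
    std::string const str("a,b,c");
    std::size_t const na = split_a(a, str, ",");   // "," converts to std::string
    std::size_t const nb = split_b(b, str, ",");   // "," deduced as char [2], converted in the body
    assert(a == b);
    return na + nb;
}
```

Variant B costs an extra copy of each argument when it is already the right type; whether that matters depends on how the function is used.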
You can use string literals (as suggested by Paul Stelian and others), or you can explicitly specify the first template argument when calling tokenize_string().
For example:
int count = tokenize_string<std::string>(tokens, str, ",", true, " ");
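Here is a self-contained sketch of both call styles. The function delimiter_positions() is a hypothetical stand-in with the same parameter pattern as tokenize_string() (StringT first, several StringT const & parameters). I am assuming that the string literals mentioned above refer to the C++14 s suffix from std::string_literals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in: StringT is the first template parameter and is
// used by two deduced parameters, as in tokenize_string().
template <class StringT, class ContainerT>
std::size_t delimiter_positions(ContainerT & positions,
                                StringT const & str,
                                StringT const & delimiters)
{
    for(auto pos = str.find_first_of(delimiters);
        pos != StringT::npos;
        pos = str.find_first_of(delimiters, pos + 1))
    {
        positions.push_back(pos);
    }
    return positions.size();
}

std::size_t explicit_argument_demo()
{
    using namespace std::string_literals;   // enables the s suffix (C++14)

    std::string const str("a,b;c");
    std::vector<std::size_t> p1, p2;

    // StringT is given explicitly, so it is no longer deduced and the
    // C string converts implicitly; ContainerT is still deduced.
    std::size_t const n1 = delimiter_positions<std::string>(p1, str, ",;");

    // ",;"s is already a std::string, so every deduction agrees.
    std::size_t const n2 = delimiter_positions(p2, str, ",;"s);

    assert(p1 == p2);
    return n1 + n2;
}
```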