
A string tokenizer in C++ that allows multiple separators

Is there a way to tokenize a string in C++ with multiple separators? In C# I would have done:

string[] tokens = "adsl, dkks; dk".Split(new [] { ",", " ", ";" }, StringSplitOptions.RemoveEmptyEntries);
Hao Wooi Lim asked Apr 16 '10 15:04



2 Answers

Use boost::tokenizer. It supports multiple separators.

In fact, you don't really even need boost::tokenizer. If all you want is a split, use boost::split. The documentation has an example: http://www.boost.org/doc/libs/1_42_0/doc/html/string_algo/usage.html#id1718906

frankc answered Sep 25 '22 05:09



Something like this will do:

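#include <cstddef>
#include <string>
#include <vector>

// Splits original_string on any character in delimiters and appends the
// non-empty tokens to *tokens. Does nothing if tokens is null.
void tokenize_string(const std::string &original_string, const std::string &delimiters, std::vector<std::string> *tokens)
{
        if (NULL == tokens) return;

        // Skip leading delimiters, then find the end of the first token.
        size_t pos_start = original_string.find_first_not_of(delimiters);
        size_t pos_end   = original_string.find_first_of(delimiters, pos_start);

        while (std::string::npos != pos_start)
        {
                // If pos_end is npos, substr takes the rest of the string.
                tokens->push_back(original_string.substr(pos_start, pos_end - pos_start));
                pos_start = original_string.find_first_not_of(delimiters, pos_end);
                pos_end   = original_string.find_first_of(delimiters, pos_start);
        }
}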
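
To tie this back to the question, here is a minimal sketch of the same find_first_not_of / find_first_of loop using only the standard library. The helper name split_any and its return-by-value signature are assumptions for illustration, not part of the answer above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (the name split_any is an assumption): returns the
// non-empty tokens of `input`, treating every character in `delimiters`
// as a separator.
std::vector<std::string> split_any(const std::string& input, const std::string& delimiters)
{
    std::vector<std::string> tokens;
    // Start of the first token: the first character that is not a delimiter.
    std::string::size_type start = input.find_first_not_of(delimiters);
    while (start != std::string::npos)
    {
        // End of the current token: the next delimiter, or npos at end of input.
        std::string::size_type end = input.find_first_of(delimiters, start);
        // When end is npos, substr clamps the length and takes the rest of the string.
        tokens.push_back(input.substr(start, end - start));
        // Skip the run of delimiters that follows the token.
        start = input.find_first_not_of(delimiters, end);
    }
    return tokens;
}
```

Calling split_any("adsl, dkks; dk", ", ;") gives the three tokens adsl, dkks and dk. Runs of delimiters, and delimiters at either end of the input, never produce empty tokens, which is roughly what StringSplitOptions.RemoveEmptyEntries does in the C# version.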
Dmitry answered Sep 26 '22 05:09
