
Consecutive separators are ignored by Boost.Tokenizer

Tags: c++, boost

I am using Boost.Tokenizer to split a string. It works fine for strings like "1,2,3", but when there are two or more consecutive separators, as in "1,,3,4", it returns "1", "3", "4".

Is there a way to make the tokenizer return an empty string "" instead of skipping it?

Dabiel Kabuto asked Dec 25 '22 12:12



1 Answer

Boost.Tokenizer's char_separator class takes an empty_tokens parameter that controls whether empty tokens are output or skipped. It defaults to boost::drop_empty_tokens, which matches the behavior of strtok(); passing boost::keep_empty_tokens makes the tokenizer output empty tokens instead.

For example, with the following program:

#include <iostream>
#include <string>
#include <boost/foreach.hpp>
#include <boost/tokenizer.hpp>

int main()
{
  std::string str = "1,,3,4";
  typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
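  boost::char_separator<char> sep(
      ",", // dropped delimiters: split here, not returned as tokens
      "",  // kept delimiters: none, so no delimiter is returned as a token
      boost::keep_empty_tokens); // empty token policy: emit "" for empty fields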

  BOOST_FOREACH(std::string token, tokenizer(str, sep))
  {
    std::cout << "<" << token << "> ";
  }
  std::cout << std::endl;
}

The output is:

<1> <> <3> <4> 
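
To make the difference between the two policies concrete, here is a minimal sketch that mimics them with only the standard library. The `split` helper and its `keep_empty` flag are hypothetical names made up for this illustration; they are not part of Boost, and this is not how char_separator is implemented. It only handles a single delimiter character.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost): splits `text` on one delimiter
// character. With keep_empty == true, empty fields before, between, or
// after delimiters are returned as "" (similar in spirit to
// boost::keep_empty_tokens); with keep_empty == false they are skipped
// (similar in spirit to boost::drop_empty_tokens).
// Note: in this sketch an empty input with keep_empty == true yields one
// empty field.
std::vector<std::string> split(const std::string& text, char delim,
                               bool keep_empty)
{
  std::vector<std::string> tokens;
  std::string::size_type start = 0;
  while (true)
  {
    // Position of the next delimiter, or npos if there is none left.
    const std::string::size_type end = text.find(delim, start);
    // A count of npos means "up to the end of the string".
    const std::string::size_type count =
        (end == std::string::npos) ? std::string::npos : end - start;
    const std::string field = text.substr(start, count);
    if (keep_empty || !field.empty())
    {
      tokens.push_back(field);
    }
    if (end == std::string::npos)
    {
      break; // last field has been handled
    }
    start = end + 1; // continue just past the delimiter
  }
  return tokens;
}
```

Under these assumptions, split("1,,3,4", ',', true) yields "1", "", "3", "4", while split("1,,3,4", ',', false) yields "1", "3", "4", which is the behavior described in the question.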
Tanner Sansbury answered Dec 29 '22 02:12
