
Parse comma-separated ints/int-ranges in C++

Given a string in C++ containing ranges and single numbers of the kind:

"2,3,4,7-9"

I want to parse it into a vector of the form:

2,3,4,7,8,9

If the numbers are separated by a - then I want to push all of the numbers in the range. Otherwise I want to push a single number.

I tried using this piece of code:

const char *NumX = "2,3,4-7";
std::vector<int> inputs;
std::istringstream in( NumX );
std::copy( std::istream_iterator<int>( in ), std::istream_iterator<int>(),
           std::back_inserter( inputs ) );

The problem was that it did not work for the ranges. It only took the numbers in the string, not all of the numbers in the range.

Efrat.shp asked Dec 18 '22 12:12


2 Answers

Your problem consists of two separate problems:

  1. splitting the string into multiple strings at ,
  2. adding either numbers or ranges of numbers to a vector when parsing each string

If you first split the whole string at a comma, you won't have to worry about splitting it at a hyphen at the same time. This is what you would call a Divide-and-Conquer approach.

Splitting at ,

This question should tell you how you can split the string at a comma.
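Since the linked question is not reproduced here, the following is a minimal sketch of one common way to do the split, using std::getline with ',' as the delimiter. The helper name split_at_comma is an assumption made for illustration and does not come from the linked question.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for this sketch): splits `input` at
// every comma and returns the pieces in their original order.
std::vector<std::string> split_at_comma(const std::string &input) {
    std::vector<std::string> parts;
    std::istringstream in(input);
    std::string part;
    // std::getline with a delimiter reads up to the next ',' and discards it.
    while (std::getline(in, part, ',')) {
        parts.push_back(part);
    }
    return parts;
}
```

Note that two consecutive commas produce an empty piece, which std::stoi would later reject, so real input may need an extra check.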

Parsing and Adding to std::vector<int>

Once you have split the string at the commas, you just need to turn ranges into individual numbers by calling this function for each piece:

#include <cstddef>
#include <string>
#include <vector>

// Appends the number or inclusive range described by str ("4" or "7-9") to out.
void push_range_or_number(const std::string &str, std::vector<int> &out) {
    std::size_t hyphen_index;
    // stoi stores the index of the first unparsed character in hyphen_index.
    int first = std::stoi(str, &hyphen_index);
    out.push_back(first);

    // If hyphen_index is equal to the length of the string,
    // there is no second number.
    // Otherwise, we parse the number after the hyphen here:
    if (hyphen_index != str.size()) {
        int second = std::stoi(str.substr(hyphen_index + 1));
        for (int i = first + 1; i <= second; ++i) {
            out.push_back(i);
        }
    }
}

Note that splitting at a hyphen is much simpler because we know there can be at most one hyphen in each piece (assuming non-negative numbers). std::string::substr is the easiest way of doing it in this case. Be aware that std::stoi throws std::out_of_range if the integer is too large to fit into an int, and std::invalid_argument if the piece does not start with a number.
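To show how the two steps fit together, here is a hedged sketch that does the comma split and the range expansion in one function. The name parse_int_ranges and the explicit check for '-' are assumptions added for this example. It also differs from the function above in that a reversed range such as "9-7" contributes nothing at all.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical wrapper (not part of the answer above): splits `input` at
// commas and expands each piece, which is either a single number ("4")
// or an inclusive range ("7-9").
// std::stoi throws std::invalid_argument or std::out_of_range on bad input.
std::vector<int> parse_int_ranges(const std::string &input) {
    std::vector<int> out;
    std::istringstream in(input);
    std::string piece;
    while (std::getline(in, piece, ',')) {
        std::size_t pos = 0;
        const int first = std::stoi(piece, &pos);
        int last = first;
        if (pos != piece.size()) {
            // Assumption: anything after the first number must be "-<number>".
            if (piece[pos] != '-') {
                throw std::invalid_argument("expected '-' in \"" + piece + "\"");
            }
            last = std::stoi(piece.substr(pos + 1));
        }
        for (int i = first; i <= last; ++i) {
            out.push_back(i);
        }
    }
    return out;
}
```

Calling parse_int_ranges("2,3,4,7-9") yields the vector 2, 3, 4, 7, 8, 9 from the question.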

Jan Schultke answered Dec 27 '22 10:12


All very nice solutions so far. Using modern C++ and regex, you can do an all-in-one solution with only very few lines of code.

How? First, we define a regex that either matches an integer OR an integer range. It will look like this

((\d+)-(\d+))|(\d+)

The pattern has two alternatives. The first is the range: some digits, followed by a hyphen and some more digits. The second is the plain integer: some digits. Every run of digits is put in its own capture group (parentheses), and the whole range is additionally wrapped in group 1. The hyphen has no group of its own.
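To make the group numbering concrete, the following sketch returns the text of every capture group for a single token. The helper name capture_groups is an assumption for illustration. It uses std::regex_match, so the whole token has to match the pattern.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text of groups 0..4 for `token`,
// or an empty vector if the token does not match the pattern at all.
std::vector<std::string> capture_groups(const std::string &token) {
    static const std::regex re{ R"(((\d+)-(\d+))|(\d+))" };
    std::smatch sm;
    std::vector<std::string> groups;
    if (std::regex_match(token, sm, re)) {
        // sm[0] is the whole match, sm[1]..sm[4] are the capture groups.
        // Groups that did not take part in the match yield an empty string.
        for (std::size_t i = 0; i < sm.size(); ++i) {
            groups.push_back(sm[i].str());
        }
    }
    return groups;
}
```

For "7-9" the groups are "7-9", "7-9", "7", "9" and an empty string. For "4" only group 0 and group 4 contain "4". This is why testing group 1 is enough to tell a range from a plain integer.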

Then we call std::regex_search in a loop, until all matches are found.

For each match, we check whether the range group (group 1) took part in the match. If it did, we add all values from the first to the second number (inclusive) to the resulting std::vector.

If we have just a plain integer, then we add only this value.

All this gives a very simple and easy to understand program:

#include <iostream>
#include <string>
#include <vector>
#include <regex>

const std::string test{ "2,3,4,7-9" };

const std::regex re{ R"(((\d+)-(\d+))|(\d+))" };
std::smatch sm{};

int main() {
    // Here we will store the resulting data
    std::vector<int> data{};

    // Search all occurrences of integers OR ranges
    for (std::string s{ test }; std::regex_search(s, sm, re); s = sm.suffix()) {

        // We found something. Was it a range?
        if (sm[1].matched) {
            // Yes, range: add all values within (inclusive) to the vector
            const int last{ std::stoi(sm[3]) };
            for (int i{ std::stoi(sm[2]) }; i <= last; ++i) data.push_back(i);
        }
        else {
            // No range, just a plain integer value. Add it to the vector
            data.push_back(std::stoi(sm[0]));
        }
    }
    // Show result
    for (const int i : data) std::cout << i << '\n';
}

If you should have more questions, I am happy to answer.


Language: C++17. Compiled and tested with MS Visual Studio 19 Community Edition.

Armin Montigny answered Dec 27 '22 11:12