
boost::program_options: parameters with a fixed and a variable token?

Is it possible to use parameters of this kind with boost::program_options?

program  -p1 123 -p2 234 -p3 345 -p12 678

i.e., is it possible to specify the parameter name with a first token (e.g. -p) followed by a number, dynamically?
I would like to avoid this:

program  -p 1 123 -p 2 234 -p 3 345 -p 12 678

Pietro asked Mar 21 '13 18:03


1 Answer

Boost.ProgramOptions does not provide direct support for this. Nevertheless, there are two general solutions that each have their trade-offs:

  • Wildcard options.
  • Custom parser.

Wildcard Options

If it is acceptable to use --p instead of -p, then a wildcard option can be used. This requires iterating through the variables_map during extraction, because Boost.ProgramOptions does not support receiving both the key and the value in an overloaded validate() function.

#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

#include <boost/algorithm/string.hpp>
#include <boost/foreach.hpp>
#include <boost/lexical_cast.hpp>
#include <boost/program_options.hpp>

typedef std::map<int, int> p_options_type;

/// @brief Extract options from the variables map whose keys have the
///        form <prefix>#*.
p_options_type get_p_options(
  const boost::program_options::variables_map& vm,
  const std::string& prefix)
{
  p_options_type p_options;

  const std::size_t prefix_size = prefix.size();

  boost::iterator_range<std::string::const_iterator> range;
  namespace po = boost::program_options;
  BOOST_FOREACH(const po::variables_map::value_type& pair, vm)
  {
    const std::string& key = pair.first;

    // Key is too small to contain prefix and a value, continue to next.
    if (key.size() < (1 + prefix.size())) continue;

    // Create range that partitions key into two parts.  Given a key
    // of "p12" the resulting partitions would be:
    //
    //     ,--------- key.begin           -., prefix = "p"
    //    / ,-------- range.begin         -:, post-prefix = "12"
    //   / /   ,----- key.end, range.end  -'
    //  |p|1|2|
    range = boost::make_iterator_range(key.begin() + prefix_size,
                                       key.end());

    // Iterate to next key if the key:
    // - does not start with prefix
    // - contains a non-digit after prefix
    if (!boost::starts_with(key, prefix) || 
        !boost::all(range, boost::is_digit()))
      continue;

    // Create pair and insert into map.
    p_options.insert(
      std::make_pair(
        boost::lexical_cast<int>(boost::copy_range<std::string>(range)),
        pair.second.as<int>())); 
  }
  return p_options;
}

int main(int ac, char* av[])
{
  namespace po = boost::program_options;
  po::options_description desc;
  desc.add_options()
    ("p*", po::value<int>())
    ;

  po::variables_map vm;
  store(po::command_line_parser(ac, av).options(desc).run(), vm);

  BOOST_FOREACH(const p_options_type::value_type& p, get_p_options(vm, "p"))
  {
    std::cout << "p" << p.first << "=" << p.second << std::endl;
  }
}

And its usage:

./a.out --p1 123 --p2 234 --p3=345 --p12=678
p1=123
p2=234
p3=345
p12=678

This approach requires iterating over the entire map to identify wildcard matches, which is O(n) in the number of stored options. Additionally, it requires a modification to the desired syntax: --p1 123 needs to be used instead of -p1 123. This limitation is the result of Boost.ProgramOptions's default parser behavior, where a single hyphen is expected to be followed by a single character.
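
If the -p1 spelling matters but the wildcard approach is otherwise sufficient, one possible workaround is to rewrite the arguments before they reach the parser. This is not part of the approach above, only a minimal sketch: the helper names is_short_numbered and normalize_args are hypothetical, and the code uses only the standard library.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost): true if the token has the form
// "-<prefix><digits>", optionally followed by "=<value>", e.g. "-p12" or
// "-p3=345".
bool is_short_numbered(const std::string& token, const std::string& prefix)
{
  const std::string lead = "-" + prefix;
  if (token.compare(0, lead.size(), lead) != 0) return false;

  std::size_t i = lead.size();
  while (i < token.size() &&
         std::isdigit(static_cast<unsigned char>(token[i])))
    ++i;

  // Require at least one digit, followed by the end of the token or '='.
  return i > lead.size() && (i == token.size() || token[i] == '=');
}

// Hypothetical helper: rewrite "-p1" as "--p1" and copy every other token
// unchanged, so that a "p*" wildcard option can match it.
std::vector<std::string> normalize_args(const std::vector<std::string>& args,
                                        const std::string& prefix)
{
  std::vector<std::string> out;
  out.reserve(args.size());
  for (std::size_t i = 0; i < args.size(); ++i)
  {
    if (is_short_numbered(args[i], prefix))
      out.push_back("-" + args[i]);
    else
      out.push_back(args[i]);
  }
  assert(out.size() == args.size()); // tokens are rewritten, never dropped
  return out;
}
```

The rewritten vector could then be passed to po::command_line_parser, which also accepts a std::vector<std::string> of arguments (without the program name). Note that a value token that happens to look like -p1 would be rewritten as well, so this only suits command lines where values cannot take that form.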


Custom Parser

The alternative approach is to add a custom parser to the command_line_parser. A custom parser will allow -p1 syntax, as well as other common forms, such as --p1 123 and -p1=123. There are a few behaviors that need to be handled:

  • A parser will receive a single token at a time. Thus, it will receive p1 and 123 on individual invocations. It is the parser's responsibility to pair p1 with 123.
  • Boost.ProgramOptions expects at least one parser to handle a token. Otherwise boost::program_options::unknown_option will be thrown.

To account for these behaviors, the custom parser will manage state and perform encoding/decoding:

  • When the parser receives p1, it extracts 1, storing state in the parser. Additionally, it encodes a no operation value for p.
  • When the parser receives 123, it encodes it alongside the stored state as value for p.
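
The state handling in the two bullets above can be sketched independently of Boost. The class below is a simplified, hypothetical stand-in (pairing_parser is not a Boost type): it hard-codes the prefix p, the delimiter ":" and the placeholder "noop", and accepts only the -p# form.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a stateful extra parser.
class pairing_parser
{
public:
  // Returns the (option name, value) pair to record for one token.
  // An empty pair means the token is not handled by this parser.
  std::pair<std::string, std::string> operator()(const std::string& token)
  {
    if (pending_key_.empty())
    {
      // Expect a key such as "-p12".
      if (!is_key(token))
        return std::make_pair(std::string(), std::string());

      assert(token.size() > 2);        // guaranteed by is_key()
      pending_key_ = token.substr(2);  // remember "12" for the next call
      return std::make_pair(std::string("p"), std::string("noop"));
    }

    // A key is pending, so this token is its value: encode "key:value".
    const std::string encoded = pending_key_ + ":" + token;
    pending_key_.clear();
    return std::make_pair(std::string("p"), encoded);
  }

private:
  // True for "-p" followed by one or more digits.
  static bool is_key(const std::string& token)
  {
    if (token.size() < 3 || token[0] != '-' || token[1] != 'p') return false;
    for (std::size_t i = 2; i < token.size(); ++i)
      if (!std::isdigit(static_cast<unsigned char>(token[i]))) return false;
    return true;
  }

  std::string pending_key_; // digits of the most recent key, if any
};
```

Calling the object with "-p1" and then "123" yields ("p", "noop") followed by ("p", "1:123"); an unrelated token such as "--other" yields an empty pair.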

Thus, if the parser receives -p1 and 123, 2 values are inserted into the variables_map for p: the no operation value and 1:123.

{ "p" : [ "no operation",
          "1:123" ] }

This encoding can be transparent to the user by providing a helper function to transform the encoded p vector into a map. The result of decoding would be:

{ 1 : 123 }
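
As a rough illustration of that decoding step, here is a standard-library-only sketch. The function name decode_p_values is hypothetical, the "noop" placeholder and ":" delimiter are assumed, and malformed entries are skipped rather than reported, which a real implementation would probably want to handle more strictly.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: turn {"noop", "1:123"} into {1: 123}.
std::map<int, int> decode_p_values(
  const std::vector<std::string>& encoded_values)
{
  std::map<int, int> result;
  for (std::size_t i = 0; i < encoded_values.size(); ++i)
  {
    const std::string& encoded = encoded_values[i];

    // Placeholder entries carry no data.
    if (encoded == "noop") continue;

    // Entries without a delimiter are not in "key:value" form; skip them.
    const std::string::size_type colon = encoded.find(':');
    if (colon == std::string::npos) continue;
    assert(colon + 1 <= encoded.size());

    // std::atoi performs no error checking; it is used here for brevity.
    const int key   = std::atoi(encoded.substr(0, colon).c_str());
    const int value = std::atoi(encoded.substr(colon + 1).c_str());

    // insert() keeps the first value seen for a repeated key.
    result.insert(std::make_pair(key, value));
  }
  return result;
}
```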

Here is the example code:

#include <iostream>
#include <map>
#include <string>
#include <utility> // std::pair, std::make_pair
#include <vector>

#include <boost/algorithm/string.hpp>
#include <boost/foreach.hpp>
#include <boost/lexical_cast.hpp>
#include <boost/program_options.hpp>

typedef std::map<int, int> p_options_type;

/// @brief Parser that provides the ability to parse "-p# #" options.
///
/// @note The keys and values are passed in separately to the parser.
///       Thus, the parser must be stateful.
class p_parser
{
public:

  explicit
  p_parser(const std::string& prefix)
    : prefix_(prefix),
      hyphen_prefix_("-" + prefix)
  {}

  std::pair<std::string, std::string> operator()(const std::string& token)
  {
    // To support "-p#=#" syntax, split the token.
    std::vector<std::string> tokens(2);
    boost::split(tokens, token, boost::is_any_of("="));

    // If the split resulted in two tokens, then key and value were
    // provided as a single token.
    if (tokens.size() == 2)
      parse(tokens.front()); // Parse key.

    // Parse remaining token.
    // - If tokens.size() == 2, then the token is the value.
    // - Otherwise, it is a key.
    return parse(tokens.back());
  }

  /// @brief Decode a single encoded value.
  static p_options_type::value_type decode(const std::string& encoded)
  {
    // Decode.
    std::vector<std::string> decoded(field_count_);
    boost::split(decoded, encoded, boost::is_any_of(delimiter_));

    // If size is not equal to the field count, then encoding failed.
    if (field_count_ != decoded.size())
      throw boost::program_options::invalid_option_value(encoded);

    // Transform.
    return std::make_pair(boost::lexical_cast<int>(decoded[0]),
                          boost::lexical_cast<int>(decoded[1]));
  }

  /// @brief Decode multiple encoded values.
  static p_options_type decode(const std::vector<std::string>& encoded_values)
  {
    p_options_type p_options;
    BOOST_FOREACH(const std::string& encoded, encoded_values)
    {
      // If value is a no-op, then continue to next.
      if (boost::equals(encoded, noop_)) continue;
      p_options.insert(decode(encoded));
    }
    return p_options;
  }

private:

  std::pair<std::string, std::string> parse(const std::string& token)
  {
    return key_.empty() ? parse_key(token)
                        : parse_value(token);
  }

  /// @brief Parse key portion of option: "p#"
  std::pair<std::string, std::string> parse_key(const std::string& key)
  {
    // Search for the prefix to obtain a range that partitions the key into
    // three parts.  Given --p12, the partitions are:
    //
    //      ,--------- key.begin    -., pre-prefix   = "--"
    //     /   ,------ result.begin -:, prefix       = "p"
    //    /   / ,----- result.end   -:, post-prefix  = "12"
    //   /   / /   ,-- key.end      -'
    //  |-|-|p|1|2|
    //
    boost::iterator_range<std::string::const_iterator> result =
      boost::find_first(key, prefix_);

    // Do not handle the key if:
    // - Key end is the same as the result end.  This occurs when either
    //   the prefix is not found or nothing follows it (--a or --p)
    // - The distance from start to prefix start is greater than 2 (---p)
    // - Non-hyphens exist before the prefix (a--p)
    // - Non-numeric characters follow the prefix.
    if (result.end() == key.end() ||
        distance(key.begin(), result.begin()) > 2 ||
        !boost::all(
          boost::make_iterator_range(key.begin(), result.begin()),
          boost::is_any_of("-")) ||
        !boost::all(
          boost::make_iterator_range(result.end(), key.end()),
          boost::is_digit()))
    {
      // A different parser will handle this token.
      return make_pair(std::string(), std::string());
    }

    // Otherwise, key contains expected format.
    key_.assign(result.end(), key.end());

    // Return a non-empty pair, otherwise Boost.ProgramOptions will
    // treat the next token as the complete value.  The noop entries
    // will be stripped in the decoding process.
    return make_pair(prefix_, noop_);
  }

  /// @brief Parse value portion of option: "#"
  std::pair<std::string, std::string> parse_value(const std::string& value)
  {
    std::pair<std::string, std::string> encoded =
      make_pair(prefix_, key_ + delimiter_ + value);
    key_.clear();
    return encoded;
  }

private:
  static const int field_count_ = 2;
  static const std::string delimiter_;
  static const std::string noop_;
private:
  const std::string prefix_;
  const std::string hyphen_prefix_;
  std::string key_;
};

const std::string p_parser::delimiter_ = ":";
const std::string p_parser::noop_      = "noop";

/// @brief Extract and decode options from variable map.
p_options_type get_p_options(
  const boost::program_options::variables_map& vm,
  const std::string& prefix)
{
  return p_parser::decode(vm[prefix].as<std::vector<std::string> >());
}

int main(int ac, char* av[])
{
  const char* p_prefix = "p";
  namespace po = boost::program_options;

  // Define options.
  po::options_description desc;
  desc.add_options()
    (p_prefix, po::value<std::vector<std::string> >()->multitoken())
    ;

  po::variables_map vm;
  store(po::command_line_parser(ac, av).options(desc)
          .extra_parser(p_parser(p_prefix)).run()
       , vm);

  // Extract -p options. 
  if (vm.count(p_prefix))
  {
    // Print -p options.
    BOOST_FOREACH(const p_options_type::value_type& p,
                  get_p_options(vm, p_prefix))
    {
      std::cout << "p" << p.first << "=" << p.second << std::endl;
    }
  }
}

And its usage:

./a.out -p1 123 --p2 234 -p3=345 --p12=678
p1=123
p2=234
p3=345
p12=678

Aside from being a larger solution, one drawback is the requirement to go through the decoding process to obtain the desired values. One cannot simply iterate over the results of vm["p"] in a meaningful way.

Tanner Sansbury answered Oct 27 '22 17:10