Raise compile-time error if a string has whitespace

I have a base class that is intended to be inherited by other users of the code I'm writing, and one of the abstract functions returns a name for the object. Due to the nature of the project that name cannot contain whitespace.

class MyBaseClass {

  public:

    // Return a name for this object. This should not include whitespace.
    virtual const char* Name() = 0;

};

Is there a way to check at compile-time if the result of the Name() function contains whitespace? I know compile-time operations are possible with constexpr functions but I'm not sure of the right way to signal to code users that their function returns a naughty string.

I'm also unclear on how to get a constexpr function to actually be executed by the compiler to perform such a check (if constexpr is even the way to go with this).

Michael Hoffmann asked Jun 10 '21 17:06


1 Answer

I think this is possible in C++20.

Here is my attempt:

#include <string_view>
#include <algorithm>
#include <stdexcept>

constexpr bool is_whitespace(char c) {
    // List the whitespace characters to reject here. This example uses the
    // characters documented by https://en.cppreference.com/w/cpp/string/wide/iswspace
    constexpr char matches[] = { ' ', '\n', '\r', '\f', '\v', '\t' };
    return std::any_of(std::begin(matches), std::end(matches), [c](char c0) { return c == c0; });
}

struct no_ws {
    consteval no_ws(const char* str) : data(str) {
        std::string_view sv(str);
        if (std::any_of(sv.begin(), sv.end(), is_whitespace)) {
            throw std::logic_error("string cannot contain whitespace");
        }
    }
    const char* data;
};

class MyBaseClass {
  public:
    // Return a name for this object. This should not include whitespace.
    constexpr const char* Name() { return internal_name().data; }
  private:
    constexpr virtual no_ws internal_name() = 0;
};

class Dog : public MyBaseClass {
    constexpr no_ws internal_name() override {
        return "Dog";
    }
};

class Cat : public MyBaseClass {
    constexpr no_ws internal_name() override {
        return "Cat";
    }
};

class BadCat : public MyBaseClass {
    constexpr no_ws internal_name() override {
        return "Bad cat";
    }
};

There are several ideas at play here:

  • Let's use the type system as both documentation and a constraint. To that end, create a class (no_ws in the above example) that represents a string without whitespace.

  • For the type to enforce the constraints at compile-time, it must evaluate its constructor at compile time. So let's make the constructor consteval.

  • To ensure that derived classes don't break the contract, modify the virtual method to return no_ws.

  • If you want to keep the interface (i.e., returning const char*), make the virtual method private and call it from a public non-virtual method. This is known as the non-virtual interface (NVI) idiom.

Of course, this only checks a finite set of whitespace characters, and the check is locale-independent. I think it would be very tricky to handle locales at compile time, so a better approach (engineering-wise) may be to explicitly specify the set of ASCII characters allowed in names (a whitelist instead of a blacklist).
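A whitelist check could look roughly like the sketch below. The allowed set (ASCII letters, digits, and underscore) is an assumption chosen for illustration, and is_allowed_name_char and is_valid_name are hypothetical names; adjust both to whatever the project actually permits.

```cpp
#include <string_view>

// Assumption: a name may contain only ASCII letters, digits, and '_'.
// The range comparisons rely on an ASCII-compatible execution character set.
constexpr bool is_allowed_name_char(char c) {
    return (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') ||
           c == '_';
}

// True if every character of sv is in the allowed set.
// Note that an empty string passes this check.
constexpr bool is_valid_name(std::string_view sv) {
    for (char c : sv) {
        if (!is_allowed_name_char(c)) {
            return false;
        }
    }
    return true;
}

// Evaluated by the compiler because static_assert needs a constant expression.
static_assert(is_valid_name("Dog"));
static_assert(!is_valid_name("Bad cat"));
```

To use it with the approach above, the condition in the no_ws constructor would become !is_valid_name(sv), and the type would probably deserve a name that reflects the stricter rule.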

The above example does not compile, because "Bad cat" contains whitespace. Commenting out the BadCat class allows the code to compile.

Live demo on Compiler Explorer

ph3rin answered Sep 20 '22 23:09