C++ compile-time substring

Tags: c++, c++17, c++14

I have a very large code base that uses __FILE__ extensively for logging. However, __FILE__ expands to the full path, which (1) is not needed and (2) might cause security violations.

I'm trying to write a compile-time substring expression, and ended up with this solution:

using cstr = const char*;  // assumed: the snippet does not show how cstr is defined

static constexpr cstr PastLastSlash(cstr str, cstr last_slash)
{
    return *str == '\0' ? last_slash : *str == '/' ? PastLastSlash(str + 1, str + 1) : PastLastSlash(str + 1, last_slash);
}

static constexpr cstr PastLastSlash(cstr str)
{
    return PastLastSlash(str, str);
}

// usage
PastLastSlash(__FILE__);

This works well: I've checked the assembly, and the string is trimmed at compile time, so only the file name is present in the binary.
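One way to double-check the compile-time behaviour without reading assembly is a static_assert, which only compiles if the call is a constant expression. This is a minimal sketch, assuming `cstr` is an alias for `const char*`; `SameText` is a hypothetical helper introduced here only for the comparison.

```cpp
using cstr = const char*;  // assumed definition of cstr

constexpr cstr PastLastSlash(cstr str, cstr last_slash)
{
    return *str == '\0' ? last_slash
         : *str == '/'  ? PastLastSlash(str + 1, str + 1)
                        : PastLastSlash(str + 1, last_slash);
}

constexpr cstr PastLastSlash(cstr str)
{
    return PastLastSlash(str, str);
}

// Hypothetical helper: compares two C strings during constant evaluation.
constexpr bool SameText(cstr a, cstr b)
{
    return *a != *b ? false : *a == '\0' ? true : SameText(a + 1, b + 1);
}

// These only compile if PastLastSlash can be evaluated at compile time.
static_assert(SameText(PastLastSlash("/home/user/src/file.cpp"), "file.cpp"), "path is trimmed");
static_assert(SameText(PastLastSlash("file.cpp"), "file.cpp"), "no slash: unchanged");
```

The check only shows that the evaluation can happen at compile time. It says nothing about whether the full literal is still emitted into the binary.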

However, this notation is too verbose. I would like to wrap it in a macro, but my attempts failed. The example proposed in the link above

#define __SHORT_FILE__ ({constexpr cstr sf__ {past_last_slash(__FILE__)}; sf__;})

doesn't work with the MSVC compiler (I'm using MSVC 2017). Is there any other way to do this using C++17?

UPD1: clang trims the string when the function is used: https://godbolt.org/z/tAU4j7

UPD2: it looks like it's possible to trim at compile time using functions, but the full string will still be present in the binary.

yudjin asked Jun 06 '19 06:06


3 Answers

The idea is to create a truncated array of characters using only compile-time features. Generating the data array through a variadic template with a pack of chars forces the compiler to generate the data with no direct relation to the string literal that was passed in. This way the compiler cannot reuse the input string literal, which matters especially when the string is long.

Godbolt with clang: https://godbolt.org/z/WdKNjB.

Godbolt with msvc: https://godbolt.org/z/auMEIH.

The only problem is the compiler's template instantiation depth limit, because the sequence builder below recurses once per kept character.

First we define an int variadic template to store a sequence of indexes:

template <int... I>
struct Seq {};

Pushing an int onto the front of a Seq:

template <int V, typename T>
struct Push;

template <int V, int... I>
struct Push<V, Seq<I...>>
{
    using type = Seq<V, I...>;
};

Creating the sequence:

template <int From, int To>
struct MakeSeqImpl;

template <int To>
struct MakeSeqImpl<To, To>
{
    using type = Seq<To>;
};

template <int From, int To>
using MakeSeq = typename MakeSeqImpl<From, To>::type;

template <int From, int To>
struct MakeSeqImpl : Push<From, MakeSeq<From + 1, To>> {};
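The range built by MakeSeq is inclusive on both ends, which can be confirmed with a static_assert. The sketch below also shows one possible way to get the same result from the standard std::make_integer_sequence (C++14), which avoids hand-written recursion. `OffsetSeq` and `MakeSeqStd` are hypothetical names, not part of the answer's code.

```cpp
#include <type_traits>
#include <utility>

template <int... I>
struct Seq {};

template <int V, typename T>
struct Push;

template <int V, int... I>
struct Push<V, Seq<I...>> { using type = Seq<V, I...>; };

template <int From, int To>
struct MakeSeqImpl;

template <int To>
struct MakeSeqImpl<To, To> { using type = Seq<To>; };

template <int From, int To>
using MakeSeq = typename MakeSeqImpl<From, To>::type;

template <int From, int To>
struct MakeSeqImpl : Push<From, MakeSeq<From + 1, To>> {};

// Both From and To are part of the result.
static_assert(std::is_same<MakeSeq<3, 7>, Seq<3, 4, 5, 6, 7>>::value, "inclusive range");

// Hypothetical alternative: shift the standard 0..N-1 sequence by From.
template <int From, typename Idx>
struct OffsetSeq;

template <int From, int... I>
struct OffsetSeq<From, std::integer_sequence<int, I...>> {
    using type = Seq<(From + I)...>;
};

template <int From, int To>
using MakeSeqStd =
    typename OffsetSeq<From, std::make_integer_sequence<int, To - From + 1>>::type;

static_assert(std::is_same<MakeSeqStd<3, 7>, MakeSeq<3, 7>>::value, "same sequence");
```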

Now we can make a sequence of compile-time ints, meaning that MakeSeq<3,7> == Seq<3,4,5,6,7>. We still need something to store the selected characters in an array, but in a compile-time representation, which is a variadic template parameter pack of characters:

template<char... CHARS>
struct Chars {
    static constexpr const char value[] = {CHARS...};
};
template<char... CHARS>
constexpr const char Chars<CHARS...>::value[];
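A small sketch of what Chars gives us: each distinct pack of characters becomes its own type with its own array. The out-of-class definition of value is what C++14 needs when the array is odr-used (for example, when a pointer to it is returned); since C++17, static constexpr data members are implicitly inline, so that line is redundant there. `Ok` is a hypothetical alias used only for this check.

```cpp
template <char... CHARS>
struct Chars {
    static constexpr const char value[] = {CHARS...};
};
template <char... CHARS>
constexpr const char Chars<CHARS...>::value[];

// Hypothetical example type: the terminator must be part of the pack,
// because nothing appends it automatically.
using Ok = Chars<'o', 'k', '\0'>;

static_assert(sizeof(Ok::value) == 3, "two characters plus terminator");
static_assert(Ok::value[0] == 'o' && Ok::value[1] == 'k' && Ok::value[2] == '\0', "contents");
```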

Next we need something to extract the selected characters into a Chars type:

template<typename WRAPPER, typename IDXS>
struct LiteralToVariadicCharsImpl;

template<typename WRAPPER, int... IDXS>
struct LiteralToVariadicCharsImpl<WRAPPER, Seq<IDXS...> > {
    using type = Chars<WRAPPER::get()[IDXS]...>;
};

template<typename WRAPPER, typename SEQ>
struct LiteralToVariadicChars {
    using type = typename LiteralToVariadicCharsImpl<WRAPPER, SEQ> :: type;
};

WRAPPER is a type that contains our string literal.
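To make the role of WRAPPER concrete, here is a minimal sketch with a hand-written wrapper and a hand-written index list. `HelloWrapper` and `Tail` are hypothetical names; the only requirement taken from the code above is a static constexpr get() that returns the literal.

```cpp
#include <type_traits>

template <int... I>
struct Seq {};

template <char... CHARS>
struct Chars {
    static constexpr const char value[] = {CHARS...};
};
template <char... CHARS>
constexpr const char Chars<CHARS...>::value[];

template <typename WRAPPER, typename IDXS>
struct LiteralToVariadicCharsImpl;

template <typename WRAPPER, int... IDXS>
struct LiteralToVariadicCharsImpl<WRAPPER, Seq<IDXS...>> {
    using type = Chars<WRAPPER::get()[IDXS]...>;
};

template <typename WRAPPER, typename SEQ>
struct LiteralToVariadicChars {
    using type = typename LiteralToVariadicCharsImpl<WRAPPER, SEQ>::type;
};

// Hypothetical wrapper: a type whose static get() hands out the literal.
struct HelloWrapper {
    static constexpr const char* get() { return "hello"; }
};

// Indexes 3..5 of "hello" select 'l', 'o' and the terminating '\0'.
using Tail = LiteralToVariadicChars<HelloWrapper, Seq<3, 4, 5>>::type;

static_assert(std::is_same<Tail, Chars<'l', 'o', '\0'>>::value,
              "only the selected characters are part of the type");
```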

Almost done. The missing part is finding the last slash. We can use a modified version of the code from the question, but this time it returns an offset instead of a pointer:

static constexpr int PastLastOffset(int last_offset, int cur, const char * const str)
{
    if (*str == '\0') return last_offset;
    if (*str == '/') return PastLastOffset(cur + 1, cur + 1, str + 1);
    return PastLastOffset(last_offset, cur + 1, str + 1);
}

The last utility gets the string length:

constexpr int StrLen(const char * str) {
    if (*str == '\0') return 0;
    return StrLen(str + 1) + 1;
}
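The two helpers can be checked in isolation with static_assert. This sketch repeats them unchanged (without the `static` keyword) and uses a made-up path; it needs C++14 or later because the functions contain several statements.

```cpp
constexpr int PastLastOffset(int last_offset, int cur, const char* const str)
{
    if (*str == '\0') return last_offset;
    if (*str == '/') return PastLastOffset(cur + 1, cur + 1, str + 1);
    return PastLastOffset(last_offset, cur + 1, str + 1);
}

constexpr int StrLen(const char* str)
{
    if (*str == '\0') return 0;
    return StrLen(str + 1) + 1;
}

// "/home/user/" is 11 characters long, so the file name starts at index 11.
static_assert(PastLastOffset(0, 0, "/home/user/file.cpp") == 11, "offset past the last slash");
// Index 19 is the terminating '\0', so the range 11..19 includes the terminator.
static_assert(StrLen("/home/user/file.cpp") == 19, "length without terminator");
// Without any slash the whole string is kept.
static_assert(PastLastOffset(0, 0, "file.cpp") == 0, "no slash");
```

Because the index range is inclusive and ends at StrLen, the terminating '\0' ends up in the generated array as well.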

Combining everything together using a define:

#define COMPILE_TIME_PAST_LAST_SLASH(STR)                                   \
    [](){                                                                   \
        struct Wrapper {                                                    \
            constexpr static const char * get() { return STR; }             \
        };                                                                  \
        using Seq = MakeSeq<PastLastOffset(0, 0, Wrapper::get()), StrLen(Wrapper::get())>; \
        return LiteralToVariadicChars<Wrapper, Seq>::type::value; \
    }()

The lambda gives the macro a nice, value-like feel at the call site. It also creates a scope in which to define the Wrapper structure. Generating this structure with the string literal inserted by the macro binds the string literal to a type.
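For reference, the pieces can be assembled into one translation unit roughly as follows. This is a sketch that repeats the definitions above in condensed form; `DemoFileName` and `CheckDemoFileName` are hypothetical call sites, and the path is made up.

```cpp
#include <cassert>
#include <cstring>

template <int... I> struct Seq {};

template <int V, typename T> struct Push;
template <int V, int... I>
struct Push<V, Seq<I...>> { using type = Seq<V, I...>; };

template <int From, int To> struct MakeSeqImpl;
template <int To> struct MakeSeqImpl<To, To> { using type = Seq<To>; };
template <int From, int To>
using MakeSeq = typename MakeSeqImpl<From, To>::type;
template <int From, int To>
struct MakeSeqImpl : Push<From, MakeSeq<From + 1, To>> {};

template <char... CHARS>
struct Chars { static constexpr const char value[] = {CHARS...}; };
template <char... CHARS>
constexpr const char Chars<CHARS...>::value[];

template <typename WRAPPER, typename IDXS> struct LiteralToVariadicCharsImpl;
template <typename WRAPPER, int... IDXS>
struct LiteralToVariadicCharsImpl<WRAPPER, Seq<IDXS...>> {
    using type = Chars<WRAPPER::get()[IDXS]...>;
};
template <typename WRAPPER, typename SEQ>
struct LiteralToVariadicChars {
    using type = typename LiteralToVariadicCharsImpl<WRAPPER, SEQ>::type;
};

constexpr int PastLastOffset(int last_offset, int cur, const char* const str)
{
    if (*str == '\0') return last_offset;
    if (*str == '/') return PastLastOffset(cur + 1, cur + 1, str + 1);
    return PastLastOffset(last_offset, cur + 1, str + 1);
}

constexpr int StrLen(const char* str)
{
    if (*str == '\0') return 0;
    return StrLen(str + 1) + 1;
}

#define COMPILE_TIME_PAST_LAST_SLASH(STR)                                   \
    [](){                                                                   \
        struct Wrapper {                                                    \
            constexpr static const char * get() { return STR; }             \
        };                                                                  \
        using Seq = MakeSeq<PastLastOffset(0, 0, Wrapper::get()),           \
                            StrLen(Wrapper::get())>;                        \
        return LiteralToVariadicChars<Wrapper, Seq>::type::value;           \
    }()

// Hypothetical call site: the returned pointer refers to Chars<...>::value,
// an array that holds only the characters after the last slash.
inline const char* DemoFileName()
{
    return COMPILE_TIME_PAST_LAST_SLASH("/home/user/src/file.cpp");
}

// Hypothetical runtime check of the result.
inline void CheckDemoFileName()
{
    assert(std::strcmp(DemoFileName(), "file.cpp") == 0);
}
```

In real use the argument would be __FILE__ rather than a fixed literal. Each expansion defines its own Wrapper type, so the macro can be used many times in one file.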

Honestly, I would not use this kind of code in production. It is very hard on compilers.

For both security and memory usage, I would recommend building in Docker with custom, short paths.

Grzegorz Terlikowski answered Nov 06 '22 13:11


You can use std::string_view:

#include <string_view>

constexpr auto filename(std::string_view path)
{
    return path.substr(path.find_last_of('/') + 1);
}

Usage:

static_assert(filename("/home/user/src/project/src/file.cpp") == "file.cpp");
static_assert(filename("./file.cpp") == "file.cpp");
static_assert(filename("file.cpp") == "file.cpp");

See it compile (godbolt.org).

For Windows:

constexpr auto filename(std::wstring_view path)
{ 
    return path.substr(path.find_last_of(L'\\') + 1);
}
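
Since __FILE__ is a narrow string literal and a path may use either separator, one possible variation is a single narrow overload that accepts both. This is a sketch under that assumption; `Basename` is a hypothetical name, and it needs C++17 for std::string_view.

```cpp
#include <string_view>

// Hypothetical helper: keeps everything after the last '/' or '\\'.
// If no separator is found, find_last_of returns npos, and npos + 1
// wraps around to 0, so the whole input is returned.
constexpr std::string_view Basename(std::string_view path)
{
    return path.substr(path.find_last_of("/\\") + 1);
}

static_assert(Basename("/home/user/src/file.cpp") == "file.cpp");
static_assert(Basename("C:\\src\\project\\file.cpp") == "file.cpp");
static_assert(Basename("file.cpp") == "file.cpp");
```

The returned view runs to the end of the input, so when the input is a string literal its data() is still null-terminated. As UPD2 in the question points out, this alone may not keep the full path out of the binary.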

康桓瑋 answered Nov 06 '22 14:11


With C++17, you can do the following (https://godbolt.org/z/68PKcsPzs):

#include <cstdio>
#include <array>

namespace details {
template <const char *S, size_t Start = 0, char... C>
struct PastLastSlash {
    constexpr auto operator()() {
        if constexpr (S[Start] == '\0') {
            return std::array{C..., '\0'};
        } else if constexpr (S[Start] == '/') {
            return PastLastSlash<S, Start + 1>()();
        } else {
            return PastLastSlash<S, Start + 1, C..., (S)[Start]>()();
        }
    }
};
}

template <const char *S>
struct PastLastSlash {
    static constexpr auto a = details::PastLastSlash<S>()();
    static constexpr const char * value{a.data()};
};


int main() {
    static constexpr char f[] = __FILE__;
    puts(PastLastSlash<f>::value);
    return 0;
}
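
To see what the generated array contains, the same templates can be instantiated with a fixed path and inspected with static_assert. This is a sketch that repeats the C++17 code above; `kDemoPath` is a hypothetical stand-in for the `f` array that holds __FILE__.

```cpp
#include <array>
#include <cstddef>

namespace details {
template <const char *S, std::size_t Start = 0, char... C>
struct PastLastSlash {
    constexpr auto operator()() {
        if constexpr (S[Start] == '\0') {
            // End of input: materialize the collected characters.
            return std::array{C..., '\0'};
        } else if constexpr (S[Start] == '/') {
            // Separator: drop everything collected so far.
            return PastLastSlash<S, Start + 1>()();
        } else {
            // Ordinary character: append it to the pack.
            return PastLastSlash<S, Start + 1, C..., (S)[Start]>()();
        }
    }
};
}  // namespace details

template <const char *S>
struct PastLastSlash {
    static constexpr auto a = details::PastLastSlash<S>()();
    static constexpr const char * value{a.data()};
};

// Hypothetical fixed path used instead of __FILE__ so the result is known.
static constexpr char kDemoPath[] = "/home/user/src/file.cpp";

// "file.cpp" has 8 characters; the array also stores the terminator.
static_assert(PastLastSlash<kDemoPath>::a.size() == 9, "name plus terminator");
static_assert(PastLastSlash<kDemoPath>::a[0] == 'f', "starts after the last slash");
static_assert(PastLastSlash<kDemoPath>::a[8] == '\0', "null-terminated");
```

Each character costs one template instantiation, so very long paths may run into the compiler's instantiation depth limit here as well.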

With C++14, it's a bit more complicated because of the more limited constexpr (https://godbolt.org/z/bzGec5GMv):

#include <cstdio>
#include <array>

namespace details {
// Generic form: just add the character to the list
template <const char *S, char ch, size_t Start, char... C>
struct PastLastSlash {
    constexpr auto operator()() {
        return PastLastSlash<S, S[Start], Start + 1, C..., ch>()();
    }
};

// Found a '/', reset the character list
template <const char *S, size_t Start, char... C>
struct PastLastSlash<S, '/', Start, C...> {
    constexpr auto operator()() {
        return PastLastSlash<S, S[Start], Start + 1>()();
    }
};

// Found the null-terminator, ends the search
template <const char *S, size_t Start, char... C>
struct PastLastSlash<S, '\0', Start, C...> {
    constexpr auto operator()() {
        return std::array<char, sizeof...(C)+1>{C..., '\0'};
    }
};
}

template <const char *S>
struct PastLastSlash {
    const char * operator()() {
        static auto a = details::PastLastSlash<S, S[0], 0>()();
        return a.data();
    }
};


static constexpr char f[] = __FILE__;
int main() {
    puts(PastLastSlash<f>{}());
    return 0;
}

With C++20, it should be possible to pass __FILE__ directly to the template instead of needing those static constexpr variables.

Nahor answered Nov 06 '22 15:11