
Implementation of string_view formatted stream output

While implementing C++1z's std::basic_string_view to use it on older compilers, I encountered a problem with the stream output operator overload for it. Basically, it has to output the contents referenced by the string_view without relying on a null terminator being present (a string_view is not guaranteed to be null-terminated).

Normally, writing overloads for operator<< is quite easy since you can rely on already present overloads, and thus do not need to use sentry objects as mentioned in this question on SO.

But in this case, there is no predefined overload for operator<< taking a character pointer and a length (obviously). Because of this I create a temporary std::string instance in my current implementation:

template< typename TChar, typename TTraits >
auto operator<<(::std::basic_ostream<TChar, TTraits>& p_os, basic_string_view<TChar, TTraits> p_v)
    -> ::std::basic_ostream<TChar, TTraits>&
{
    p_os << p_v.to_string(); // to_string() returns a temporary ::std::basic_string.
    return p_os;
}

This works, but I really dislike the fact that I have to create a temporary std::string instance, because that entails redundantly copying the data and potential usage of dynamic memory. This, in my opinion at least, defeats the purpose of using a lightweight reference type.

So my question is:

What is the best way to implement correct formatted output for my string_view without the overhead?


While researching, I found that LLVM does it like this: (found here)

// [string.view.io]
template<class _CharT, class _Traits>
basic_ostream<_CharT, _Traits>&
operator<<(basic_ostream<_CharT, _Traits>& __os, basic_string_view<_CharT, _Traits> __sv)
{
    return _VSTD::__put_character_sequence(__os, __sv.data(), __sv.size());
}

The implementation of __put_character_sequence resides in this file, but it makes heavy use of internal functions to do the formatting. Do I need to reimplement all formatting by myself?

asked Sep 23 '16 05:09 by nshct


1 Answer

As far as I can see, you'll have to handle this yourself.

Fortunately, the formatting you need to do for a string-like item is fairly minimal--mostly inserting padding before or after the string if needed.

  • To figure out if padding is needed, you'll need to retrieve the stream's current field width using ios_base::width().
  • To figure out whether to insert that before or after you write out the string, you'll need to retrieve the adjustment flags with ios_base::flags() and mask them with ios_base::adjustfield. The padding goes after the string only if the result is left; in every other case (including the default, where neither left nor right is set) it goes before.
  • To figure out what to insert as the padding, you can call basic_ios::fill().
  • Finally, no truncation is needed: a string longer than the current field width is written out in full. The fixed flag only affects floating-point output, so it doesn't need to be checked.

So (with an ultra-simplified implementation of string_view), code might look something like this:

#include <cstddef>
#include <iostream>
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

class string_view { 
    char const *data;
    std::size_t len;
public:
    string_view(char const *data, std::size_t len) : data(data), len(len) {}

    friend std::ostream &operator<<(std::ostream &os, string_view const &sv) { 
        std::ostream::sentry s{ os };
        if (s) {
            auto fill = os.fill();
            // width() is signed; treat a non-positive width as "no padding".
            auto width = os.width() > 0 ? static_cast<std::size_t>(os.width()) : std::size_t{0};
            // Pad after the string only for std::left; otherwise pad before.
            bool left = (os.flags() & std::ios::adjustfield) == std::ios::left;

            auto pad = [&](std::size_t n) { while (n--) os.put(fill); };

            auto padding_len = sv.len < width ? width - sv.len : std::size_t{0};
            if (!left) pad(padding_len);
            os.write(sv.data, sv.len);
            if (left) pad(padding_len);
        }
        // Formatted output resets the field width after each insertion.
        os.width(0);
        return os;
    }
};

#ifdef TEST   
void check(std::stringstream &a, std::stringstream &b) {
    static int i;

    ++i;
    if (a.str() != b.str()) {
        std::cout << "Difference in test:" << i << "\n";
        std::cout << "\"" << a.str() << "\"\n";
        std::cout << "\"" << b.str() << "\"\n";
    }
    // Clear both buffers so each test compares only its own output.
    a.str("");
    b.str("");
}

int main() { 
    char string[] = "Now is the time for every good man to come to the aid of Jerry.";

    std::stringstream test1;
    std::stringstream test2;

    test1 << string_view(string, 3);
    test2 << std::string(string, 3);
    check(test1, test2);

    test1 << string_view(string + 4, 2);
    test2 << std::string(string + 4, 2);
    check(test1, test2);

    test1 << std::setw(10) << std::left << string_view(string, 6);
    test2 << std::setw(10) << std::left << std::string(string, 6);
    check(test1, test2);

    test1 << std::setw(10) << std::right << string_view(string, 6);
    test2 << std::setw(10) << std::right << std::string(string, 6);
    check(test1, test2);

    test1 << std::setw(10) << std::right << string_view(string, sizeof(string));
    test2 << std::setw(10) << std::right << std::string(string, sizeof(string));
    check(test1, test2);

    test1 << std::setw(10) << std::right << std::fixed << string_view(string, sizeof(string));
    test2 << std::setw(10) << std::right << std::fixed << std::string(string, sizeof(string));
    check(test1, test2);
}
#endif

Oh--one more detail. Since we're only writing to the stream through put() and write(), not directly to the underlying buffer, I think we probably don't actually need to create the sentry object in this case: both are unformatted output functions that construct a sentry of their own. As shown, creating and using it is pretty trivial, but it would probably be at least some tiny bit faster with it removed.

answered Nov 05 '22 15:11 by Jerry Coffin