I have read the latest draft, where lazy_split_view is added.
Later on, I realized that split_view was renamed to lazy_split_view, and that split_view itself was redesigned.
libstdc++ also recently implemented this; it is available on GCC trunk: https://godbolt.org/z/9qG5T9n5h
I have a simple, naive program here that uses the two views, but I can't see any difference between them:
#include <iostream>
#include <ranges>
#include <string>

int main() {
    std::string str { "one two three four" };

    // New split_view (P2210R2)
    for (auto word : str | std::views::split(' ')) {
        for (char ch : word)
            std::cout << ch;
        std::cout << '.';
    }
    std::cout << '\n';

    // lazy_split_view (the former split_view)
    for (auto word : str | std::views::lazy_split(' ')) {
        for (char ch : word)
            std::cout << ch;
        std::cout << '.';
    }
}
Output:
one.two.three..four.
one.two.three..four.
That was until I noticed a difference when binding each word to a std::span<const char> with both views.
With the first one, std::views::split:
for (std::span<const char> word : str | std::views::split(' '))
the compiler accepts my code.
With the second one, std::views::lazy_split:
for (std::span<const char> word : str | std::views::lazy_split(' '))
compilation fails.
I know there will be differences between these two, but I can't easily spot them. Is this a defect report in C++20 or a new feature in C++23 (with changes), or both?
The expression views::lazy_split(e, f) is expression-equivalent to lazy_split_view(e, f). The exposition-only concept /*tiny_range*/<Pattern> is satisfied if Pattern satisfies sized_range, Pattern::size() is a constant expression and suitable as a template non-type argument, and the value of Pattern::size() is less than or equal to 1.
Before P2210R2, split_view used a lazy mechanism for splitting, and thus could not keep the bidirectional, random access, or contiguous properties of the underlying view, or make the iterator type of the inner range the same as that of the underlying view. Consequently, it was redesigned by P2210R2, and the lazy mechanism was moved to lazy_split_view.
lazy_split_view models the concepts forward_range and input_range when the underlying view V models the respective concepts, and models common_range when V models both forward_range and common_range.
I've looked at the relevant paper (P2210R2 by Barry Revzin): split_view has been renamed to lazy_split_view. The new split_view is different in that its result type preserves the category of the source range.
For example, our string str is a contiguous range, so split will yield contiguous subranges. Previously it would only give you a forward range. That is a problem if you need bidirectional or random access, or the address of the underlying storage.
From the example of the paper:
std::string str = "1.2.3.4";
auto ints = str
    | std::views::split('.')
    | std::views::transform([](auto v) {
        int i = 0;
        std::from_chars(v.data(), v.data() + v.size(), i);
        return i;
    });
will work now, but
std::string str = "1.2.3.4";
auto ints = str
    | std::views::lazy_split('.')
    | std::views::transform([](auto v) {
        int i = 0;
        // v.data() doesn't exist
        std::from_chars(v.data(), v.data() + v.size(), i);
        return i;
    });
won't, because the range v is only a forward range, which doesn't provide a data() member.
I was under the impression that split must be lazy as well (laziness was one of the selling points of the ranges proposal, after all), so I made a little experiment:
#include <iostream>
#include <ranges>
#include <string>

// Counts how often the transform is invoked and reports it on destruction.
struct CallCount {
    int i = 0;
    auto operator()(auto c) {
        i++;
        return c;
    }
    ~CallCount() {
        if (i > 0) // there are a lot of copies made when the range is constructed
            std::cout << "number of calls: " << i << "\n";
    }
};

int main() {
    std::string str = "1 3 5 7 9 1";
    std::cout << "split_view:\n";
    for (auto word : str | std::views::transform(CallCount{}) | std::views::split(' ') | std::views::take(2)) {
    }
    std::cout << "lazy_split_view:\n";
    for (auto word : str | std::views::transform(CallCount{}) | std::views::lazy_split(' ') | std::views::take(2)) {
    }
}
This code prints (note that the transform operates on each char in the string):
split_view:
number of calls: 6
lazy_split_view:
number of calls: 4
So what happens?
Indeed, both views are lazy, but they differ in how lazy they are. The transform that I put in front of split just counts how many times it has been called. As it turns out, split computes the next item eagerly, while lazy_split stops as soon as it hits the whitespace after the current item.
You can see that the string str consists of digits that also mark their char index (starting at 1). The take(2) should stop the loop after we've seen '3' in str. And indeed, lazy_split stops at the whitespace after '3', but split stops at the whitespace after '5'.
This essentially means that split fetches its next item eagerly instead of lazily. This difference probably shouldn't matter most of the time, but it can affect performance-critical code.
I don't know whether that was the reason for this change (I haven't read the paper).