I read "The most elegant way to iterate the words of a string" and enjoyed the succinctness of the answer. Now I want to do the same for string_view. The problem is that stringstream can't take a string_view:
#include <iostream>
#include <string>
#include <sstream>
#include <algorithm>
#include <iterator>
#include <string_view>
int main() {
using namespace std;
string_view sentence = "And I feel fine...";
istringstream iss(sentence); // <== error
copy(istream_iterator<string_view>(iss),
istream_iterator<string_view>(),
ostream_iterator<string_view>(cout, "\n"));
}
So is there a way to do this? If not, why would such a thing not be idiomatic?
Split by a delimiter and return a vector<string_view>. Designed for rapid splitting of lines in a .csv file.
Tested under MSVC 2017 v15.9.6 and Intel Compiler v19.0, compiled with C++17 (which is required for string_view).
#include <string_view>
#include <vector>
std::vector<std::string_view> Split(const std::string_view str, const char delim = ',')
{
std::vector<std::string_view> result;
int indexCommaToLeftOfColumn = 0;
int indexCommaToRightOfColumn = -1;
for (int i=0;i<static_cast<int>(str.size());i++)
{
if (str[i] == delim)
{
indexCommaToLeftOfColumn = indexCommaToRightOfColumn;
indexCommaToRightOfColumn = i;
int index = indexCommaToLeftOfColumn + 1;
int length = indexCommaToRightOfColumn - index;
// Bounds checking can be omitted: logically, the commented-out code below can never be reached.
// Try it: uncomment it, put a breakpoint inside, and run the unit tests.
/*if (index + length >= static_cast<int>(str.size()))
{
length--;
}
if (length < 0)
{
length = 0;
}*/
std::string_view column(str.data() + index, length);
result.push_back(column);
}
}
const std::string_view finalColumn(str.data() + indexCommaToRightOfColumn + 1, str.size() - indexCommaToRightOfColumn - 1);
result.push_back(finalColumn);
return result;
}
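For the whitespace-separated words in the original question, the same idea can be sketched with find_first_not_of and find_first_of instead of a single delimiter. This is only a minimal sketch: SplitWords and its default whitespace set are hypothetical names chosen for illustration, not part of the code above, and empty tokens are skipped in the way operator>> would skip them.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper (an assumption, not part of the Split function above):
// splits on runs of whitespace and skips empty tokens.
std::vector<std::string_view> SplitWords(std::string_view str,
                                         std::string_view whitespace = " \t\r\n")
{
    std::vector<std::string_view> words;
    // Start of the first word, or npos if the input is empty or all whitespace.
    std::size_t begin = str.find_first_not_of(whitespace);
    while (begin != std::string_view::npos)
    {
        // End of the current word: the next whitespace character, or the end of the input.
        std::size_t end = str.find_first_of(whitespace, begin);
        if (end == std::string_view::npos)
        {
            end = str.size();
        }
        words.push_back(str.substr(begin, end - begin));
        // Skip the whitespace run to find the start of the next word.
        begin = str.find_first_not_of(whitespace, end);
    }
    return words;
}
```

As with Split, every returned view points into the caller's buffer, so nothing is copied.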
Be careful of lifetimes: a string_view should never outlive the parent string that it is a window into. If the parent string goes out of scope, then what the string_view points to is invalid. In this particular case, the API design makes it difficult to go wrong, as the input and output are all string_view objects, which are all windows into the parent string. This ends up being rather efficient in terms of memory copying and CPU usage.
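If some tokens do need to outlive the parent string, one option is to copy them into owning std::string objects before the parent is destroyed. The sketch below assumes that is acceptable for your use case; ToOwned and MakeTokens are hypothetical names used only for illustration.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: copies each view into an owning std::string so the
// result stays valid after the parent string has been destroyed.
std::vector<std::string> ToOwned(const std::vector<std::string_view>& views)
{
    std::vector<std::string> owned;
    owned.reserve(views.size());
    for (const std::string_view view : views)
    {
        owned.emplace_back(view); // uses the explicit std::string(string_view) constructor
    }
    return owned;
}

// Hypothetical example of the hazard: `line` is local, so views into it
// must not escape this function.
std::vector<std::string> MakeTokens()
{
    const std::string line = "A,B,C"; // parent string, destroyed at the closing brace
    const std::string_view whole = line;
    const std::vector<std::string_view> views = {
        whole.substr(0, 1), whole.substr(2, 1), whole.substr(4, 1)};
    // Returning `views` itself would leave every element dangling.
    return ToOwned(views);
}
```

The copy costs an allocation per token (unless the small-string optimization applies), so it is best reserved for the tokens that really have to be kept.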
Note that the only downside of using string_view is losing implicit null termination. So use functions that support string_view, e.g. the lexical_cast functions in Boost for converting strings to numbers.
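As a standard-library alternative to the Boost functions mentioned above, std::from_chars from <charconv> (C++17) takes a pointer range and so does not need null termination. The sketch below is an assumption about how one might wrap it; ParseInt is a hypothetical name, and it only handles base-10 int values with no leading whitespace or '+' sign.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses the whole token as a base-10 int.
// Returns std::nullopt if the token is empty, malformed, out of range,
// or has trailing characters.
std::optional<int> ParseInt(std::string_view token)
{
    int value = 0;
    const char* const first = token.data();
    const char* const last = token.data() + token.size();
    const std::from_chars_result result = std::from_chars(first, last, value);
    if (result.ec != std::errc() || result.ptr != last)
    {
        return std::nullopt;
    }
    return value;
}
```

Because the function reads only the range [first, last), it can be applied directly to a column returned by Split without copying it into a null-terminated buffer.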
I used this to rapidly parse a .csv file. To get each new line in the .csv file, I used istringstream and getline(), which is blazingly fast (~2GB/second or 1,200,000 lines per second on a single core).
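The line-reading loop can be sketched roughly as follows. This is not the original parsing code: CountColumnsPerLine is a hypothetical helper that only counts fields per line, standing in for whatever per-line work (such as calling Split) you actually need. The point it illustrates is that a view of the line buffer is valid only until the next getline call overwrites it.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: reads the stream line by line and counts the
// delimiter-separated fields on each line.
std::vector<std::size_t> CountColumnsPerLine(std::istream& input, char delim = ',')
{
    std::vector<std::size_t> counts;
    std::string line; // reused on every iteration, so its buffer is not reallocated each time
    while (std::getline(input, line))
    {
        // This view is valid only until the next std::getline call modifies `line`.
        const std::string_view view = line;
        std::size_t columns = 1; // a line with no delimiter still has one field
        for (const char c : view)
        {
            if (c == delim)
            {
                ++columns;
            }
        }
        counts.push_back(columns);
    }
    return counts;
}
```

For example, given a std::istringstream over "A,B,C\n1,2\n", the function returns the counts 3 and 2.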
Unit tests follow. They use Google Test (I installed it using vcpkg).
// Google Test integrates into VS2017 if ReSharper is installed.
#include "gtest/gtest.h" // Can install using vcpkg
// In main(), call:
// ::testing::InitGoogleTest(&argc, argv);return RUN_ALL_TESTS();
TEST(Strings, Split)
{
{
const std::string str = "A,B,C";
auto tokens = Split(str, ',');
EXPECT_TRUE(tokens.size() == 3);
EXPECT_TRUE(tokens[0] == "A");
EXPECT_TRUE(tokens[1] == "B");
EXPECT_TRUE(tokens[2] == "C");
}
{
const std::string str = ",B,C";
auto tokens = Split(str, ',');
EXPECT_TRUE(tokens.size() == 3);
EXPECT_TRUE(tokens[0] == "");
EXPECT_TRUE(tokens[1] == "B");
EXPECT_TRUE(tokens[2] == "C");
}
{
const std::string str = "A,B,";
auto tokens = Split(str, ',');
EXPECT_TRUE(tokens.size() == 3);
EXPECT_TRUE(tokens[0] == "A");
EXPECT_TRUE(tokens[1] == "B");
EXPECT_TRUE(tokens[2] == "");
}
{
const std::string str = "";
auto tokens = Split(str, ',');
EXPECT_TRUE(tokens.size() == 1);
EXPECT_TRUE(tokens[0] == "");
}
{
const std::string str = "A";
auto tokens = Split(str, ',');
EXPECT_TRUE(tokens.size() == 1);
EXPECT_TRUE(tokens[0] == "A");
}
{
const std::string str = ",";
auto tokens = Split(str, ',');
EXPECT_TRUE(tokens.size() == 2);
EXPECT_TRUE(tokens[0] == "");
EXPECT_TRUE(tokens[1] == "");
}
{
const std::string str = ",,";
auto tokens = Split(str, ',');
EXPECT_TRUE(tokens.size() == 3);
EXPECT_TRUE(tokens[0] == "");
EXPECT_TRUE(tokens[1] == "");
EXPECT_TRUE(tokens[2] == "");
}
{
const std::string str = "A,";
auto tokens = Split(str, ',');
EXPECT_TRUE(tokens.size() == 2);
EXPECT_TRUE(tokens[0] == "A");
EXPECT_TRUE(tokens[1] == "");
}
{
const std::string str = ",B";
auto tokens = Split(str, ',');
EXPECT_TRUE(tokens.size() == 2);
EXPECT_TRUE(tokens[0] == "");
EXPECT_TRUE(tokens[1] == "B");
}
}