In my C++ application I need to remove all dots, commas, and exclamation marks from a string and convert it to lower case.
So far I have figured out that I can do it with std::string::erase and std::remove like this:
string content = "Some, NiceEeeE text ! right HeRe .";
content.erase(std::remove(content.begin(), content.end(), ','), content.end());
content.erase(std::remove(content.begin(), content.end(), '.'), content.end());
content.erase(std::remove(content.begin(), content.end(), '!'), content.end());
std::transform(content.begin(), content.end(), content.begin(), ::tolower);
So my question is: can I do this without iterating through the string 4 times? Are there better ways to do this with simple C++?
Ignoring the iterations performed inside std::remove and erase (which you already do), you can use std::remove_if and provide your own custom predicate:
#include <algorithm>
// remove_if moves the kept characters to the front; erase then trims the leftover tail
content.erase(std::remove_if(content.begin(),
                             content.end(),
                             [](char c)
                             { return c==','||c=='.'|| c=='!'; }),
              content.end());
Then you can use std::transform to convert the remaining string to lower case:
#include <cctype>
#include <algorithm>
// the unsigned char parameter avoids undefined behaviour in std::tolower for negative char values
std::transform(content.begin(),
               content.end(),
               content.begin(),
               [] (unsigned char c) { return std::tolower(c); });
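Putting the two steps together, a minimal sketch might look like the following. The function name strip_and_lower is only an illustration and is not part of the answer above; it takes the string by value so the caller's copy is left alone.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (name chosen for illustration): removes ',', '.' and '!'
// and lower-cases what is left, using two standard-library passes.
std::string strip_and_lower(std::string content)
{
    // Pass 1: shift the kept characters forward, then erase the leftover tail.
    content.erase(std::remove_if(content.begin(), content.end(),
                                 [](char c) { return c == ',' || c == '.' || c == '!'; }),
                  content.end());

    // Pass 2: lower-case in place. Taking unsigned char avoids undefined
    // behaviour in std::tolower for negative char values.
    std::transform(content.begin(), content.end(), content.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return content;
}
```

This is still two passes over the string rather than four, which may already be good enough for many uses.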
Try this
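string result;
for (string::size_type loop = 0; loop < content.length(); ++loop) {
    switch (content[loop]) {
        case ',':
        case '!':
        case '.':
            break;  // skip the unwanted punctuation
        default:
            // cast to unsigned char before calling tolower, then back to char
            result += static_cast<char>(tolower(static_cast<unsigned char>(content[loop])));
    }
}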
This sounds like a conditional std::transform, so you could do:
template <typename InIt, typename OutIt, typename UnOp, typename Pred>
OutIt transform_if(InIt first, InIt last, OutIt dest, UnOp op, Pred pr)
{
    while (first != last) {
        if (pr(*first)) {
            *dest = op(*first);
            ++dest;
        }
        ++first;
    }
    return dest;
}
Usage in this case would be:
// requires <locale> for the two-argument std::tolower
content.erase(transform_if(
    content.begin(), content.end(),
    content.begin(),
    [](char c){ return std::tolower(c, std::locale()); },
    [](char c){ return !(c == ',' || c == '.' || c == '!'); }
), content.end());
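The destination does not have to be the source string. As a rough sketch, the same transform_if can write into a fresh string through std::back_inserter, leaving the input untouched. The wrapper name cleaned_copy is an assumption made for this example, and the template is repeated only so the snippet compiles on its own.

```cpp
#include <cctype>
#include <iterator>
#include <string>

// Same algorithm as above, repeated so this sketch is self-contained.
template <typename InIt, typename OutIt, typename UnOp, typename Pred>
OutIt transform_if(InIt first, InIt last, OutIt dest, UnOp op, Pred pr)
{
    for (; first != last; ++first) {
        if (pr(*first)) {
            *dest = op(*first);
            ++dest;
        }
    }
    return dest;
}

// Hypothetical wrapper: returns a cleaned copy instead of editing in place.
std::string cleaned_copy(const std::string& content)
{
    std::string result;
    result.reserve(content.size());  // the result can never be longer than the input
    transform_if(content.begin(), content.end(), std::back_inserter(result),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); },
                 [](char c) { return c != ',' && c != '.' && c != '!'; });
    return result;
}
```

The in-place form saves an allocation; the copying form is handy when the original text is still needed.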
If you want to do this in a single pass, it's pretty easy to do with a standard for loop. Using standard library routines might be preferred in general, but if you want it done in a single pass and there's not a good fit in the library, then I see no harm in just using a loop.
#include <cctype>
#include <iostream>
#include <ostream>
#include <string>
using namespace std;
int main()
{
    string exclude_chars(",.!");
    string content = "Some, NiceEeeE text ! right HeRe .";
    // write_iter trails read_iter, so the string can be rewritten in place
    auto write_iter = content.begin();
    for (auto read_iter = content.begin(); read_iter != content.end(); ++read_iter) {
        auto c = *read_iter;
        if (exclude_chars.find(c) != string::npos) continue;
        *write_iter = tolower( (unsigned char) c);
        ++write_iter;
    }
    // drop the leftover characters past the last one written
    content.erase(write_iter, content.end());
    cout << content << endl;
}
If you need this functionality in more than one place and/or need the exclusion characters or transformation to be parameterized, then it's also pretty easy to turn that snippet of code into a function that takes those things as arguments.
For example, here's a template function that does the filter and transform in one pass:
#include <ctype.h>
#include <iostream>
#include <ostream>
#include <string>
// Writes op(x) to result for every x in [first, last) that satisfies pred.
template <class InputIter, class OutputIter, class UnaryOp, class UnaryPred>
OutputIter filter_and_transform(
    InputIter first,
    InputIter last,
    OutputIter result,
    UnaryPred pred,
    UnaryOp op)
{
    while (first != last) {
        if (pred(*first)) {
            *result = op(*first);
            ++result;
        }
        ++first;
    }
    return result;
}
int main()
{
    std::string exclude_chars(",.!");
    std::string content = "Some, NiceEeeE text ! right HeRe .";
    content.erase(
        filter_and_transform(begin(content), end(content),
            begin(content),
            [&exclude_chars](char c) {
                // keep only characters that are not in the exclusion list
                return exclude_chars.find(c) == std::string::npos;
            },
            [](char c) -> char {
                return tolower((unsigned char) c);
            }),
        end(content)
    );
    std::cout << content << std::endl;
}
It's more generic, but I'm not convinced it's more readable.
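For comparison, here is a less generic sketch that parameterizes only the exclusion characters and keeps the plain loop. The name remove_and_lower is made up for this example; the lower-casing step is hard-coded rather than passed in.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: drops every character found in `exclude` and
// lower-cases the rest, in place, in a single pass.
void remove_and_lower(std::string& content, const std::string& exclude)
{
    auto write_iter = content.begin();
    for (char c : content) {
        if (exclude.find(c) != std::string::npos) continue;  // skip excluded characters
        // The write position never gets ahead of the read position,
        // so overwriting in place is safe.
        *write_iter++ = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    // Trim whatever is left past the last character written.
    content.erase(write_iter, content.end());
}
```

A call such as remove_and_lower(content, ",.!") then replaces the body of main() in the first example.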
Update (29 Apr 2014)
I decided to play around with the idea of having a custom filter_iterator<> perform the filtering, and when I got frustrated over the amount of boilerplate code I had to get working, I figured I'd look into whether Boost had anything similar. Sure enough, Boost has exactly that data type and a transform_iterator that can be composed together to get the following alternate single-pass filter-and-transform operation:
// boost::transform_iterator<> might need the following define
// in order to work with lambdas (see http://stackoverflow.com/questions/12672372)
#define BOOST_RESULT_OF_USE_DECLTYPE
#include <algorithm>
#include <ctype.h>
#include <iostream>
#include <ostream>
#include <string>
#include "boost/iterator/filter_iterator.hpp"
#include "boost/iterator/transform_iterator.hpp"
/*
relaxed_copy<>() works like std::copy<>() but is safe to use in
situations where result happens to be equivalent to first.
std::copy<> requires that result not be in the range [first,last) - it's
understandable that result cannot be in the range [first,last) in general,
but it should be safe for the specific situation where result == first.
However, the standard doesn't allow for this particular exception, so
relaxed_copy<>() exists to be able to safely handle that scenario.
*/
template <class InputIter, class OutputIter>
OutputIter relaxed_copy(
    InputIter first,
    InputIter last,
    OutputIter result)
{
    while (first != last) {
        *result = *first;
        ++first;
        ++result;
    }
    return result;
}
int main()
{
    std::string exclude_chars(",.!");
    std::string content = "Some, NiceEeeE text ! right HeRe .";
    // set up filter_iterators over the string to filter out ",.!" characters
    auto filtered_first =
        boost::make_filter_iterator(
            [&exclude_chars](char c) {
                return exclude_chars.find(c) == std::string::npos;
            },
            begin(content),
            end(content)
        );
    auto filtered_last =
        boost::make_filter_iterator(
            filtered_first.predicate(),
            end(content)
        );
    // set up transform_iterators 'on top of' the filter_iterators
    // to transform the filtered characters to lower case
    auto trans_first =
        boost::make_transform_iterator(
            filtered_first,
            [](char c) -> char {
                return tolower((unsigned char) c);
            }
        );
    auto trans_last =
        boost::make_transform_iterator(
            filtered_last,
            trans_first.functor()
        );
    // now copy using the composed iterators and erase any leftovers
    content.erase(
        relaxed_copy(trans_first, trans_last, begin(content)),
        end(content)
    );
    std::cout << content << std::endl;
}
I think this is a pretty nifty technique, but I still think it might be hard to argue that it's understandable at a glance what's going on.