Perhaps I'm overlooking something obvious but I was wondering what the fastest way to implement whole-word string replacement in C++ might be. At first I considered simply concatenating spaces to the search word, but this does not consider the string boundaries or punctuation.
This is my current abstraction for (non-whole-word) replacement:
void Replace(wstring& input, wstring find, wstring replace_with) {
if (find.empty() || find == replace_with || input.length() < find.length()) {
return;
}
for (size_t pos = input.find(find);
pos != wstring::npos;
pos = input.find(find, pos)) {
input.replace(pos, find.length(), replace_with);
pos += replace_with.length();
}
}
If I only consider spaces as a word boundary, I could probably implement this by comparing the beginning and end of the search string against the find string to cover the string boundaries, and then following with a Replace(L' ' + find + L' ').... but I was wondering if there was a more elegant solution that would include punctuation efficiently.
Let's consider a word to be any collection of characters that is separated by either whitespace or punctuation (to keep it simple let's say !"#$%&'()*+,-./ at minimum -- which happen to correspond to (c > 31 && c < 48)
).
In my application I have to call this function over a rather large array of short strings, which may include various Unicode which I don't want to split new words. I would also like to avoid including any external libraries, but STL is fine.
The purpose of not using regular expressions is the promise of less overhead and the goal of a fast function suited to this particular task over a large dataset.
I think you can do this, both doing whole-word matching and doing it efficiently. The key is to:
Here's my take on your function:
#include <cctype> // isalpha
#include <ciso646> // or, not
#include <string> // wstring
using std::size_t;
using std::wstring;
/// @brief Do a "find and replace" on a string.
/// @note This function does "whole-word" matching.
/// @param[in,out] input_string The string to operate on.
/// @param[in] find_string The string to find in the input.
/// @param[in] replace_string The string to replace 'find_string'
/// with in the input.
void find_and_replace( wstring& input_string,
const wstring& find_string,
const wstring& replace_string )
{
if( find_string.empty()
or find_string == replace_string
or input_string.length() < find_string.length() )
{
return;
}
wstring output_string;
output_string.reserve( input_string.length() );
size_t last_pos = 0u;
for( size_t new_pos = input_string.find( find_string );
new_pos != wstring::npos;
new_pos = input_string.find( find_string, new_pos ) )
{
bool did_replace = false;
if( ( new_pos == 0u
or not std::isalpha( input_string.at( new_pos - 1u ) ) )
and ( new_pos + find_string.length() == input_string.length()
or not std::isalpha( input_string.at( new_pos + find_string.length() ) ) ) )
{
output_string.append( input_string, last_pos, new_pos - last_pos );
output_string.append( replace_string );
did_replace = true;
}
new_pos += find_string.length();
if( did_replace )
{
last_pos = new_pos;
}
}
output_string.append( input_string, last_pos,
input_string.length() - last_pos );
input_string.swap( output_string );
}
P.S. I was unsure what 'replace_all' was trying to accomplish in your initial example, so I removed it from my solution for clarity.
P.P.S. This code would be much cleaner with Regex-es. Can you rely on C++ TR1 or C++ 2011 functionality? They provide a standard 'regex' library.
This is my quick response, but don't know about how quick the solution is...
There are few solutions to this problem:
1. By using iterators, compare each word (delimited by space), recreating string for each occurrence:
string& remove_all_occurences(string& s, const string& str_to_remove, const string& str_to_put){
typedef string::size_type string_size;
string_size i = 0;
string cur_string;
cur_string.reserve(s.size());
// invariant: we have processed characters [original value of i, i)
while (i != s.size()) {
// ignore leading blanks
// invariant: characters in range [original i, current i) are all spaces
while (i != s.size() && isspace(s[i]))
++i;
// find end of next word
string_size j = i;
// invariant: none of the characters in range [original j, current j)is a space
while (j != s.size() && !isspace(s[j]))
j++;
// if we found some nonwhitespace characters
if (i != j) {
// copy from s starting at the beginning to i, placing str to replace, and finishing with j to the end of s
cur_string = s.substr(i,j-i);
if(cur_string == str_to_remove){
s = s.substr(0,i) + str_to_put + s.substr(j,s.size() - j);
}
i = j;
}
}
return s;
}
Testing the program:
void call_remove_all_occurences(){
string my_str = "The quick brown fox jumps over sleepy dog fox fox fox";
cout << remove_all_occurences(my_str,"fox","godzilla") << endl;
}
Output:
The quick brown godzilla jumps over sleepy dog godzilla godzilla godzilla
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With