I need to remove an entire sentence from a string if it contains a pattern. Here the pattern is "Link" or "link": if it is present in the string, I need to remove the whole sentence containing it.
std::string subject = "This is previous sentence. This can be any sentences. Link 2.1.19.3 [Example]. This is can be any other sentence. This is next sentence.";
std::string removeRedundantString(std::string subject)
{
    std::string removeSee = subject;
    std::smatch match;
    std::regex redundantSee("(Link.*$)");
    if (std::regex_search(subject, match, redundantSee))
    {
        removeSee = std::regex_replace(subject, redundantSee, "");
    }
    return removeSee; // the function is declared to return a std::string
}
Expected Output :
This is previous sentence. This can be any sentences.This is can be any other sentence. This is next sentence.
Actual Output :
This is previous sentence. This can be any sentences.
The actual output above is produced because the regex "(Link.*$)" removes everything from Link to the end of the string.
I am not able to figure out which regex would give the expected output.
Here are the different test cases I need to test :
Testcase 1:
std::string subject = "Note this is second pattern, Ops that next the scheduler; link the amount for the full list of docs. The number of value varies from 0 to 4.";
Output: Note this is second pattern, Ops that next the scheduler;The number of value varies from 0 to 4.
Testcase 2:
std::string subject = "This is another pattern. (Link Doc::78::hello::Core::mount). Since this patern includes non-numeric value.";
Output : This is another pattern.Since this patern includes non-numeric value.
Any help would be appreciated.
I'd recommend
std::regex redundantSee(R"(\W*\b[Ll]ink\b(?:\d+(?:\.\d+)*|[^.])*[.?!])");
See its online demo. Note the raw string literal syntax, R"(...)". The pattern can simply be put inside in place of the ... without any additional escaping.
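To illustrate the point about raw string literals, here is a minimal sketch that writes the dotted-number part of the pattern both ways. The names escapedPattern, rawPattern and isDottedNumber are made up for this example and are not part of the question's code.

```cpp
#include <regex>
#include <string>

// Hypothetical names, for illustration only.
// The same pattern written twice: with escaped backslashes, and as a raw string literal.
const std::string escapedPattern = "\\d+(?:\\.\\d+)*";
const std::string rawPattern = R"(\d+(?:\.\d+)*)";

// Returns true when `text` consists entirely of a dotted number such as "2.1.19.3".
bool isDottedNumber(const std::string& text)
{
    static const std::regex dotted(rawPattern);
    return std::regex_match(text, dotted);
}
```

Both strings hold exactly the same characters; the raw form just avoids doubling every backslash.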
Regex details:
\W* - zero or more non-word chars
\b - a word boundary
[Ll]ink - Link or link word
\b - a word boundary
(?:\d+(?:\.\d+)*|[^.])* - zero or more sequences of
    \d+(?:\.\d+)* - one or more digits followed with zero or more sequences of . and one or more digits
    | - or
    [^.] - any char other than a .
[.?!] - a ?, . or !
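Putting the pieces together, a minimal sketch of the function with this pattern could look as follows. It reuses the question's function name, but taking the argument by const reference and dropping the regex_search check are my own choices, not something the question requires.

```cpp
#include <regex>
#include <string>

// Removes every sentence that contains the word "Link" or "link".
// Sketch only: the signature is an assumption loosely based on the question's code.
std::string removeRedundantString(const std::string& subject)
{
    static const std::regex redundantSee(
        R"(\W*\b[Ll]ink\b(?:\d+(?:\.\d+)*|[^.])*[.?!])");
    // regex_replace returns the input unchanged when nothing matches,
    // so a separate regex_search check is not needed.
    return std::regex_replace(subject, redundantSee, "");
}
```

One caveat, as far as I can trace the pattern: the leading \W* also consumes the punctuation that closes the preceding sentence (the . or ; right before Link), and the whitespace after the removed sentence is kept. So the first subject comes out as "... This can be any sentences This is can be any other sentence. ..." rather than exactly matching the expected output in the question. If the preceding punctuation must survive, the \W* part would need to be narrowed.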