During a code review, i found source code like this:
void f_odd(std::string &className, std::string &testName)
{
if (className.find("::") != std::string::npos)
{
testName = className.substr(className.find("::") + 2);
(void)className.erase(className.find("::"), std::string::npos);
}
}
Within this function, std::string::find() is called three times with the same pattern( here "::").
This code can of course be refactored to
void f(std::string &className, std::string &testName)
{
const size_t endOfClassNamePos = className.find("::");
if (endOfClassNamePos != std::string::npos)
{
testName = className.substr(endOfClassNamePos + 2);
(void)className.erase(endOfClassNamePos, std::string::npos);
}
}
where find is called only once.
Question
Does anybody know a strategy to detecting such a pattern like this? I am having a huge code base, where i intend to spot this pattern. I plan to use a Windows or a Linux environment.
Potential Strategies
No Go's
Update 1
I decided to start with potential strategy 1). I plan to adapt cppcheck to catch this issue.
Cppcheck offers a possibility to write customized rules, based on PCRE regular expressions. For this, cppcheck has to be compiled with enabled PCRE support. Since the current test environment is Linux-based, the following commands can be used to download the latest version of cppcheck:
git clone https://github.com/danmar/cppcheck.git && cd cppcheck
After that, compile and install the tool as follows:
sudo make install HAVE_RULES=yes
Now the basic tool setup is done. In order to develop a cppcheck-rule, i prepared a simple test case (file: test.cpp), similar to the sample code in the first section of this article. This file contains three functions and the cppcheck-rule shall emit a warning on f_odd
and f_odd1
about consecutive identical std::string::find
calls.
test.cpp:
#include <string>
void f(std::string &className, std::string &testName)
{
const size_t endOfClassNamePos = className.find("::");
if (endOfClassNamePos != std::string::npos)
{
testName = className.substr(endOfClassNamePos + 2);
(void)className.erase(endOfClassNamePos, std::string::npos);
}
}
void f_odd(std::string &className, std::string &testName)
{
if (className.find("::") != std::string::npos)
{
testName = className.substr(className.find("::") + 2);
(void)className.erase(className.find("::"), std::string::npos);
}
}
#define A "::"
#define B "::"
#define C "::"
void f_odd1(std::string &className, std::string &testName)
{
if (className.find(A) != std::string::npos)
{
testName = className.substr(className.find(B) + 2);
(void)className.erase(className.find(C), std::string::npos);
}
}
So far so good. Now cppcheck has to be tweaked to catch consecutive identical std::string::find
calls. For this i have created a cppcheck_rule-file that contains a regular expression that matches consecutive identical std::string::find
calls:
<?xml version="1.0"?>
<rule>
<tokenlist>normal</tokenlist>
<pattern><![CDATA[([a-zA-Z][a-zA-Z0-9]*)(\s*\.\s*find)(\s*\(\s*\"[ -~]*\"\s*\))[ -\{\n]*(\1\2\3)+[ -z\n]]]></pattern>
<message>
<severity>style</severity>
<summary>Found identical consecutive std::string::find calls.</summary>
</message>
This file can be used to extend cppcheck about a new check. Lets try:
cppcheck --rule-file=rules/rule.xml test/test.cpp
and the output is
Checking test/test.cpp...
[test/test.cpp:14]: (style) Found identical consecutive std::string::find calls.
[test/test.cpp:26]: (style) Found identical consecutive std::string::find calls.
Now, identical consecutive std::string::find
calls can be detected in C/C++ codes. Does anybody know a better/more efficient or more clever solution?
References:
The main problem with such a tool is that a lexical analysis can only check if there is textual repetition. Taking your example, calling className.find("::")
twice is a potential issue if the variable refers to the same string twice. But let me add one small change to your code: className = className.substr(className.find("::") + 2);
. Suddenly the meaning of the next className.find
has changed dramatically.
Can you find such changes? You need a full-blown compiler for that, and even then you have to be pessimistic. Sticking to your example, could className
be changed via an iterator? It's not just direct manipulation you need to be aware of.
Is there no positive news? Well: existing compilers do have a similar mechanism. It's called Common Subexpression Elimination, and it works conceptually as you would want it to work in the example above. But that is also bad news in one way: If the situation is detectable, it's unimportant because it's already optimized out by the compiler!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With