I currently have this kind of loop
while(1)
{
generate_string(&buffer);
for(int i = 0; i < filelines; i++)
{
if(strcmp(buffer,line[i]) == 0)
{
/* do something */
}
}
}
I have a file with a few million strings(which hopefully should be cut by half sometime soon), the number of all these strings is stored in filelines
line[i] is basically where the string itself is stored.
Currently, due to the comparison of these million strings, function generate_string(&buffer); is executed around 42 times per second. Is there a faster way to do string comparison in C?
If n is sufficiently large that strncmp will compare the whole strings (so that the behavior becomes effectively the same as strcmp ) then strncmp is likely to be moderately slower because it also has to keep track of a counter, but the difference might or might not be measurable or even present in a given ...
While comparing the two function, in a loop repeted 500'000'000 times, strcmp execute too much fast, about x750 times faster. The execution time of that function is 3.058s , while strcmp only 0.004s .
The strcmp() function, is used to compare the strings (str1,str2). The strings str1 and str2 will be compared using this function. If the function returns a value 0, it signifies that the strings are equal otherwise, strings are not equal.
You can't compare strings in C with ==, because the C compiler does not really have a clue about strings beyond a string-literal.
strcmp
is usually optimized by all vendors. However, if you're not satisfied with this you can try:
libc
used to have this optimization for small strings where they tested strings smaller than five bytes as integers. MS cl
also has some optimizations for small-strings (do look it up).But more importantly make sure strcmp
is your real bottleneck.
I benchmarked against using strcmp()
and a macro to compare byte by byte. The macro version is "very" much faster when compared to that of strcmp()
. Usually for string comparation, its much faster to use the byte comparator macro, rather than strcmp()
.
Ex:
#define str3_cmp(m, c0, c1, c2, c3) m[0] == c0 && m[1] == c1 && m[2] == c2 && m[3] == c3
This is "very" faster when compared to that of strcmp()
. But writing those down is a pain, and your required to split out the strings char by char, so I have written a handy PHP script to Generate that as a header file for you.
You can use this string compare in a hot loop where you know exactly the size of char*
you would be comparing.
#!/usr/bin/php
<?php
function generate_macro($num) : string {
$returner = "#define str".$num."cmp_macro(ptr, ";
for($x = 0; $x < $num; $x++){
$returner .= "c".$x;
if($x != $num-1){ $returner .= ", "; }
}
$returner .= ") ";
for($x = 0; $x < $num; $x++){
$returner .= "*(ptr+".$x.") == c".$x;
if($x != $num-1){ $returner .= " && "; }
}
return $returner;
}
function generate_static_inline_fn(&$generated_macro, $num) : string {
$generated_macro .= "static inline bool str".$num."cmp(const char* ptr, const char* cmp)".
"{\n\t\treturn str".$num."cmp_macro(ptr, ";
for($x = 0; $x < $num; $x++){
$generated_macro .= " *(cmp+".$x.")";
if($x != $num-1){ $generated_macro .= ", "; }
}
$generated_macro .= ");\n}\n";
return $generated_macro;
}
function handle_generation($argc, $argv) : void {
$out_filename = $argv[$argc-1];
$gen_macro = "";
for($x = 0; $x < $argc-2; $x++){
$macro = generate_macro($argv[$x+1])."\n";
$gen_macro .= generate_static_inline_fn($macro, $argv[$x+1]);
}
file_put_contents($out_filename, $gen_macro);
}
handle_generation($argc, $argv);
?>
This script takes Two arguments.
char*
you would be comparing.Example: $ ./gen_faststrcmp.php 3 5 fast_strcmp.h
And this will generate fast_strcmp.h
with contents
#define str3cmp_macro(ptr, c0, c1, c2) *(ptr+0) == c0 && *(ptr+1) == c1 && *(ptr+2) == c2
static inline bool str3cmp(const char* ptr, const char* cmp){
return str3cmp_macro(ptr, *(cmp+0), *(cmp+1), *(cmp+2));
}
#define str5cmp_macro(ptr, c0, c1, c2, c3, c4) *(ptr+0) == c0 && *(ptr+1) == c1 && *(ptr+2) == c2 && *(ptr+3) == c3 && *(ptr+4) == c4
static inline bool str5cmp(const char* ptr, const char* cmp){
return str5cmp_macro(ptr, *(cmp+0), *(cmp+1), *(cmp+2), *(cmp+3), *(cmp+4));
}
And in your code you can use the function like so,
const char* compare_me = "Hello";
if(str5cmp(compare_me, "Hello")) { /* code goes here */ }
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With