
Boost find_first how does it work? / Define a range

I have a buffer (e.g. char buffer[1024]) which gets filled with some data. Now I want to search for a substring in this buffer. Since it should be a case-insensitive search, I am using boost::algorithm::ifind_first.

So I call the function like this:

boost::iterator_range<char*> buf_iterator;
buf_iterator = boost::algorithm::ifind_first(buffer ,"substring");

This actually works fine. But my concern is the following:

I pass the function just a char pointer, so ifind_first should have no idea where my buffer ends, yet it still works.

Now my first idea was that the function searches until a string-termination character. But in the Boost Documentation the function is defined like this:

template<typename Range1T, typename Range2T> 
  iterator_range< typename range_iterator< Range1T >::type > 
  find_first(Range1T & Input, const Range2T & Search);

Since it works with template parameters, I doubt that it relies on null termination.

So my question is how does ifind_first know where to stop? Or to be more precise, how can I give it a range? As already mentioned it works just fine with a char* but I'm not quite sure if I wasn't just lucky - I mean in the worst case the function is called and doesn't know where to stop and goes into undefined memory...

Edit:

An answer mentioned that it depends on the type I pass to the function. Would this mean that if I work with a char buffer I always have to make sure it's 0-terminated?

Toby asked Feb 20 '12 13:02


1 Answer

It uses a technique where the length of an array is a template argument, i.e.:

#include <cstddef>  // std::size_t

// L is deduced from the array's extent at compile time.
template< typename T, std::size_t L >
void foo( T (&arr)[L] )
{
}

Since a string literal has a known length, L can be deduced: foo( "test" ) instantiates foo< const char, 5 >() (four characters plus the terminator). I bet there's also an overload for const char* where the argument is assumed to be a C string, so strlen() can be used to determine the length.
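To make the difference concrete, here is a minimal sketch in plain standard C++ (no Boost). The names extent_of, length_to_terminator and demo are made up for illustration and are not part of any library. The two functions have different names on purpose: if they were overloads of one name, the non-template const char* version would win overload resolution for array arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: takes an array by reference, so N is deduced
// from the array's extent at compile time.
template <typename T, std::size_t N>
std::size_t extent_of(T (&)[N]) {
    return N;
}

// Hypothetical helper: takes a pointer. The extent is no longer part of
// the type, so the only way to find the end is to scan for the terminator.
std::size_t length_to_terminator(const char* s) {
    return std::strlen(s);
}

void demo() {
    char buffer[1024] = "hello";  // remaining elements are zero-initialized

    assert(extent_of("test") == 5);             // 4 characters + terminator
    assert(extent_of(buffer) == 1024);          // full array, not the text length
    assert(length_to_terminator(buffer) == 5);  // stops at the first '\0'
}
```

The same buffer yields 1024 or 5 depending only on whether the callee sees an array or a pointer, which is the distinction the answer relies on.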

EDIT: A better explanation, demonstrating how ifind_first can fail and why it won't if you're careful

What decides whether ifind_first will fail in this case is whether either the subject or the search decays into a char*. Here you've passed a string literal directly as the search, so ifind_first deduces its type as const char[10] (the length of "substring" plus 1 for the null terminator). For the search it does not matter, though: even if it decays to const char*, ifind_first will assume that it's a null-terminated C string, and a string literal is a null-terminated C string, so it works fine.

What you're really asking about is char buffer[1024], which in your case does not decay to char*. But if you had instead written, say, char* buffer = new char[1024]; then the type of buffer would be char*, and it is not guaranteed to be null-terminated. In that case ifind_first will fail in mysterious ways depending on what comes after the area you've filled.

So, to conclude: as the type of buffer is char[1024] in your case, it will not touch memory past the end of buffer. BUT it will also not care whether there's a null terminator in there (it doesn't look for one; since you've passed it a char[1024], it knows the length at compile time). So if, say, you fill buffer with 12 characters followed by a null, it will still search the whole buffer.
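On the "how can I give it a range?" part: one way to sidestep the question of how the argument type is interpreted is to pass the two ends explicitly. The sketch below does this with the standard library's std::search as a stand-in; it is not Boost's implementation, and ifind_in_range and demo_offset are hypothetical names. With Boost, the analogous approach would presumably be to wrap the two pointers in a boost::iterator_range (for example via boost::make_iterator_range) and pass that as the input.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical helper: case-insensitive search limited to [first, last).
// Returns a pointer to the first match, or last if there is none.
const char* ifind_in_range(const char* first, const char* last,
                           const char* needle) {
    const char* needle_end = needle + std::strlen(needle);
    return std::search(first, last, needle, needle_end,
        [](char a, char b) {
            // Cast to unsigned char before calling std::tolower to
            // avoid undefined behavior for negative char values.
            return std::tolower(static_cast<unsigned char>(a)) ==
                   std::tolower(static_cast<unsigned char>(b));
        });
}

// Searches only the part of the buffer that was actually filled.
std::size_t demo_offset() {
    char buffer[1024] = {};
    const char data[] = "xxSubStringyy";
    const std::size_t filled = sizeof(data) - 1;  // exclude the terminator
    std::memcpy(buffer, data, filled);

    const char* hit = ifind_in_range(buffer, buffer + filled, "substring");
    return static_cast<std::size_t>(hit - buffer);
}
```

Because the end of the range is stated by the caller, the search neither depends on a terminator being present nor scans the unused tail of the buffer.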

Ylisar answered Sep 29 '22 07:09