
Why does std::string::substr throw an exception instead of returning an empty string? [closed]

Tags:

c++

string

substr

I have been wondering about the rationale behind the design of std::string's substr(pos, len) method for a while now. It still does not make sense to me, so I decided to ask the experts. The function throws a std::out_of_range exception if the pos argument exceeds the string length, i.e. if pos > size(). This can be inconvenient (even annoying) at times, but my real concern is consistency and the principle of least surprise. It turns out that the "end" position pos+len of the substring is allowed to exceed the string length; the result is then simply cut off at the end of the string. Disallowing this for the beginning but not for the end feels inconsistent to me. Allowing it for the end hints, to me, at the interpretation

return all characters at positions pos <= i < pos+len

However, I would then expect the function to return an empty string for values of pos exceeding the string length, instead of throwing an exception. As a side note, with this interpretation it would even be sensible to allow negative values of pos (provided it had a signed type).
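To make the boundary cases concrete, here is a minimal sketch of the behavior described above for a 5-character string. The helper names substr_throws and check_substr_bounds are made up for this illustration; only substr itself is standard.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Illustrative helper: true if s.substr(pos, len) throws std::out_of_range.
bool substr_throws(const std::string& s, std::size_t pos, std::size_t len) {
    try {
        (void)s.substr(pos, len);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Walks through the boundary cases for a string with size() == 5.
void check_substr_bounds() {
    const std::string s = "hello";
    assert(s.substr(3, 100) == "lo");   // len is clamped to size() - pos
    assert(s.substr(5, 100).empty());   // pos == size() is valid: empty result
    assert(!substr_throws(s, 5, 0));    // still fine with len == 0
    assert(substr_throws(s, 6, 0));     // pos > size() throws, even with len == 0
}
```

Note that pos == size() is still accepted and yields an empty string; only positions strictly past the end throw.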

This leaves me with the following questions:

  • Does this design appear logical to you? Sensible? Do you have a satisfactory way to resolve the inconsistency? The only possible explanation I can come up with is compatibility with null-terminated strings. With null termination it does not matter if the specified length exceeds the end, while starting beyond the null character is a memory bug. However, std::string does not rely on null termination and instead keeps track of the length of the string. If that's the true reason, then personally I'd call it a very bad one.
  • Is there an advantage in terms of performance? I would actually be surprised.
  • Am I overlooking an advantage in terms of usability? Maybe a standard idiom or use case in conjunction with other functions, like find? Here, too, my impression is that returning an empty string would have the potential to simplify some code.
  • Is there any way to change the behavior of substr in the future? I guess not, since silently breaking existing code is much worse than living with this twist...?
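For reference, the behavior asked for in the question can be approximated today with a small wrapper. This is only a sketch: substr_or_empty and strip_header are hypothetical names, not part of the standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: behaves like s.substr(pos, len), but returns an
// empty string instead of throwing when pos > s.size().
std::string substr_or_empty(const std::string& s, std::size_t pos,
                            std::size_t len = std::string::npos) {
    if (pos > s.size()) {
        return std::string();
    }
    return s.substr(pos, len);
}

// Example use: drop a fixed-width header of `width` characters.
// Lines shorter than the header simply yield an empty payload.
std::string strip_header(const std::string& line, std::size_t width) {
    return substr_or_empty(line, width);
}
```

With such a wrapper, the caller of strip_header needs no separate length check, which is the kind of simplification the question has in mind.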
tglas asked Jul 13 '16 19:07



1 Answer

This question is really too opinion-based, but I will try to answer it point by point.

  • Does this design appear logical to you? Sensible? It seems logical to me. Maybe my opinion comes from strncmp-style functions, but with this design you can just pass your buffer length as the len parameter and it will work fine. If, however, you are trying to access a substring that starts outside the string's boundaries, then you probably missed some simple sanity checks. The internal implementation of std::string doesn't matter here.
  • Is there an advantage in terms of performance? I think that's not the reason.
  • Am I overlooking an advantage in terms of usability? Maybe; see point 1.
  • Is there any way to change the behavior of substr in the future? Throwing an exception when pos exceeds size() is specified by the standard, so most likely not.
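As an illustration of point 1, here is a sketch of a loop that relies on len being clamped: it always passes the full chunk width and lets substr shorten the last piece. The function name chunk is made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative helper: splits s into consecutive pieces of at most
// `width` characters each.
std::vector<std::string> chunk(const std::string& s, std::size_t width) {
    std::vector<std::string> pieces;
    if (width == 0) {
        return pieces;  // a zero width would never advance pos
    }
    for (std::size_t pos = 0; pos < s.size(); pos += width) {
        // pos < size() holds here, so substr cannot throw; the last
        // piece may be shorter because len is clamped to size() - pos.
        pieces.push_back(s.substr(pos, width));
    }
    return pieces;
}
```

The loop condition is the "sanity check" on pos; no extra check on the length is needed.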

My point is: this exception (though I prefer never to use exceptions) allows you to notice code that is missing some elementary sanity checks, for example code that accesses a buffer outside its boundaries. The same design is used in at()-like functions and many others.
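The parallel with at() can be sketched as follows. Both calls report a bad position with std::out_of_range, although the exact limits differ: at() rejects pos >= size(), while substr only rejects pos > size(). The wrapper names checked_char and checked_substr are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Illustrative wrapper: reads s.at(pos); returns false if it throws.
bool checked_char(const std::string& s, std::size_t pos, char& out) {
    try {
        out = s.at(pos);  // throws std::out_of_range when pos >= size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Illustrative wrapper: calls s.substr(pos, len); returns false if it throws.
bool checked_substr(const std::string& s, std::size_t pos, std::size_t len,
                    std::string& out) {
    try {
        out = s.substr(pos, len);  // throws std::out_of_range when pos > size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```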

Ternvein answered Oct 02 '22 01:10
