Is std::stoi actually safe to use?

I had a lovely conversation with someone about the pitfalls of std::stoi. To put it bluntly, it uses std::strtol internally and throws if that reports an error. According to them, though, std::strtol shouldn't report an error for an input of "abcxyz", so stoi would not throw std::invalid_argument.

First of all, here are two programs, tested on GCC, that show the behaviour in these cases: one for strtol and one for stoi.

Both of them show success on "123" and failure on "abc".
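The linked programs are not reproduced here, so the following is only a minimal sketch of what the stoi test might look like. The helper name describe_stoi is hypothetical and is not taken from the original programs.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: reports how std::stoi reacts to a single input.
std::string describe_stoi(const std::string& input) {
    try {
        const int value = std::stoi(input);  // base 10 by default
        return "ok: " + std::to_string(value);
    } catch (const std::invalid_argument&) {
        // Thrown when no conversion could be performed.
        return "invalid_argument";
    } catch (const std::out_of_range&) {
        // Thrown when the value does not fit in an int.
        return "out_of_range";
    }
}
```

With a conforming library, describe_stoi("123") reports success and describe_stoi("abc") reports invalid_argument, which matches the behaviour observed above.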


I looked in the standard to pull more info:

§ 21.5

Throws: invalid_argument if strtol, strtoul, strtoll, or strtoull reports that no conversion could be performed. Throws out_of_range if the converted value is outside the range of representable values for the return type.

That sums up the behaviour of relying on strtol. Now what about strtol? I found this in the C11 draft:

§7.22.1.4

If the subject sequence is empty or does not have the expected form, no conversion is performed; the value of nptr is stored in the object pointed to by endptr, provided that endptr is not a null pointer.

When "abc" is passed in, the C standard dictates that nptr, which points to the beginning of the string, is stored in the object that endptr points to. This is consistent with the test. Also, 0 should be returned, as stated by this:

§7.22.1.4

If no conversion could be performed, zero is returned. 

The previous reference said that no conversion would be performed, so strtol must return 0. These conditions satisfy the C++11 requirement for stoi to throw std::invalid_argument.
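To make the chain of reasoning concrete, here is a simplified sketch of how a stoi-like function could be layered on std::strtol. It is not the code of any real standard library: stoi_sketch is a hypothetical name, the base is fixed at 10, and the idx parameter is left out.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical, simplified stand-in for std::stoi (base 10 only, no idx).
int stoi_sketch(const std::string& str) {
    const char* nptr = str.c_str();
    char* end = nullptr;
    errno = 0;
    const long value = std::strtol(nptr, &end, 10);

    // strtol stored nptr in end: no conversion could be performed.
    if (end == nptr)
        throw std::invalid_argument("stoi_sketch: no conversion");

    // Either strtol overflowed long, or the result does not fit in an int.
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        throw std::out_of_range("stoi_sketch: out of range");

    return static_cast<int>(value);
}
```

Under these assumptions, "abc" leaves end equal to nptr, so the invalid_argument branch is taken, just as the two standard quotes imply.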


The result of this matters to me because I don't want to go around recommending stoi as a better alternative to other methods of string to int conversion, or using it myself as if it worked the way you'd expect, if it doesn't catch text as an invalid conversion.

So after all of this, did I go wrong somewhere? It seems to me that I have good proof of this exception being thrown. Is my proof valid, or is std::stoi not guaranteed to throw that exception when given "abc"?

chris asked Jul 22 '12 09:07



1 Answer

Does std::stoi throw an error on the input "abcxyz"?

Yes.

I think your confusion may come from the fact that strtol never reports an error except on overflow. It can report that no conversion was performed, but this is never referred to as an error condition in the C standard.

strtol is defined similarly by all three C standards, and I will spare you the boring details, but it basically defines a "subject sequence" that is a substring of the input string corresponding to the actual number. The following four conditions are equivalent:

  • the subject sequence has the expected form (in plain English: it is a number)
  • the subject sequence is non-empty
  • a conversion has occurred
  • *endptr != nptr (this only makes sense when endptr is non-null)

When there is an overflow, the conversion is still said to have occurred.

Now, it is quite clear that because "abcxyz" does not contain a number, the subject sequence of the string "abcxyz" must be empty, so that no conversion can be performed. The following C90/C99/C11 program will confirm it experimentally:

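    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* endptr is a one-element array, so it decays to the char ** that strtol expects */
        const char *nptr = "abcxyz";
        char *endptr[1];
        strtol(nptr, endptr, 0);
        /* *endptr == nptr means the subject sequence was empty */
        if (*endptr == nptr)
            printf("No conversion could be performed.\n");
        return 0;
    }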

This implies that any conformant implementation of std::stoi must throw invalid_argument when given the input "abcxyz" without an optional base argument.


Does this mean that std::stoi has satisfactory error checking?

No. The person you were talking to is correct when she says that std::stoi is more lenient than performing the full check errno == 0 && end != start && *end == '\0' after std::strtol, because std::stoi silently ignores everything from the first non-numeric character onward.
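The full check mentioned above can be sketched as a small function. The name parse_long_strict and the bool-plus-out-parameter interface are assumptions made for illustration, and the base is fixed at 10.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns true and sets out only if the whole
// string is a number that fits in a long.
bool parse_long_strict(const std::string& text, long& out) {
    const char* start = text.c_str();
    char* end = nullptr;
    errno = 0;  // must be cleared first; strtol only ever sets it
    const long value = std::strtol(start, &end, 10);

    // errno == 0   : no overflow was reported
    // end != start : a conversion actually occurred
    // *end == '\0' : nothing is left over after the number
    if (errno == 0 && end != start && *end == '\0') {
        out = value;
        return true;
    }
    return false;
}
```

Note that strtol still skips leading whitespace and accepts a leading sign, so this check rejects "42x0" and "hello" but accepts " 42".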

In fact, off the top of my head, the only language whose native conversion behaves somewhat like std::stoi is JavaScript, and even then you have to force base 10 with parseInt(n, 10) to avoid the special case of hexadecimal numbers:

input      |  std::atoi       std::stoi      JavaScript      full check
===========+=============================================================
hello      |  0               error          error(NaN)      error
0xygen     |  0               0              error(NaN)      error
0x42       |  0               0              66              error
42x0       |  42              42             42              error
42         |  42              42             42              42
-----------+-------------------------------------------------------------
languages  |  Perl, Ruby,     JavaScript     JavaScript      C#, Java,
           |  PHP, C...       (base 10)                      Python...

Note: there are also differences among languages in the handling of whitespace and redundant + signs.
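If you would rather stay with std::stoi, its optional second argument reports how many characters were consumed, which is enough to detect trailing garbage. The wrapper below is only a sketch: stoi_whole is a hypothetical name, and reusing std::invalid_argument for the "leftover characters" case is my own choice.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: std::stoi plus a check that the whole string was used.
int stoi_whole(const std::string& text) {
    std::size_t consumed = 0;
    // std::stoi itself throws invalid_argument or out_of_range as usual.
    const int value = std::stoi(text, &consumed, 10);
    if (consumed != text.size())
        throw std::invalid_argument("stoi_whole: trailing characters");
    return value;
}
```

Plain std::stoi returns 42 for "42x0" and 0 for "0x42", as in the table, while stoi_whole throws for both. Leading whitespace and a redundant + sign are still accepted, so stoi_whole(" +42") returns 42.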


Ok, so I want full error checking, what should I use?

I'm not aware of any built-in function that does this, but boost::lexical_cast<int> will do what you want. It is particularly strict since it even rejects surrounding whitespace, unlike Python's int() function. Note that invalid characters and overflows result in the same exception, boost::bad_lexical_cast.

    #include <iostream>
    #include <string>
    #include <boost/lexical_cast.hpp>

    int main() {
        std::string s = "42";
        try {
            int n = boost::lexical_cast<int>(s);
            std::cout << "n = " << n << std::endl;
        } catch (const boost::bad_lexical_cast&) {
            // thrown for invalid characters and for overflow alike
            std::cout << "conversion failed" << std::endl;
        }
    }
Generic Human answered Oct 05 '22 20:10