 

int vs size_t on 64bit

I'm porting code from 32-bit to 64-bit. There are lots of places with

int len = strlen(pstr);

These all generate warnings now because strlen() returns size_t, which is 64-bit, while int is still 32-bit. So I've been replacing them with

size_t len = strlen(pstr);

But I just realized that this is not safe: size_t is unsigned, and the surrounding code may rely on the value being signed (I actually ran into one case where it caused a problem, thank you, unit tests!).

Blindly casting the strlen() return value to (int) feels dirty. Or maybe it shouldn't?
So the question is: is there an elegant solution for this? I probably have a thousand lines of code like that in the codebase; I can't manually check each one of them and the test coverage is currently somewhere between 0.01 and 0.001%.

asked Mar 25 '10 at 21:03 by MK.



1 Answer

Some time ago I posted a short note about this kind of issue on my blog, and the short answer is:

Always use proper C++ integer types

Long answer: When programming in C++, it’s a good idea to use the integer type appropriate to each particular context. A little bit of strictness always pays off. It’s not uncommon to see a tendency to ignore the integral types that standard containers define for themselves, namely size_type. It’s available for a number of standard containers, such as std::string or std::vector. Ignoring it can easily come back to bite you.

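Below is a simple example of an incorrectly chosen type for the result of the std::string::find function. I’m quite sure that many would expect nothing to be wrong with unsigned int here, but it is actually a bug. I run Linux on a 64-bit architecture, and when I compile this program as is, it works as expected. However, when I replace the string on the line marked [1] with "abc", it still runs, but not as expected :-) On such a system size_type is 64 bits wide while unsigned int is 32 bits, so the std::string::npos returned by find is truncated when stored in pos. The truncated value no longer compares equal to npos, and the program reports the delimiter as found.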

#include <iostream>
#include <string>
using namespace std;
int main()
{
  string s = "a:b:c"; // "abc" [1]
  char delim = ':';
  unsigned int pos = s.find(delim);
  if(string::npos != pos)
  {
    cout << delim << " found in " << s << endl;
  }
}

The fix is very simple: just replace unsigned int with std::string::size_type. The problem would have been avoided if whoever wrote this program had taken care to use the correct type. Not to mention that the program would have been portable straight away.

I’ve seen this kind of issue many times, especially in code written by former C programmers who do not like to wear the muzzle of strictness that the C++ type system enforces and requires. The example above is a trivial one, but I believe it presents the root of the problem well.

I recommend the brilliant article on 64-bit development written by Andrey Karpov, where you can find a lot more on the subject.

answered Oct 05 '22 at 21:10 by mloskot