
Why is ~size_t(0) (== 0xFFFFFFFF in most 32-bit systems) not a valid array index?

Quoting from this blog post:

http://www.codesynthesis.com/~boris/blog/2008/10/13/writing-64-bit-safe-code/

"This works because a valid memory index can only be in the [0, ~size_t(0)-1] range. The same approach, for example, is used in std::string."

So why is ~size_t(0) (which usually equals 0xFFFFFFFF on 32-bit systems) not a valid array index? I assume that if you have 32 bits, you should be able to reference the whole range [0, 0xFFFFFFFF], no?

asked Sep 04 '11 22:09

Roland

1 Answer

IMPORTANT NOTE: The term "memory index" is ambiguous and confusing. The linked article refers strictly to array indexes, not addresses in memory. It is entirely valid for size_t to be incapable of representing all memory addresses, which is why we have the intptr_t type in C99. Of course, this doesn't apply to your workstation, which undoubtedly has a simple von Neumann-type architecture. (The question has since been edited to remove references to "memory indexes".)

The C standard guarantees that size_t can hold the size of any array. However, for any array a[N], the standard guarantees that a + N must be a valid pointer and compare not equal to any pointer to an element of a.
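As a minimal sketch of the one-past-the-end rule, the loop below walks an array up to a + n without ever dereferencing that end pointer. The function names sum_ints and last_index are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the elements of an array by walking from a up to the
   one-past-the-end pointer a + n. The end pointer is only compared,
   never dereferenced. */
long sum_ints(const int *a, size_t n)
{
    const int *end = a + n;   /* valid to form, although a[n] does not exist */
    long total = 0;
    for (const int *p = a; p != end; ++p)
        total += *p;
    return total;
}

/* Largest valid index of an n-element array. It is always n - 1, so it is
   strictly smaller than the size n that size_t has to be able to hold. */
size_t last_index(size_t n)
{
    assert(n > 0);            /* an empty array has no valid index */
    return n - 1;
}
```

For a four-element array, sum_ints(a, 4) visits a[0] through a[3] and stops when p equals a + 4; last_index(4) is 3.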

Therefore, size_t must be able to represent at least one value larger than any possible array index. Since ~(size_t)0 is guaranteed to be the maximum size_t value, it is a good choice of sentinel for array indexes.
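A sketch of how such a sentinel might be used, in the spirit of std::string::npos. The names NOT_FOUND and find_int are hypothetical and do not come from the linked article.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel: the maximum size_t value, which can never be a
   valid array index. */
#define NOT_FOUND ((size_t)-1)

/* Return the index of the first element equal to key,
   or NOT_FOUND if no element matches. */
size_t find_int(const int *a, size_t n, int key)
{
    assert(a != NULL || n == 0);   /* a null array is only allowed when empty */
    for (size_t i = 0; i < n; ++i)
        if (a[i] == key)
            return i;
    return NOT_FOUND;
}
```

Because every real index is at most n - 1 and n itself fits in size_t, a caller can compare the result against NOT_FOUND without any risk of a false match.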

Discussion:

1. Why is ~(size_t)0 guaranteed to be the maximum? Because the standard explicitly says so, in §6.5.3.3: "If the promoted type is an unsigned type, the expression ~E is equivalent to the maximum value representable in that type minus E." Note that (size_t)-1 is also guaranteed to be the maximum, by the rules for converting signed values to unsigned types. Unfortunately, SIZE_MAX (defined in <stdint.h> since C99) is not always easy to find on your platform, so (size_t)-1 and ~(size_t)0 are preferred. (Note that the guarantee for ~(size_t)0 no longer holds if int can represent SIZE_MAX, because the operand is then promoted to int before ~ is applied… but this isn't something that would happen in a real system.)

2. What is the size of an array indexed from 0 to ~(size_t)0? Such an array cannot exist according to the C standard, by the argument outlined at the top of this post: its size would be one more than the maximum size_t value, which size_t cannot represent.

3. Claim: "If you malloc(-1), the resulting memory region would have to start at 0." This is false. There are a lot of really bizarre cases which the standard allows but one doesn't encounter in practice. For example, imagine a system where (uintptr_t)-1 > (size_t)-1. The C standard is worded exactly the way it is because C doesn't just run on your PC; it runs on bizarre little DSPs with Harvard architectures, and it runs on archaic systems with byzantine memory segmenting schemes. There are also some systems of historical interest where NULL pointers do not have the same representation as 0.
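The first discussion point can be checked with a small sketch, assuming a C99 compiler that provides SIZE_MAX in <stdint.h>. The function names are hypothetical. The second function shows the promotion caveat on a type that really is narrower than int, and assumes a two's-complement int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX (C99) */

/* Aborts via assert if either spelling disagrees with SIZE_MAX.
   The first check relies on size_t not being narrower than int;
   the second holds regardless. */
void check_size_max_spellings(void)
{
    assert(~(size_t)0 == SIZE_MAX);
    assert((size_t)-1 == SIZE_MAX);
}

/* Integer promotion in action: the unsigned char operand is promoted to
   int before ~ is applied, so the result is an int with all bits set,
   not UCHAR_MAX. */
int not_of_uchar_zero(void)
{
    unsigned char zero = 0;
    return ~zero;
}
```

On a typical two's-complement platform, not_of_uchar_zero() returns -1 rather than 255. That is the same effect the parenthetical note warns about for a hypothetical size_t narrower than int.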

answered Oct 25 '22 18:10

Dietrich Epp
