C++ signed and unsigned int vs long long speed

Q: Is int faster than long long?

The code could be ran on a 16-bit platform with 32-bit long and 16-bit int on which the int would probably be faster - but not necessarily. On the other hand, on a native 32-bit platform which has 32-bit int and 64-bit long , the long could be faster - but not necessarily.

Q: Which is faster int or unsigned int?

While arithmetic "might" be faster using int , one should remember that integer arithmetic is rarely a bottleneck (on modern desktop cpus at least), memory bandwidth on the other hand often is, so for large datasets short might actually give considerably better performance then int.

Q: What is the difference between unsigned long and unsigned long long?

Unsigned long variables are extended size variables for number storage, and store 32 bits (4 bytes). Unlike standard longs unsigned longs won't store negative numbers, making their range from 0 to 4,294,967,295 (2^32 - 1).

Q: Does unsigned int use less memory?

When no negative numbers are required, unsigned integers are well-suited for networking and systems with little memory, because unsigned integers can store more positive numbers without taking up extra memory. New programmers sometimes get signed and unsigned mixed up.

Tags:

c++

performance

bit-manipulation

Today, I noticed that the speed of several simple bitwise and arithmetic operations differs significantly between int, unsigned, long long and unsigned long long on my 64-bit pc.

In particular, the following loop is about twice as fast for unsigned as for long long, which I didn't expect.

int k = 15;
int N = 30;
int mask = (1 << k) - 1;
while (!(mask & 1 << N)) {
    int lo = mask & ~(mask - 1);
    int lz = (mask + lo) & ~mask;
    mask |= lz;
    mask &= ~(lz - 1);
    mask |= (lz / lo / 2) - 1;
}

(full code here)

Here are the timings (in seconds) (for g++ -O, -O2 and -O3):

1.834207723 (int)
3.054731598 (long long)
1.584846237 (unsigned)
2.201142018 (unsigned long long)

These times are very consistent (i.e. a 1% margin). Without -O flag, each one is about one second slower, but the relative speeds are the same.

Is there a clear reason for this? Vectorization might be it for the 32-bit types, but I can't see where the huge difference between long long and unsigned long long comes from. Are some operations just much slower on some types than on others, or is it just a general thing that 64-bit types are slower (even on 64-bit architectures)?

For those interested, this loop loops over all subsets of {1,2,...,30} with exactly 15 elements. This is done by looping (in order) over all integers less than 1<<30 with exactly 15 bits set. For the current case, that's 155117520 iterations. I don't know the source of this snippet anymore, but this post explains some more.

Edit

It seems from the assembly code that the division can be made faster when the type is unsigned. I think that makes sense, because we don't have to account for the sign bit.

Furthermore, the 32-bit operations use movl and other xxxl instructions, while the 64-bit operations use movq and xxxq.

Edit 2

After reading the post I linked, I decided to use the formula given over there:

T k = 15;
T N = 30;
T mask = (1 << k) - 1;
while (!(mask & 1 << N)) {
    T t = mask | (mask - 1);
    mask = (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(mask) + 1));
}

This runs in about a third of the time of the code posted above, and uses the same time for all four types.

492

asked Nov 21 '15 20:11

Ragnar

1 Answers

The slowest operation in your code is

mask |= (lz / lo / 2) - 1

32-bit division is significantly faster than 64-bit division. For example, on Ivy Bridge, 32 bit IDIV takes 19-26 clocks while 64 bit IDIV takes 28-103 clocks latency.

The unsigned version is also faster than signed because the division by 2 is simple bit shift in the unsigned case and there is no size expansion calls (CDQ, CQO).

in the unsigned case is simple bit shift while in signed

[1] http://www.agner.org/optimize/instruction_tables.pdf

answered Sep 26 '22 05:09

Andrey Nasonov

Related questions
                            
                                A more efficient way than function reference?
                            
                                Variadic template recursive return type deduction compilation error
                            
                                Must the C++ standard library support classes that are picky about who their friends are?
                            
                                Static constexpr members of same type as class defined (additional details)
                            
                                std::initializer_list type deduction
                            
                                Draw a graph in opengl
                            
                                How to best handle C++ object initialization: empty constructors or pointers?
                            
                                C++: std::vector - It is possible to "slice" a vector?
                            
                                MinGW-W64 linker skipping incompatible files when searching for third-party static libraries
                            
                                Are the fixed width integer types guaranteed to be typedefs for the standard built in types?
                            
                                Range-v3: Use view_facade to provide both const and non-const iterators
                            
                                partial specialization ordering with non-deduced context
                            
                                /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.15' not found
                            
                                How do I include libraries to a project in Visual Studio?
                            
                                Delphi: Calling a C dll function with Debugger takes 15 s without debugger 0.16 s. Why?
                            
                                Is it possible to introspect on methods using Boost Hana?
                            
                                How to avoid code duplication issue in this case?
                            
                                std::equal_range with lambda
                            
                                How to use node.js built in modules from C++ addons
                            
                                Memory footprint of unique_ptr [duplicate]

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With