How does this way of calculating large numbers work?

Question

Typically, to handle integers that exceed the range of long long in C++, you'd have to represent them as strings and perform operations on them that way. But I found this code on the internet which seems to work like magic. It calculates the sum of the powers of two from 2^1 to 2^n (that is, without 2^0), even though the result can't be stored in a long long.

#include <iostream>  
#include <cmath>  
#include <iomanip>  
#include <sstream>  
using namespace std;

int main() {
    int n;
    stringstream ss;
    cin >> n;

    ss << fixed << setprecision(0) << pow(2, n + 1) - 2;

    if (n >= 54) {
        string a = ss.str();

        a[a.size() - 1] = ((a[a.size() - 1] - 48) - 2) + 48;

        cout << a;
        return 0;
    }

    cout << ss.str();

}

How does it work? Will it work for any operation involving large numbers? If the value of n is very big (I tried 1024) it just prints "inf". What's the upper limit of the range of numbers that can be calculated this way?

What exactly does the following part do and why does it do it?

a[a.size() - 1] = ((a[a.size() - 1] - 48) - 2) + 48;

eerorika · Accepted Answer

Will it work for any operation involving large numbers?

You can perform the same operations with floating point as you can with integers. But each calculation can introduce a rounding error, and not all integers are representable.

What's the upper-limit of the range of numbers that can be calculated this way?

Depends on the type of double precision floating point that your processor uses.

You can find out the highest representable number with std::numeric_limits<double>::max(). However, the precision is very poor at such high magnitudes: most integers below this number cannot be represented. The largest value up to which every integer is representable is std::pow(std::numeric_limits<double>::radix, std::numeric_limits<double>::digits).
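As a rough illustration of that limit, here is a minimal sketch. The function names max_contiguous_integer and plus_one_is_lost are made up for this example, and the comments assume an IEEE-754 binary64 double evaluated in double precision.

```cpp
#include <cmath>
#include <limits>

// Largest M such that every integer in [0, M] is exactly representable
// as a double. Assumption: IEEE-754 binary64, where this is 2^53.
double max_contiguous_integer() {
    using lim = std::numeric_limits<double>;
    return std::pow(lim::radix, lim::digits);
}

// True if adding 1 to x is lost to rounding, i.e. x + 1 == x.
bool plus_one_is_lost(double x) {
    const double y = x + 1.0;
    return y == x;
}
```

With those assumptions, plus_one_is_lost(max_contiguous_integer()) should be true, while the same call on any smaller non-negative integer value should be false.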

What exactly does the following part do and why does it do it?
a[a.size() - 1] = ((a[a.size() - 1] - 48) - 2) + 48;

This can be simplified to

a[a.size() - 1] -= 2;

It simply subtracts 2 from the last (lowest) digit. It relies on the mathematical fact that a power of 2 is never 0 or 1 modulo 10 (except 2⁰, which is 1); if it were, the last digit would become a non-digit character.
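To see why the subtraction stays within '0'..'9', note that the last digit of 2^n for n >= 1 cycles through 2, 4, 8, 6. A small sketch of both facts follows; last_digit_of_pow2 and subtract_two_from_last_digit are hypothetical helper names, not part of the original code.

```cpp
#include <string>

// Last decimal digit of 2^n for n >= 1. The digits repeat with period 4:
// 2, 4, 8, 6, 2, 4, 8, 6, ...
char last_digit_of_pow2(unsigned n) {
    static const char cycle[] = {'6', '2', '4', '8'};  // indexed by n % 4
    return cycle[n % 4];
}

// Subtracts 2 from the last digit character only. This is safe when the
// last digit is 2, 4, 6 or 8, because no borrow is ever needed.
std::string subtract_two_from_last_digit(std::string s) {
    s[s.size() - 1] -= 2;
    return s;
}
```

For example, subtract_two_from_last_digit("1024") yields "1022", which is 2^10 - 2.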

It also relies on the fact that pow(2, n + 1) - 2 == pow(2, n + 1) for n >= 54. The code assumes that the floating point follows the ubiquitous binary IEEE-754 format, in which std::numeric_limits<double>::digits is 53, so std::pow(std::numeric_limits<double>::radix, std::numeric_limits<double>::digits) is std::pow(2, 53). When n is greater than or equal to 54, the result of std::pow(2, n + 1) is at least std::pow(2, 55), which is so large that if you subtract a small number such as 2 from it, the closest representable result is the same value you started with: just below std::pow(2, 55) adjacent doubles are 4 apart, and the halfway case rounds back up. The accuracy error of the calculation is equal to the smaller operand! That subtraction simply cannot be performed with floating point numbers at this magnitude. That is why it is fixed afterwards with digit character fiddling.
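The threshold can be checked directly. This is only a sketch: minus_two_is_absorbed is a name invented here, and the stated results assume IEEE-754 binary64 with the default round-to-nearest-even mode.

```cpp
#include <cmath>

// True if subtracting 2 from 2^(n+1) is lost to rounding in double
// arithmetic, i.e. the difference compares equal to the original power.
bool minus_two_is_absorbed(int n) {
    const double p = std::pow(2.0, n + 1);
    const double q = p - 2.0;  // rounded to the nearest representable double
    return q == p;
}
```

Under those assumptions the function returns false for n = 53 (2^54 - 2 is still representable) and true for n = 54, which matches the n >= 54 test in the question's code.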

All powers of 2 (up to the limit) are representable, so the power calculation itself never has any accuracy error.

AnT · Answer

You are looking at a rather ham-fisted implementation of a relatively simple trick.

It is based on the fact that a binary floating-point representation (e.g. the IEEE 754 one) can represent 2^N precisely for rather large values of N (limited by the range of the exponent part of the representation).

This means that in a properly implemented standard library by doing this

unsigned N = ...;

double d = std::pow(2.0, N);
std::stringstream str;
str << std::fixed << std::setprecision(0) << d;
std::string s = str.str();

you can obtain the exact decimal representation of 2^N for such large values of N.
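Wrapped into a function, the snippet above might look like the following sketch. The name pow2_decimal is made up for illustration, and the result is only exact if the standard library both computes std::pow exactly for powers of two and prints all digits of a double, which is an assumption about the implementation (glibc behaves this way).

```cpp
#include <cmath>
#include <iomanip>
#include <sstream>
#include <string>

// Returns the decimal digits of 2^n by formatting the double with fixed
// notation and zero fractional digits. Assumes 2^n does not overflow
// (n <= 1023 for IEEE-754 binary64) and that the library prints exactly.
std::string pow2_decimal(unsigned n) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(0) << std::pow(2.0, n);
    return out.str();
}
```

For instance, pow2_decimal(64) would give "18446744073709551616", one more than the largest value of a 64-bit unsigned integer.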

Now if you take into account the fact that decimal representation of 2^N (N > 0) never ends in ...0 or an odd digit, you should understand that adding 1 or subtracting 1 or 2 from the resultant decimal representation can only modify its last digit (never produces a carry or borrow). This means that you can calculate 2^N+k for k = -2,-1,0,+1 by simply following the above with

s[s.size() - 1] += k;

If you additionally observe that a power of 2 cannot end in ...98, you should realize that representations for k = +2,+3 can be obtained by

if ((s[s.size() - 1] += k) > '9')
{
  s[s.size() - 1] -= 10;
  ++s[s.size() - 2];
}

since any possible carry will never propagate more than 1 step. (For brevity, I omitted the check for length.)

Similarly, since a power of 2 cannot end in ...02, representations for k = -3,-4 can be obtained by

if ((s[s.size() - 1] += k) < '0')
{
  s[s.size() - 1] += 10;
  --s[s.size() - 2];
}

In other words, in the original code there was no real need to subtract 2 early (in pow(2, n + 1) - 2). And there was no real need to involve that 48 in last digit adjustment expression.
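Putting the three cases together, including the length check that was omitted above, one possible sketch is shown below. The function name add_small_offset and its error handling are choices made for this example only; it expects the exact decimal digits of 2^N with N >= 1 and an offset k between -4 and +3.

```cpp
#include <stdexcept>
#include <string>

// Computes the decimal string of 2^N + k from the decimal string of 2^N.
// Preconditions (not verified beyond basic checks): s holds the digits of
// a power of two with N >= 1, and -4 <= k <= 3.
std::string add_small_offset(std::string s, int k) {
    if (s.empty() || k < -4 || k > 3)
        throw std::invalid_argument("expected digits of 2^N and -4 <= k <= 3");

    int d = (s.back() - '0') + k;
    if (d > 9) {
        // Carry: a power of two never ends in ...98, so one step suffices.
        d -= 10;
        if (s.size() == 1) s.insert(s.begin(), '1');  // e.g. 8 + 2 = 10
        else ++s[s.size() - 2];
    } else if (d < 0) {
        // Borrow: a power of two never ends in ...02, so one step suffices.
        if (s.size() == 1)
            throw std::domain_error("negative result is not supported");
        d += 10;
        --s[s.size() - 2];
    }
    s.back() = static_cast<char>('0' + d);
    return s;
}
```

As a quick check, add_small_offset("128", 3) gives "131" and add_small_offset("512", -4) gives "508".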

Tags:

c++

stringstream

largenumber

user3071028
