In C and Objective-C, what really is the right way to truncate a float or double to an integer?

I have worked mostly with integers, and in situations where I needed to truncate a float or double to an integer, I used to write:

(int) someValue

That worked until I discovered the following:

NSLog(@"%i", (int) ((1.2 - 1) * 10));     // prints 1
NSLog(@"%i", (int) ((1.2f - 1) * 10));    // prints 2

(please see "Strange behavior when casting a float to int in C#" for the explanation).

The short question is: how should we properly truncate a float or double to an integer? (Truncation is wanted in this case, not rounding.) One could argue that since one intermediate value is roughly 1.9999999999999 and the other is roughly 2.00000000000001, the truncation is actually performed correctly. So the question becomes: how should we convert a float or double so that the result is a "truncated" number that makes common-usage sense?

(The intention is not to use round, because for 1.8 we do want a result of 1, not 2.)


Longer question:

I used

int truncateToInteger(double a) {
    // nudge the value up by a tiny fudge factor before truncating
    return (int) (a + 0.000000000001);
}

-(void) someTest {
    NSLog(@"%i", truncateToInteger((1.2 - 1) * 10));
    NSLog(@"%i", truncateToInteger((1.2f - 1) * 10));
}

Both now print 2, but this seems too much of a hack. What small number should we add to "remove the inaccuracy"? Is there a more standard or studied approach, rather than such an arbitrary hack?

(Note that in some usages we want truncation, not rounding: for example, if 90 or 118 seconds have elapsed, the display should show 1 minute, and should not round up to 2.)
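For illustration, a minimal sketch of that elapsed-time display (the function name formatElapsed is hypothetical):

// Integer division truncates toward zero, which is exactly the
// behavior wanted for the minutes display.
void formatElapsed(int totalSeconds) {
    int minutes = totalSeconds / 60;    // 118 / 60 == 1, not 2
    int seconds = totalSeconds % 60;    // 118 % 60 == 58
    NSLog(@"%i min %i s", minutes, seconds);
}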

asked Jun 28 '12 by nonopolarity


4 Answers

The truncation has been performed correctly, of course, but on an inaccurate intermediate value.

In general there's no way to know whether your 1.999999 result is a slightly inaccurate 2 (so the exact-maths result after truncation is 2), or a slightly inaccurate 1.999998 (so the exact-maths result after truncation is 1).

For that matter, for some calculations you could get 2.000001 as a slightly inaccurate 1.999998. Pretty much whatever you do, you'll get that one wrong. Truncation is a non-continuous function, so however you do it, it makes your overall computation numerically unstable.
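A small demonstration of that ambiguity (the computed value of (1.2 - 1) * 10 happens to be exactly 1.9999999999999996):

double fromError = (1.2 - 1) * 10;      // exact answer would be 2
double genuine   = 1.9999999999999996;  // exact answer truncates to 1
NSLog(@"%i", fromError == genuine);     // prints 1: the two are indistinguishable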

You could add an arbitrary tolerance anyway: (int)(x > 0 ? x + epsilon : x - epsilon). It may or may not help, depending on what you're doing, which is why it's a "hack". epsilon could be a constant, or it could scale according to the size of x.
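A minimal sketch of that tolerance approach (the value of EPSILON is an arbitrary assumption, as noted above):

// Nudge the value away from zero by a tolerance before truncating,
// so values a hair short of an integer land on it. The choice of
// EPSILON is arbitrary and may be wrong for some inputs.
#define EPSILON 1e-9

int truncateWithTolerance(double x) {
    return (int) (x > 0 ? x + EPSILON : x - EPSILON);
}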

The most common solution to your second question isn't to "remove the inaccuracy", rather to accept the inaccurate result as if it were accurate. So, if your floating point unit says that (1.2-1)*10 is 1.999999, OK, it is 1.999999. If that value represents a number of minutes then it truncates to 1 minute 59 seconds. Your final displayed result will be 1s off the true value. If you need a more accurate final displayed result than that, then you shouldn't have used floating-point arithmetic to compute it, or perhaps you should have rounded to the nearest second before truncating to minutes.
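For example, a sketch of rounding to the nearest second before truncating to minutes (the function name is illustrative):

#include <math.h>

// Round the minutes value to whole seconds first, then truncate:
// 1.999999 minutes rounds to 120 seconds, which gives 2 minutes.
int wholeMinutes(double minutes) {
    long seconds = lround(minutes * 60.0);  // nearest whole second
    return (int) (seconds / 60);            // integer division truncates
}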

Any attempt to "remove" inaccuracy from a floating-point number is actually just going to move inaccuracy around - some inputs will give more accurate results, others less accurate. If you're lucky enough to be in a case where the inaccuracy is shifted to inputs you don't care about, or can filter out before doing the computation, then you win. In general, though, if you have to accept any input then you're going to lose somewhere. You need to look at how to make your computation more accurate, rather than trying to remove inaccuracy in a truncation step at the end.

There's a simple correction for your example computation -- use fixed-point arithmetic with one base-10 decimal place. We know that format can accurately represent 1.2. So, instead of writing (1.2 - 1) * 10, you should rescale the computation to use tenths (write (12 - 10) * 10) and then divide the final result by 10 to scale it back to units.
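A sketch of that rescaling, working in tenths throughout and scaling back only at the end:

// All values are integer counts of tenths, so 1.2 is exactly 12 tenths
// and no rounding error can creep in.
int tenths = (12 - 10) * 10;    // (1.2 - 1) * 10, in tenths: 20
NSLog(@"%i", tenths / 10);      // scale back to units: prints 2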

answered Sep 29 '22 by Steve Jessop


As you have modified your question, the problem now seems to be this: Given some inputs x, you calculate a value f'(x). f'(x) is the calculated approximation to an exact mathematical function f(x). You want to calculate trunc(f(x)), that is, the integer i that is farthest from zero without being farther from zero than f(x) is. Because f'(x) has some error, trunc(f'(x)) might not equal trunc(f(x)), such as when f(x) is 2 but f'(x) is 0x1.fffffffffffffp0. Given f'(x), how can you calculate trunc(f(x))?

This problem is impossible to solve. There is no solution that will work for all f.

The reason there is no solution is that, due to the error in f', f'(x) might be 0x1.fffffffffffffp0 because f(x) is 0x1.fffffffffffffp0, or f'(x) might be 0x1.fffffffffffffp0 because of calculation errors even though f(x) is 2. Therefore, given a particular value of f'(x), it is impossible to know what trunc(f(x)) is.

A solution is possible only given detailed information about f (and the actual operations used to approximate it with f'). You have not given that information, so your question cannot be answered.

Here is a hypothesis: Suppose the nature of f(x) is such that its results are always a non-negative multiple of q, for some q that divides 1. For example, q might be .01 (hundredths of a coordinate value) or 1/60 (whole seconds, where f is in units of minutes). And suppose the values and operations used in calculating f' are such that the error in f' is always less than q/2.

In this very limited, and hypothetical, case, then trunc(f(x)) can be calculated by calculating trunc(f'(x)+q/2). Proof: Let i = trunc(f(x)). Suppose i > 0. Then i <= f(x) < i+1, so i <= f(x) <= i+1-q (because f(x) is quantized by q). Then i-q/2 < f'(x) < i+1-q+q/2 (because f'(x) is within q/2 of f(x)). Then i < f'(x)+q/2 < i+1. Then trunc(f'(x)+q/2) = i, so we have the desired result. In the case where i = 0, then -1 < f(x) < 1, so -1+q <= f(x) <= 1-q, so -1+q-q/2 < f'(x) < 1-q+q/2, so -1+q < f'(x)+q/2 < 1, so trunc(f'(x)+q/2) = 0.

(Note: If q/2 is not exactly representable in the floating-point precision used or cannot be easily added to f'(x) without error, then some adjustments have to be made in either the proof, its conditions, or the addition of q/2.)
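A minimal sketch of that limited case (the function name truncQuantized is illustrative; it assumes f(x) is a non-negative multiple of q and that the computed value is within q/2 of the true value):

// Adding q/2 before truncating recovers trunc(f(x)) exactly,
// per the proof above, provided the stated conditions hold.
int truncQuantized(double fp, double q) {
    return (int) (fp + q / 2);
}

// For the (1.2 - 1) * 10 example, the exact results are multiples of 0.1:
NSLog(@"%i", truncQuantized((1.2 - 1) * 10, 0.1));  // prints 2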

If that case does not serve your purpose, then you cannot expect an answer except by providing detailed information about f and the operations and values used to calculate f'.

answered Sep 29 '22 by Eric Postpischil


The 'hack' is the proper way to do it; that's simply how floats work. If you want saner decimal behavior, NSDecimalNumber might be what you want.
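For example, a sketch of the same computation done with NSDecimalNumber, which can represent 1.2 exactly in decimal:

NSDecimalNumber *a = [NSDecimalNumber decimalNumberWithString:@"1.2"];
NSDecimalNumber *b = [NSDecimalNumber decimalNumberWithString:@"1"];
NSDecimalNumber *ten = [NSDecimalNumber decimalNumberWithString:@"10"];

// (1.2 - 1) * 10 computed in exact decimal arithmetic
NSDecimalNumber *result = [[a decimalNumberBySubtracting:b]
                              decimalNumberByMultiplyingBy:ten];
NSLog(@"%i", [result intValue]);  // prints 2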

answered Sep 29 '22 by Hampus Nilsson


NSLog(@"%i", [[NSNumber numberWithFloat:((1.2 - 1) * 10)] intValue]); //2
NSLog(@"%i", [[NSNumber numberWithFloat:(((1.2f - 1) * 10))] intValue]); //2 
NSLog(@"%i", [[NSNumber numberWithFloat:1.8] intValue]); //1
NSLog(@"%i", [[NSNumber numberWithFloat:1.8f] intValue]); //1
NSLog(@"%i", [[NSNumber numberWithDouble:2.0000000000001 ] intValue]);//2

answered Sep 29 '22 by Parag Bafna