 

Overhead of casting double to float?

Tags:

c++

c

So I have megabytes of data stored as doubles that need to be sent over a network... now I don't need the precision that a double offers, so I want to convert these to a float before sending them over the network. What is the overhead of simply doing:

float myFloat = (float)myDouble;

I'll be doing this operation several million times every few seconds and don't want to slow anything down. Thanks

EDIT: My platform is x64 with Visual Studio 2008.

EDIT 2: I have no control over how they are stored.
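For context, the cast in question would typically run over a whole buffer before sending. A minimal sketch of that bulk conversion, assuming the doubles sit in a contiguous container (`to_floats` is a hypothetical helper name, not an existing API):

```cpp
#include <cstddef>
#include <vector>

// Convert a buffer of doubles to floats before sending.
// On x64, each cast typically compiles to a single CVTSD2SS
// instruction, and the compiler can often auto-vectorize the loop.
std::vector<float> to_floats(const std::vector<double>& src) {
    std::vector<float> dst;
    dst.reserve(src.size());          // one allocation up front
    for (double d : src)
        dst.push_back(static_cast<float>(d));
    return dst;
}
```

The conversion itself is the cheap part; the extra allocation and copy are what you would tune if this ever showed up in a profile.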

asked Sep 23 '09 by Polaris878


1 Answer

As Michael Burr said, while the overhead strongly depends on your platform, it is definitely less than the time needed to send the data over the wire.


A rough estimate:

800 Mbit/s payload on an excellent Gigabit wire means 25M floats/second.

On a 2 GHz single core, that gives you a whopping 80 clock cycles for each value converted to break even - anything less, and you will save time. That should be more than enough on all architectures :)

A simple load-store cycle (barring all caching delays) should be below 5 cycles per value. With instruction interleaving, SIMD extensions and/or parallelizing on multiple cores, you are likely to do multiple conversions in a single cycle.
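To illustrate the SIMD point: SSE2 (guaranteed on every x64 CPU) has the CVTPD2PS instruction, exposed as the `_mm_cvtpd_ps` intrinsic, which converts two doubles per instruction. A hedged sketch, assuming an even element count for brevity (`convert_sse2` is a made-up name; a real version would handle the odd tail scalar-wise):

```cpp
#include <cstddef>
#include <emmintrin.h>  // SSE2 intrinsics

// Convert pairs of doubles to floats with CVTPD2PS.
// Precondition (for this sketch): n is even.
void convert_sse2(const double* src, float* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; i += 2) {
        __m128d d = _mm_loadu_pd(src + i);   // load two doubles (unaligned ok)
        __m128  f = _mm_cvtpd_ps(d);         // -> two floats in the low lanes
        _mm_storel_pi(reinterpret_cast<__m64*>(dst + i), f); // store 8 bytes
    }
}
```

In practice a modern compiler will generate equivalent code from the plain scalar loop at `-O2`, so the intrinsic version mostly serves to show what the hardware is doing.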

Also, the receiver will be happy having to handle only half the data. Remember that memory access time is nonlinear.


The only thing arguing against the conversion would be if the transfer should have minimal CPU load: a modern architecture could transfer the data from disk/memory to the bus without CPU intervention. However, with the above numbers I'd say that doesn't matter in practice.

[edit]
I checked some numbers: the 387 coprocessor would indeed have taken around 70 cycles for a load-store cycle. On the original Pentium, you are down to 3 cycles without any parallelization.

So, unless you run a gigabit network on a 386...

answered Oct 19 '22 by peterchen