I have been doing some research into how sine and cosine can be calculated. I found a few "standard" methods, including a lookup table, the CORDIC algorithm, and Taylor series. I also found that most modern processors have assembler instructions for calculating trigonometric functions. What I want to know is how those instructions work.
So, my question is: What specific algorithm do current gen processors use for calculating sine and cosine?
Typically, high-resolution sin(x) functions are implemented with a CORDIC (COordinate Rotation DIgital Computer) algorithm, which needs only a small number of iterations using nothing but shifts, add/subtract operations, and a small lookup table.
Calculators don't actually use the Taylor series but the CORDIC algorithm to find values of trigonometric functions. The CORDIC algorithm is based on treating the angle as the phase of a complex number in the complex plane, and then rotating that complex number by multiplying it by a succession of constant values. Those constants can all be chosen of the form 1 ± i·2^(-k), so each "multiplication" reduces to a shift and an add/subtract.
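To make that concrete, here is a small rotation-mode CORDIC sketch in C. It is only an illustration under my own assumptions (the name cordic_sincos, the 28 iterations, and the Q2.30 fixed-point format are choices I made, not what any particular chip uses), and the atan(2^-i) table and gain constant are precomputed with libm purely for readability. The point is that the core loop uses nothing but shifts, adds/subtracts, and table lookups; it converges for angles roughly in [-pi/2, pi/2], so a real implementation would reduce the argument into that range first.

```c
/* Illustrative rotation-mode CORDIC in Q2.30 fixed point.
 * Not any specific chip's algorithm; iteration count and format are
 * arbitrary choices for this sketch. */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

#define ITERS 28
#define SCALE (1 << 30)   /* Q2.30: the value 1.0 is stored as 2^30 */

/* Computes sin/cos for theta roughly in [-pi/2, pi/2]. */
static void cordic_sincos(double theta, double *s, double *c)
{
    /* Small lookup table of atan(2^-i) and the CORDIC gain K.
     * Hardware would bake these constants in; we compute them here
     * with libm only to keep the sketch short and readable. */
    int32_t atan_tab[ITERS];
    double k = 1.0;
    for (int i = 0; i < ITERS; i++) {
        double a = atan(ldexp(1.0, -i));          /* atan(2^-i) */
        atan_tab[i] = (int32_t)lrint(a * SCALE);
        k *= cos(a);                              /* accumulate gain K */
    }

    int32_t x = (int32_t)lrint(k * SCALE);   /* start at (K, 0): final length 1 */
    int32_t y = 0;
    int32_t z = (int32_t)lrint(theta * SCALE);   /* residual angle to rotate */

    /* Core loop: only shifts, adds/subtracts and table lookups
     * (relies on arithmetic right shift of negatives, as real compilers do). */
    for (int i = 0; i < ITERS; i++) {
        int32_t xs = x >> i, ys = y >> i;
        if (z >= 0) { x -= ys; y += xs; z -= atan_tab[i]; }
        else        { x += ys; y -= xs; z += atan_tab[i]; }
    }

    *c = (double)x / SCALE;   /* x converges to cos(theta) */
    *s = (double)y / SCALE;   /* y converges to sin(theta) */
}

int main(void)
{
    double s, c;
    cordic_sincos(0.5, &s, &c);
    printf("CORDIC: sin=%.9f cos=%.9f\n", s, c);
    printf("libm:   sin=%.9f cos=%.9f\n", sin(0.5), cos(0.5));
    return 0;
}
```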
Sine and cosine, a.k.a. sin(θ) and cos(θ), are functions that describe the side ratios of a right triangle. Looking out from a vertex with angle θ, sin(θ) is the ratio of the opposite side to the hypotenuse, while cos(θ) is the ratio of the adjacent side to the hypotenuse.
Sine and cosine functions can be used to model many real-life scenarios – radio waves, tides, musical tones, electrical currents.
The answer to a related, but different question here talks of how FPUs perform such instructions:
Once you've reduced your argument, most chips use a CORDIC algorithm to compute the sines and cosines. You may hear people say that computers use Taylor series. That sounds reasonable, but it's not true. The CORDIC algorithms are much better suited to efficient hardware implementation. (Software libraries may use Taylor series, say on hardware that doesn't support trig functions.) There may be some additional processing, using the CORDIC algorithm to get fairly good answers but then doing something else to improve accuracy.
Note, though, that it says "most chips": improving performance, accuracy, or (ideally) both is obviously something chip manufacturers strive for, so there will be differences between them.
Those differences may well trade accuracy for performance, or vice versa (and of course an implementation can simply be bad at both, since we live in an imperfect world), so there are times when you might prefer to run the algorithm on the CPU yourself, in software, rather than hand the work to the FPU via an instruction such as fsin. A software version of the reduce-then-approximate approach the quote describes might look like the sketch below.
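This is only a toy illustration of the structure (the name taylor_sin, the term count, and the crude fmod-based reduction are my own choices): real math libraries reduce the argument far more carefully (e.g. Payne-Hanek reduction) and evaluate minimax polynomials rather than raw Taylor coefficients.

```c
/* Toy "software library" route: crude range reduction followed by a
 * truncated Taylor series. For illustration only. */
#include <math.h>
#include <stdio.h>

static const double PI = 3.14159265358979323846;

static double taylor_sin(double x)
{
    /* Step 1: reduce the argument to [-pi, pi]. An fmod-based reduction
     * loses precision for huge |x|, which is exactly why real libraries
     * spend so much effort on this step. */
    x = fmod(x, 2.0 * PI);
    if (x >  PI) x -= 2.0 * PI;
    if (x < -PI) x += 2.0 * PI;

    /* Step 2: sum sin(x) = x - x^3/3! + x^5/5! - ... term by term. */
    double term = x;   /* current term, starting with x^1/1! */
    double sum  = x;
    for (int n = 1; n <= 10; n++) {
        term *= -x * x / ((2.0 * n) * (2.0 * n + 1.0));
        sum  += term;
    }
    return sum;
}

int main(void)
{
    double x = 100.0;
    printf("taylor_sin(%g) = %.12f\n", x, taylor_sin(x));
    printf("libm sin(%g)   = %.12f\n", x, sin(x));
    return 0;
}
```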
This archived blog post talks about how Sun's implementation of the JVM on Intel only issues a plain call to fsin for inputs within a certain range, because of flaws in that instruction's implementation. The paper linked from that article presumably discusses fsin's implementation, and its issues, in more detail, but you'll need to be a subscriber or pay to read it (which I have therefore not done).
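If you want to compare the two routes on your own machine, one way (assuming GCC-style inline assembly on x86/x86-64 with the x87 unit available; this is just an experiment of mine, not how any library calls it) is to invoke fsin directly and put its result next to the C library's. Large arguments are where fsin's limited internal range reduction tends to show.

```c
/* Sketch: call the x87 fsin instruction directly via GCC-style inline
 * assembly (x86/x86-64 only) and compare with libm's sin().
 * The "t" constraint names the top of the x87 register stack, so "+t"
 * loads x onto st(0), runs fsin in place, and reads the result back. */
#include <math.h>
#include <stdio.h>

static double x87_sin(double x)
{
    __asm__ ("fsin" : "+t"(x));   /* st(0) <- sin(st(0)) */
    return x;
}

int main(void)
{
    double x = 1e10;   /* large arguments stress fsin's range reduction */
    printf("x87 fsin: %.17g\n", x87_sin(x));
    printf("libm sin: %.17g\n", sin(x));
    return 0;
}
```

Compile with something like gcc -O2 fsin_test.c -lm; on systems whose libm does its own careful range reduction, the two results usually differ noticeably for inputs this large, which is the kind of discrepancy the blog post is about.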