Why do you program in assembly? [closed]

I have a question for all the hardcore low level hackers out there. I ran across this sentence in a blog. I don't really think the source matters (it's Haack if you really care) because it seems to be a common statement.

"For example, many modern 3-D Games have their high performance core engine written in C++ and Assembly."

As far as the assembly goes - is the code written in assembly because you don't want a compiler emitting extra instructions or using excessive bytes, or are you using better algorithms that you can't express in C (or can't express without the compiler mussing them up)?

I completely get that it's important to understand the low-level stuff. I just want to understand why you would still program in assembly once you do understand it.

asked Apr 26 '09 20:04

Tom Ritter

1 Answer

I think you're misreading this statement:

"For example, many modern 3-D Games have their high performance core engine written in C++ and Assembly."

Games (and most programs these days) aren't "written in assembly" the same way they're "written in C++". That blog isn't saying that a significant fraction of the game is designed in assembly, or that a team of programmers sits around and develops in assembly as its primary language.

What this really means is that developers first write the game and get it working in C++. Then they profile it, figure out what the bottlenecks are, and if it's worthwhile they optimize the heck out of them in assembly. Or, if they're already experienced, they know which parts are going to be bottlenecks, and they've got optimized pieces sitting around from other games they've built.

The point of programming in assembly is the same as it always has been: speed. It would be ridiculous to write a lot of code in assembly, but there are some optimizations the compiler isn't aware of, and for a small enough window of code, a human can often do better.

For example, for floating point, compilers tend to be pretty conservative and may not be aware of some of the more advanced features of your architecture. If you're willing to accept some error, you can usually do better than the compiler, and it's worth writing that little bit of code in assembly if you find that lots of time is spent on it.
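To make the "conservative" part concrete, here is a minimal sketch in C (the function names are made up for illustration). Floating-point addition is not associative, so unless you pass something like -ffast-math the compiler generally has to keep the additions in source order. A human who knows the data can tolerate a slightly different rounding is free to reorder, which is what makes independent accumulators and SIMD lanes possible.

```c
#include <stddef.h>

/* One accumulator: additions happen strictly in source order.
   A compiler that preserves IEEE semantics has to keep this order,
   which creates one long dependency chain. */
float sum_in_order(const float *x, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Four independent accumulators: the shape a hand-tuned or vectorized
   loop would have. The additions are reordered, so the rounding (and
   therefore the result) can differ from sum_in_order. */
float sum_four_lanes(const float *x, size_t n)
{
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += x[i];
        a1 += x[i + 1];
        a2 += x[i + 2];
        a3 += x[i + 3];
    }
    for (; i < n; i++)   /* leftover elements */
        a0 += x[i];
    return (a0 + a1) + (a2 + a3);
}
```

With the input {1e8f, 1.0f, -1e8f, 1.0f}, and assuming float arithmetic is really evaluated in single precision (as on typical x86-64 and ARM64 builds), the two functions return different values even though both are "the sum". That difference is the error you are agreeing to accept when you reorder by hand.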

Here are some more relevant examples:

Examples from Games

Article from Intel about optimizing a game engine using SSE intrinsics. The final code uses intrinsics (not inline assembly), so the amount of pure assembly is very small. But they look at the assembly output by the compiler to figure out exactly what to optimize.
Quake's fast inverse square root. Again, the routine doesn't have assembly in it, but you need to know something about the architecture to do this kind of optimization. The authors knew which operations are fast (multiply, shift) and which are slow (divide, sqrt), so they came up with a very tricky implementation of inverse square root that avoids the slow operations entirely.
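For reference, the inverse square root trick looks roughly like this. This is a sketch of the widely circulated Quake III routine, not a verbatim copy: the name fast_rsqrt is mine, and I use memcpy for the bit reinterpretation instead of the original pointer cast so the code stays within defined behavior in modern C. It assumes float is a 32-bit IEEE 754 value.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive, finite x using only a shift,
   an integer subtract, and a few multiplies. No divide, no sqrt. */
float fast_rsqrt(float x)
{
    float half = 0.5f * x;
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);     /* view the float's bit pattern */
    bits = 0x5f3759dfu - (bits >> 1);   /* magic constant gives a first guess */
    memcpy(&y, &bits, sizeof y);        /* back to float */

    y = y * (1.5f - half * y * y);      /* one Newton-Raphson refinement step */
    return y;
}
```

After the single refinement step the result is within a fraction of a percent of the true value, which was good enough for lighting calculations. On current hardware a dedicated approximate reciprocal square root instruction is usually the better choice, so treat this as an illustration of the mindset rather than something to paste into new code.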

High-Performance Computing

Outside the domain of games, people in scientific computing frequently optimize the crap out of things to get them to run fast on the latest hardware. Think of this as games where you can't cheat on the physics.

A great recent example of this is Lattice Quantum Chromodynamics (Lattice QCD). This paper describes how the problem pretty much boils down to one very small computational kernel, which was optimized heavily for the PowerPC 440s on an IBM Blue Gene/L. Each 440 has two FPUs, and they support some special ternary operations that are tricky for compilers to exploit. Without these optimizations, Lattice QCD would've run much slower, which is costly when your problem requires millions of CPU hours on expensive machines.
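To give a feel for what a "ternary operation" means here, below is a small sketch in C. It is not the kernel from the paper; the cplx type and the caxpy function are hypothetical names I picked for illustration. It is a complex multiply-accumulate written so that every statement has the three-operand shape d = a*b + c, which is the pattern a fused multiply-add unit executes as one instruction.

```c
#include <stddef.h>

/* Hypothetical complex type for this sketch. */
typedef struct { double re, im; } cplx;

/* y[i] += a * x[i] for complex values.
   Each statement below is one multiply feeding one add, so a compiler
   may (depending on target and flags) fuse it into a single
   multiply-add instruction. Getting that mapping right across two
   FPUs at once is the part that is hard to leave to the compiler. */
void caxpy(cplx a, const cplx *x, cplx *y, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        double re = y[i].re;
        double im = y[i].im;

        re =  a.re * x[i].re + re;   /* real part: + a.re * x.re */
        re = -a.im * x[i].im + re;   /* real part: - a.im * x.im */
        im =  a.re * x[i].im + im;   /* imag part: + a.re * x.im */
        im =  a.im * x[i].re + im;   /* imag part: + a.im * x.re */

        y[i].re = re;
        y[i].im = im;
    }
}
```

Whether the compiler actually emits fused instructions for this, and whether it pairs the real and imaginary work across both FPUs, is exactly what you would check in the generated assembly before deciding to write the kernel by hand.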

If you are wondering why this is important, check out the article in Science that came out of this work. Using Lattice QCD, these guys calculated the mass of a proton from first principles, and showed last year that 90% of the mass comes from strong force binding energy, and the rest from quarks. That's E=mc² in action. Here's a summary.

For all of the above, the applications are not designed or written 100% in assembly -- not even close. But when people really need speed, they focus on writing the key parts of their code to fly on specific hardware.

answered Sep 29 '22 13:09

Todd Gamblin

Tags: performance, c, assembly, low-level
