The OpenGL graphics pipeline changes every year, and its programmable parts keep growing. In the end, as OpenGL programmers we write many little programs (vertex, fragment, geometry, tessellation, ...).
Why is there such strong specialization between the stages? Do they all run on different parts of the hardware? Why not write one block of code that describes what should come out at the end, instead of juggling between the stages?
http://www.g-truc.net/doc/OpenGL%204.3%20Pipeline%20Map.pdf
In this pipeline PDF you can see the beast.
Within a graphics processor, all stages work in parallel. Because of this pipeline architecture, today's graphics processing units (GPUs) perform billions of geometry calculations per second. They are increasingly designed with more memory and more stages, so that more data can be worked on at the same time.
The OpenGL pipeline performs conditional or selection operations at a number of stages. One or more of these stages can be combined to implement simple conditional operations in a multipass algorithm. Examples include the depth and alpha tests.
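As a concrete illustration of such a conditional stage: in core-profile OpenGL the old fixed-function alpha test is gone, and the same per-fragment rejection is typically written as a discard in the fragment shader. Here is a minimal sketch, where the uniform names uTexture and uThreshold are just placeholders chosen for this example:

    #version 330 core
    in vec2 vTexCoord;
    out vec4 fragColor;

    uniform sampler2D uTexture;   // placeholder texture sampler
    uniform float uThreshold;     // e.g. 0.5

    void main() {
        vec4 color = texture(uTexture, vTexCoord);
        if (color.a < uThreshold)
            discard;              // reject this fragment, much like the old alpha test did
        fragColor = color;
    }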
A graphics pipeline can be divided into three major steps: application, geometry and rasterization.
During the application step, the scene is updated as required, for example in response to user input or as part of an animation. The new scene with all its primitives, usually triangles, lines and points, is then passed on to the next step of the pipeline.
In the days of "Quake" (the game), developers had the freedom to do anything with their CPU rendering implementations; they were in control of everything in the "pipeline".
With the introduction of the fixed-function pipeline and GPUs, you got "better" performance but lost a lot of that freedom. Graphics developers have been pushing to get it back, hence the ever more customizable pipeline. GPUs are even "fully" programmable now through technologies such as CUDA and OpenCL, even if those are not strictly about graphics.
On the other hand, GPU vendors cannot replace the whole pipeline with a fully programmable one overnight. In my opinion, this boils down to several reasons:
Historically, there were actually different processing units for the different programmable parts - there were Vertex Shader processors and Fragment Shader processors, for example. Nowadays, GPUs employ a "unified shader architecture" where all types of shaders are executed on the same processing units. That's why non-graphic use of GPUs such as CUDA or OpenCL is possible (or at least easy).
Notice that the different shaders have different inputs/outputs - a vertex shader is executed for each vertex, a geometry shader for each primitive, a fragment shader for each fragment. I don't think this could be easily captured in one big block of code.
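To make that concrete, here is a minimal sketch of a vertex/fragment shader pair (the names aPosition, aColor and uMvp are purely illustrative): the vertex shader consumes per-vertex attributes and produces one clip-space position, while the fragment shader only ever sees interpolated values and writes one color per fragment. The two bodies really are separate little programs with different inputs and outputs.

    // Vertex shader: runs once per vertex, reads per-vertex attributes.
    #version 330 core
    layout(location = 0) in vec3 aPosition;
    layout(location = 1) in vec3 aColor;
    uniform mat4 uMvp;                 // illustrative model-view-projection matrix
    out vec3 vColor;                   // handed on, interpolated across the primitive
    void main() {
        vColor = aColor;
        gl_Position = uMvp * vec4(aPosition, 1.0);
    }

    // Fragment shader (a separate program): runs once per fragment.
    #version 330 core
    in vec3 vColor;                    // arrives interpolated, not one value per vertex
    out vec4 fragColor;                // the per-fragment output written to the framebuffer
    void main() {
        fragColor = vec4(vColor, 1.0);
    }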
And last, but definitely not least, performance. There are still fixed-function stages between the programmable parts (such as rasterisation), and for some of these it is simply impossible to make them programmable (or callable outside of their specific place in the pipeline) without reducing performance to a crawl.
Because each stage has a different purpose:
The vertex shader transforms each point to where it should be on the screen.
The fragment shader runs for each fragment (read: each pixel a triangle covers) and applies lighting and color.
The geometry and tessellation shaders do things the classic vertex and fragment shaders cannot (replacing the drawn primitives with other primitives), and both are optional.
If you look carefully at that PDF, you'll see different inputs and outputs for each shader.
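For example, a geometry shader genuinely has a different shape of input and output than the other stages: it takes whole primitives in and may emit different primitives out. A minimal sketch that turns every incoming point into a small quad (uHalfSize is an assumed uniform, the offsets are applied directly in clip space, and no aspect-ratio correction is attempted):

    #version 330 core
    layout(points) in;                              // input: one point primitive
    layout(triangle_strip, max_vertices = 4) out;   // output: up to 4 vertices

    uniform float uHalfSize;   // assumed: half the quad size in clip space

    void main() {
        vec4 c = gl_in[0].gl_Position;
        // Emit the four corners of a quad around the incoming point.
        gl_Position = c + vec4(-uHalfSize, -uHalfSize, 0.0, 0.0); EmitVertex();
        gl_Position = c + vec4( uHalfSize, -uHalfSize, 0.0, 0.0); EmitVertex();
        gl_Position = c + vec4(-uHalfSize,  uHalfSize, 0.0, 0.0); EmitVertex();
        gl_Position = c + vec4( uHalfSize,  uHalfSize, 0.0, 0.0); EmitVertex();
        EndPrimitive();
    }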