Why is GRO more efficient?


Generic Receive Offload (GRO) is a software technique in Linux to aggregate multiple incoming packets belonging to the same stream. The linked article claims that CPU utilization is reduced because, instead of each packet traversing the network stack individually, a single aggregated packet traverses the network stack.

However, the GRO source code looks like a network stack in itself. For example, an incoming TCP/IPv4 packet needs to go through:

eth_gro_receive
inet_gro_receive
tcp_gro_receive

Each function performs decapsulation and inspects its respective frame, network, or transport header, just as one would expect from the "regular" network stack.

Assuming the machine does not perform firewall/NAT or other obviously expensive per-packet processing, what is so slow in the "regular" network stack that the "GRO network stack" can accelerate?

565

asked Nov 16 '17 14:11

user1202136

1 Answer

Short answer: GRO is done very early in the receive flow, so it reduces the number of per-packet operations in the rest of the stack by a factor of roughly (GRO session size / MTU).
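To put rough numbers on that ratio, here is a minimal back-of-the-envelope sketch in Python. The payload size (1448 bytes per packet at a 1500-byte MTU), the session size (44 packets, about 64 KB) and the helper name `stack_traversals` are illustrative assumptions, not values taken from the kernel.

```python
import math

def stack_traversals(total_bytes, payload_per_packet, gro_session_bytes=None):
    """Count how many packets the stack above GRO has to process.

    payload_per_packet: bytes carried by one wire packet (roughly MTU minus headers).
    gro_session_bytes: size of one aggregated packet; None means GRO is off.
    """
    unit = gro_session_bytes if gro_session_bytes else payload_per_packet
    return math.ceil(total_bytes / unit)

total = 10 * 1024 * 1024   # 10 MiB of received TCP payload (illustrative)
payload = 1448             # assumed payload per packet at a 1500-byte MTU
session = 44 * payload     # assumed aggregate of 44 packets, about 64 KB

without_gro = stack_traversals(total, payload)
with_gro = stack_traversals(total, payload, session)
print(without_gro, with_gro, round(without_gro / with_gro, 1))
```

The model ignores the cost of GRO itself, which is not free but only touches the headers needed to match a packet to a flow.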

More details: The most common GRO function is napi_gro_receive(). It is called 93 times (in kernel 4.14), by almost all networking drivers. By calling GRO at the NAPI level, the driver aggregates packets into a large SKB very early, right in the receive completion handler. This means that every subsequent function in the receive stack does much less processing.

Here is a nice visual representation of the RX flow for a Mellanox ConnectX-4Lx NIC (sorry, this is what I have access to): [image not reproduced]

As you can see, GRO aggregation is at the very bottom of the call stack. You can also see how much work is done afterwards. Imagine how much overhead there would be if each of these functions had to operate on a single MTU-sized packet.
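The effect can also be illustrated with a toy model in Python. This is only a sketch of the idea, not the kernel's algorithm: the `Segment` tuple, the `gro_coalesce` and `upper_stack_calls` helpers, and the 65535-byte limit are all assumptions made for illustration, and real GRO checks many more conditions (TCP flags, options, checksums, timeouts) before merging.

```python
from collections import namedtuple

Segment = namedtuple("Segment", "flow seq length")

MAX_AGGREGATE = 65535  # assumed upper bound on one merged packet, in bytes

def gro_coalesce(segments, max_aggregate=MAX_AGGREGATE):
    """Merge in-order segments of the same flow, as a very rough model of GRO.

    A segment is merged only if it continues the held aggregate of its flow
    exactly (next expected sequence number) and the result stays within
    max_aggregate; otherwise the held aggregate is flushed and a new one starts.
    """
    held = {}      # flow -> aggregate currently being built
    flushed = []   # aggregates handed to the rest of the stack
    for seg in segments:
        agg = held.get(seg.flow)
        if (agg is not None
                and agg.seq + agg.length == seg.seq
                and agg.length + seg.length <= max_aggregate):
            held[seg.flow] = agg._replace(length=agg.length + seg.length)
        else:
            if agg is not None:
                flushed.append(agg)
            held[seg.flow] = seg
    flushed.extend(held.values())  # end of the batch: flush whatever is held
    return flushed

def upper_stack_calls(packets):
    """Stand-in for the per-packet work above GRO (IP, TCP, socket delivery)."""
    return len(packets)

# 100 back-to-back segments of one flow, 1448 bytes each (illustrative values)
wire = [Segment("10.0.0.1:5000->10.0.0.2:80", i * 1448, 1448) for i in range(100)]
merged = gro_coalesce(wire)
print(upper_stack_calls(wire), upper_stack_calls(merged))
```

In this model the merge step only compares a flow key and a sequence number, while everything counted by `upper_stack_calls` runs once per aggregate instead of once per wire packet.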

Hope this helps.

149

answered Sep 20 '22 08:09

Tgilgul

Tags: networking, linux-kernel, offloading
