
Track a packet as it goes through the kernel (Linux)

I have two machines, A and B, that are set up to use IPsec (IP Security). Machine A has a socket bound to a particular UDP port on the local machine, and it polls that socket frequently to see whether anything has been received.

When I disable IPsec, data flows between the two machines and I can send and receive packets fine. But when IPsec is enabled, packets sent by machine B never reach that socket on machine A.

I ran tcpdump on both machines and can see the (encrypted) packet being sent from machine B and received on machine A. After that, the packet goes into the kernel and is dropped somewhere, either during decryption or at some later phase.

I want to trace the packet as it goes through the kernel and see where it is dropped. Is there some /proc entry I can use for this? The only other method I can think of is to insert debug statements throughout the kernel, recompile it, send the packet again, and read through the debug output.

Thanks and sorry for the long message but it was necessary.

mmtauqir asked Nov 03 '11 17:11



1 Answer

Yes, as Dan said, SystemTap is useful. But my favorite is ftrace.

For reference:

Path of UDP packet in linux kernel

To trace kernel activity in general (network paths included), put the following in a bash script and run it as root:

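# Mount debugfs so the ftrace control files are reachable under /debug/tracing.
mkdir -p /debug
mount -t debugfs nodev /debug
# Trace every function; narrow this glob to cut down the output.
echo '*' >/debug/tracing/set_ftrace_filter
echo function_graph >/debug/tracing/current_tracer
# Record for 20 seconds while the traffic of interest is sent.
echo 1 >/debug/tracing/tracing_on
sleep 20
echo 0 >/debug/tracing/tracing_on
# Save the trace; $$ (the shell's PID) keeps the file name unique.
cat /debug/tracing/trace > /tmp/tracing.out$$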
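
Tracing every function produces a very large file. As a rough sketch, assuming the same /debug/tracing mount point as the script above, the filter can be narrowed to a handful of code paths before tracing is switched on. The helper name narrow_filter and the TRACEFS variable are made up for illustration, and the patterns in the usage note are only guesses at useful UDP and IPsec function names; which names exist depends on the kernel build, so check /debug/tracing/available_filter_functions first.

```shell
#!/bin/bash
# Hypothetical helper: restrict ftrace to functions matching the given globs.
# TRACEFS is an assumed variable; it defaults to the mount point used above.
TRACEFS=${TRACEFS:-/debug/tracing}

narrow_filter() {
    local pat
    # Opening the file with truncation clears whatever filter was set before.
    : > "$TRACEFS/set_ftrace_filter"
    # Append one glob per write so each pattern is added to the filter.
    for pat in "$@"; do
        echo "$pat" >> "$TRACEFS/set_ftrace_filter"
    done
}
```

For example, running narrow_filter 'udp*' 'xfrm*' 'esp*' as root in place of the echo '*' line would limit the trace to functions whose names start with those prefixes, if the running kernel has any.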

In the resulting trace, the receive (ingress) path looks like this, here for a read from a TCP socket:

 5)               |              tcp_recvmsg() {
 5)               |                lock_sock_nested() {
 5)   0.042 us    |                  _cond_resched();
 5)               |                  _raw_spin_lock_bh() {
 5)   0.040 us    |                    local_bh_disable();
 5)   0.414 us    |                  }
 5)   0.040 us    |                  _raw_spin_unlock();
 5)   0.040 us    |                  local_bh_enable();
 5)   1.814 us    |                }
 5)               |                skb_copy_datagram_iovec() {
 5)   0.042 us    |                  _cond_resched();
 5)   0.588 us    |                }
 5)   0.042 us    |                tcp_rcv_space_adjust();
 5)               |                __kfree_skb() {
 5)               |                  skb_release_all() {
 5)               |                    skb_release_head_state() {
 5)   0.044 us    |                      sock_rfree();
 5)   0.670 us    |                    }
 5)               |                    skb_release_data() {
 5)               |                      put_page() {
 5)   0.049 us    |                        put_compound_page();
 5)   0.449 us    |                      }

And this is netlink processing:

 6)               |          rtnetlink_rcv() {
 6)               |            mutex_lock() {
 6)   0.090 us    |              _cond_resched();
 6)   1.455 us    |            }
 6)               |            netlink_rcv_skb() {
 6)               |              rtnetlink_rcv_msg() {
 6)   0.150 us    |                mutex_unlock();
 6)               |                __netlink_dump_start() {
 6)               |                  netlink_lookup() {
 6)   0.091 us    |                    _raw_read_lock();
 6)   0.100 us    |                    netlink_compare();
 6)   1.791 us    |                  }
 6)               |                  mutex_lock() {
 6)   0.095 us    |                    _cond_resched();
 6)   0.913 us    |                  }
 6)   0.100 us    |                  try_module_get();
 6)   0.090 us    |                  mutex_unlock();

And this is also ingress, at the TCP receive entry point tcp_v4_rcv():

 3)               |                                                tcp_v4_rcv() {
 3)               |                                                  sk_filter() {
 3)               |                                                    security_sock_rcv_skb() {
 3)   0.076 us    |                                                      cap_socket_sock_rcv_skb();
 3)   0.867 us    |                                                    }
 3)   1.630 us    |                                                  }
 3)   0.076 us    |                                                  _raw_spin_lock();
 3)   0.477 us    |                                                  tcp_prequeue();
 3)               |                                                  tcp_v4_do_rcv() {
 3)   0.088 us    |                                                    tcp_md5_do_lookup();
 3)   0.109 us    |                                                    tcp_parse_md5sig_option();
 3)   0.072 us    |                                                    ipv4_dst_check();
 3)               |                                                    tcp_rcv_established() {
 3)   0.076 us    |                                                      tcp_parse_aligned_timestamp.part.34();
 3)               |                                                      tcp_queue_rcv() {
 3)               |                                                        tcp_try_coalesce.part.41() {
 3)   0.835 us    |                                                          skb_try_coalesce();
 3)   1.722 us    |                                                        }
 3)   2.637 us    |                                                      }

And this is an egress path, starting from the sendmsg() syscall (this particular capture is a Unix-domain stream socket, as unix_stream_sendmsg() shows):

 5)               |  SyS_sendmsg() {
 5)               |    __sys_sendmsg() {
 5)               |      sockfd_lookup_light() {
 5)   0.080 us    |        fget_light();
 5)   0.502 us    |      }
 5)               |      ___sys_sendmsg() {
 5)   0.117 us    |        copy_msghdr_from_user();
 5)   0.101 us    |        verify_iovec();
 5)               |        sock_sendmsg() {
 5)               |          security_socket_sendmsg() {
 5)               |            apparmor_socket_sendmsg() {
 5)   0.092 us    |              aa_revalidate_sk();
 5)   0.580 us    |            }
 5)   1.044 us    |          }
 5)               |          unix_stream_sendmsg() {
 5)   0.113 us    |            wait_for_unix_gc();
 5)               |            security_socket_getpeersec_dgram() {
 5)   0.044 us    |              apparmor_socket_getpeersec_dgram();
 5)   0.479 us    |            }
 5)               |            sock_alloc_send_pskb() {
 5)               |              __alloc_skb() {
 5)               |                kmem_cache_alloc_node() {
 5)   0.042 us    |                  _cond_resched();
 5)   0.648 us    |                }
 5)               |                __kmalloc_reserve.isra.27() {
 5)               |                  __kmalloc_node_track_caller() {
 5)   0.074 us    |                    kmalloc_slab();
 5)   0.040 us    |                    _cond_resched();
 5)   0.504 us    |                    __slab_alloc();
 5)   1.878 us    |                  }
 5)   2.276 us    |                }
 5)   0.175 us    |                ksize();
 5)   4.217 us    |              }
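
To find where a packet disappears, it can help to search the saved trace for the functions that free socket buffers, since a dropped packet is usually released through kfree_skb() and the lines just before that call hint at which code path gave up on it. The following is only a sketch: the helper name find_calls is hypothetical, and it assumes the function_graph layout shown above, where the function name follows the '|' column.

```shell
#!/bin/bash
# Hypothetical helper: list calls in a saved function_graph trace whose
# function name matches a regex, with some preceding lines for context.
#   $1 = trace file, $2 = function-name regex, $3 = context lines (default 5)
find_calls() {
    local trace_file=$1 pattern=$2 context=${3:-5}
    # In function_graph output the name follows the '|' column and ends at '('.
    grep -n -B "$context" -E "\|[[:space:]]*${pattern}\(" "$trace_file"
}
```

For example, find_calls /tmp/tracing.out1234 'kfree_skb' 10 (with whatever file name the tracing script actually produced) prints each kfree_skb() call together with the ten lines leading up to it.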

Hope you enjoy....

Peter Teoh answered Oct 11 '22 12:10