sk_buff

All network-related queues and buffers in the kernel use a common data structure, struct sk_buff. This is a large struct containing all the control information required for the packet (datagram, cell, or whatever unit the protocol uses). The sk_buff elements are organized as a doubly linked list, so that moving an sk_buff from the beginning or end of one list to the beginning or end of another is very efficient. A queue is defined by struct sk_buff_head, which includes a head and a tail pointer to sk_buff elements.

All the queuing structures include an sk_buff_head representing the queue. For instance, struct sock includes a receive and send queue. Functions to manage the queues (skb_queue_head(), skb_queue_tail(), skb_dequeue(), skb_dequeue_tail()) operate on an sk_buff_head. In reality, however, the sk_buff_head is included in the doubly linked list of sk_buffs (so it actually forms a ring).
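The ring arrangement can be sketched in userspace C. This is a simplified model, not kernel code: struct node, struct buf, struct buf_head and the model_* functions are hypothetical names that only mirror the roles of struct sk_buff, struct sk_buff_head, skb_queue_head(), skb_queue_tail(), skb_dequeue() and skb_dequeue_tail(). The model puts the link pointers in a shared leading member so that the head can sit in the ring; locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Link pointers shared by the queue head and every element (model only). */
struct node {
    struct node *next;
    struct node *prev;
};

/* Stand-in for struct sk_buff; the real struct has many more fields. */
struct buf {
    struct node node;   /* must stay the first member */
    int id;             /* hypothetical tag used by the demo */
};

/* Stand-in for struct sk_buff_head: it is itself a member of the ring. */
struct buf_head {
    struct node node;   /* must stay the first member */
    unsigned int qlen;
};

void model_queue_init(struct buf_head *q)
{
    /* An empty ring is a head whose links point back at itself. */
    q->node.next = q->node.prev = &q->node;
    q->qlen = 0;
}

static void model_insert(struct node *n, struct node *prev, struct node *next)
{
    n->prev = prev;
    n->next = next;
    prev->next = n;
    next->prev = n;
}

static struct buf *model_unlink(struct buf_head *q, struct node *n)
{
    if (n == &q->node)
        return NULL;            /* only the head is left: queue is empty */
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = NULL;
    assert(q->qlen > 0);
    q->qlen--;
    return (struct buf *)n;     /* valid because node is the first member */
}

void model_queue_head(struct buf_head *q, struct buf *b)
{
    model_insert(&b->node, &q->node, q->node.next);
    q->qlen++;
}

void model_queue_tail(struct buf_head *q, struct buf *b)
{
    model_insert(&b->node, q->node.prev, &q->node);
    q->qlen++;
}

struct buf *model_dequeue(struct buf_head *q)
{
    return model_unlink(q, q->node.next);
}

struct buf *model_dequeue_tail(struct buf_head *q)
{
    return model_unlink(q, q->node.prev);
}
```

Because the head is part of the ring, inserting at either end is the same four pointer updates, and no operation needs a special case for an empty list beyond the single check in model_unlink().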

When an sk_buff is allocated, its data space is also allocated from kernel memory. Allocation is done with alloc_skb() or dev_alloc_skb(); drivers use dev_alloc_skb(). Buffers are freed with kfree_skb() and dev_kfree_skb(). However, sk_buff provides an additional management layer. The data space is divided into a head area and a data area. This allows kernel functions to reserve space for the header, so that the data doesn't need to be copied around. Typically, therefore, after allocating an sk_buff, header space is reserved using skb_reserve(). Going the other way, skb_pull(skb, len) removes data from the start of a buffer (skipping over an existing header) by advancing skb->data by len bytes and decreasing skb->len by the same amount.
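The pointer arithmetic behind the head area and data area can be modelled in userspace C. The sketch below is an assumption-laden illustration, not the kernel implementation: struct mbuf and the mbuf_* functions are hypothetical names that imitate what skb_reserve(), skb_put(), skb_push() and skb_pull() do to the head, data, tail and end pointers, with malloc() standing in for kernel allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Userspace model of the buffer bookkeeping (not the real struct sk_buff). */
struct mbuf {
    unsigned char *head;  /* start of the allocated area */
    unsigned char *data;  /* start of the valid packet bytes */
    unsigned char *tail;  /* end of the valid packet bytes */
    unsigned char *end;   /* end of the allocated area */
    unsigned int len;     /* tail - data */
};

struct mbuf *mbuf_alloc(size_t size)
{
    struct mbuf *b = malloc(sizeof(*b));
    if (!b)
        return NULL;
    b->head = malloc(size);
    if (!b->head) {
        free(b);
        return NULL;
    }
    b->data = b->tail = b->head;
    b->end = b->head + size;
    b->len = 0;
    return b;
}

void mbuf_free(struct mbuf *b)
{
    if (b) {
        free(b->head);
        free(b);
    }
}

/* Like skb_reserve(): open up headroom in a still-empty buffer. */
void mbuf_reserve(struct mbuf *b, size_t n)
{
    assert(b->len == 0 && (size_t)(b->end - b->tail) >= n);
    b->data += n;
    b->tail += n;
}

/* Like skb_put(): extend the data area at the tail, return the old tail. */
unsigned char *mbuf_put(struct mbuf *b, size_t n)
{
    unsigned char *old_tail = b->tail;
    assert((size_t)(b->end - b->tail) >= n);
    b->tail += n;
    b->len += (unsigned int)n;
    return old_tail;
}

/* Like skb_push(): grow the data area into the headroom for a new header. */
unsigned char *mbuf_push(struct mbuf *b, size_t n)
{
    assert((size_t)(b->data - b->head) >= n);
    b->data -= n;
    b->len += (unsigned int)n;
    return b->data;
}

/* Like skb_pull(): drop n bytes from the front; NULL if n exceeds len. */
unsigned char *mbuf_pull(struct mbuf *b, size_t n)
{
    if (n > b->len)
        return NULL;
    b->data += n;
    b->len -= (unsigned int)n;
    return b->data;
}
```

A sender would typically call mbuf_reserve() once, copy the payload in through mbuf_put(), and then let each lower layer prepend its header with mbuf_push(); a receiver strips headers with mbuf_pull(). In none of these steps is the payload moved.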

struct sk_buff has fields to point to the specific network layer headers:

  • transport_header (previously called h) – for layer 4, the transport layer (can point to a TCP, UDP or ICMP header, among others).
  • network_header (previously called nh) – for layer 3, the network layer (can point to an IPv4, IPv6 or ARP header).
  • mac_header (previously called mac) – for layer 2, the link layer.
  • The accessors skb_transport_header(skb), skb_network_header(skb) and skb_mac_header(skb) return a pointer to the corresponding header.
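As a rough illustration of how the three header fields are used on receive, the userspace sketch below records each header position as the data pointer advances past it. It assumes the headers are stored as offsets from head, which is how recent kernels keep them; struct pkt and the pkt_* functions are hypothetical names loosely modelled on skb_reset_mac_header(), skb_reset_network_header(), skb_reset_transport_header() and the accessors listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model: header positions kept as offsets from head (assumption). */
struct pkt {
    unsigned char *head;        /* start of the buffer */
    unsigned char *data;        /* current start of the packet bytes */
    uint16_t mac_header;
    uint16_t network_header;
    uint16_t transport_header;
};

static uint16_t pkt_data_offset(const struct pkt *p)
{
    assert(p->data >= p->head);
    return (uint16_t)(p->data - p->head);
}

/* Record "the header for this layer starts where data points right now". */
void pkt_reset_mac_header(struct pkt *p)       { p->mac_header = pkt_data_offset(p); }
void pkt_reset_network_header(struct pkt *p)   { p->network_header = pkt_data_offset(p); }
void pkt_reset_transport_header(struct pkt *p) { p->transport_header = pkt_data_offset(p); }

/* Accessors: turn the stored offset back into a pointer. */
unsigned char *pkt_mac_header(const struct pkt *p)       { return p->head + p->mac_header; }
unsigned char *pkt_network_header(const struct pkt *p)   { return p->head + p->network_header; }
unsigned char *pkt_transport_header(const struct pkt *p) { return p->head + p->transport_header; }
```

Each layer marks its header before pulling past it, so a later layer can still reach an earlier header through the accessor even though data has moved on.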

The struct sk_buff objects themselves are private to each network layer. When a packet is passed from one layer to another, the struct sk_buff is cloned; the packet data itself is not copied in that case. Note that struct sk_buff is quite large, but most of its members are unused in most situations, so the copy overhead when cloning is limited.
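The difference between copying the descriptor and copying the data can be shown with a small userspace model. This is only a sketch under simplifying assumptions: struct desc, struct shared_data and the desc_* functions are hypothetical, and the plain refcnt counter stands in for the reference counting the kernel keeps on the shared data when skb_clone() is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Data area shared by all clones (model only). */
struct shared_data {
    unsigned int refcnt;      /* number of descriptors pointing here */
    unsigned char bytes[];    /* packet bytes */
};

/* Per-layer descriptor, standing in for struct sk_buff. */
struct desc {
    struct shared_data *shared;
    unsigned char *data;
    unsigned int len;
    int layer_private;        /* hypothetical field a layer may change freely */
};

struct desc *desc_alloc(size_t size)
{
    struct desc *d = calloc(1, sizeof(*d));
    if (!d)
        return NULL;
    d->shared = malloc(sizeof(*d->shared) + size);
    if (!d->shared) {
        free(d);
        return NULL;
    }
    d->shared->refcnt = 1;
    d->data = d->shared->bytes;
    return d;
}

/* Clone: copy the descriptor only; the data area gains one more user. */
struct desc *desc_clone(const struct desc *orig)
{
    struct desc *c = malloc(sizeof(*c));
    if (!c)
        return NULL;
    assert(orig->shared->refcnt > 0);
    *c = *orig;
    c->shared->refcnt++;
    return c;
}

/* Free one descriptor; the data goes away with the last user. */
void desc_free(struct desc *d)
{
    if (!d)
        return;
    if (--d->shared->refcnt == 0)
        free(d->shared);
    free(d);
}
```

After desc_clone(), both descriptors see the same bytes, so neither owner should modify the shared data without first taking a private copy; descriptor fields such as layer_private can be changed independently.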

  • sk_buff instances almost always appear as “skb” in kernel code.
  • struct dst_entry *dst – the route for this sk_buff; this route is determined by the routing subsystem.
    • It has two important function pointers:
      • int (*input)(struct sk_buff*);
      • int (*output)(struct sk_buff*);
    • input() can be assigned one of the following: ip_local_deliver, ip_forward, ip_mr_input, ip_error or dst_discard_in.
    • output() can be assigned one of the following: ip_output, ip_mc_output, ip_rt_bug or dst_discard_out.
    • We will deal more with dst when talking about routing.
    • In the usual case, there is only one dst_entry for every skb.
    • When using IPsec, there is a linked list of dst_entries and only the last one is for routing; all the other dst_entries are for IPsec transformers and have the DST_NOHASH flag set. Entries with the DST_NOHASH flag set are not kept in the routing cache; they are kept in the flow cache instead.
  • tstamp (of type ktime_t) – the time stamp of when the packet was received.
    • net_enable_timestamp() must be called for this field to be filled in.
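The role of the two dst function pointers can be sketched as a small dispatch model in userspace C. Everything here is hypothetical and simplified: struct route and struct packet stand in for struct dst_entry and struct sk_buff, the demo_* handlers stand in for functions such as ip_local_deliver, ip_forward and dst_discard_out, and the one-argument signatures follow the declarations quoted in the list above.

```c
#include <assert.h>
#include <stddef.h>

struct packet;

/* Stand-in for struct dst_entry: the routing decision stored as callbacks. */
struct route {
    int (*input)(struct packet *);
    int (*output)(struct packet *);
};

/* Stand-in for struct sk_buff: it only carries the route in this model. */
struct packet {
    struct route *dst;
};

/* Hypothetical result codes so the demo can tell the handlers apart. */
enum { VERDICT_DROP = -1, VERDICT_LOCAL = 1, VERDICT_FORWARD = 2 };

int demo_local_deliver(struct packet *p) { (void)p; return VERDICT_LOCAL; }
int demo_forward(struct packet *p)       { (void)p; return VERDICT_FORWARD; }
int demo_discard(struct packet *p)       { (void)p; return VERDICT_DROP; }

/* The caller does not know where the packet goes; it just follows the route. */
int packet_input(struct packet *p)
{
    assert(p->dst != NULL && p->dst->input != NULL);
    return p->dst->input(p);
}

int packet_output(struct packet *p)
{
    assert(p->dst != NULL && p->dst->output != NULL);
    return p->dst->output(p);
}
```

The point of the indirection is that the routing lookup happens once, when the route is attached to the packet; after that, delivering locally, forwarding or discarding is a single indirect call.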