The Network Stack (1)

Lecture 5, Part 2: Network Stack Implementation Dr Robert N. M. Watson 2020-2021

Memory flow in hardware

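[Figure: memory hierarchy between the CPU and the NIC. The NIC sits between Ethernet and PCI; with DDIO, its DMA goes via the last-level cache.]

Level                     Size   Access latency
CPU L1 cache              32K    3-4 cycles
L2 cache                  256K   8-12 cycles
Last-level cache (LLC)    25M    32-40 cycles
DRAM                      -      up to 256-290 cycles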

• Key idea: follow the memory
  - Historically, memory copying was avoided due to instruction count
  - Today, memory copying is avoided due to cache footprint
• Recent Intel CPUs push and pull DMA via the LLC ("DDIO")
  - If we differentiate 'send' vs. 'transmit' and 'receive' vs. 'deliver', is this a good idea?
  - ... it depends on the latency between DMA and processing
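To make "follow the memory" concrete, the sketch below estimates the cost of reading every cache line of one packet from each level of the hierarchy, using the latency ranges from the diagram. The 64-byte cache line, the 1500-byte packet, the use of range midpoints, and the assumption of serial accesses with no prefetching or overlap are illustrative assumptions, not figures from the lecture.

```python
import math

# Approximate access latencies in CPU cycles (ranges from the slide's diagram).
LATENCY_CYCLES = {
    "L1": (3, 4),
    "L2": (8, 12),
    "LLC": (32, 40),
    "DRAM": (256, 290),
}

CACHE_LINE_BYTES = 64  # assumption: typical x86 cache-line size


def cache_lines(packet_bytes, line_bytes=CACHE_LINE_BYTES):
    """Number of cache lines needed to hold a packet of the given size."""
    return math.ceil(packet_bytes / line_bytes)


def touch_cost(packet_bytes, level):
    """Rough cycles to read every line of a packet once from one level.

    Uses the midpoint of the latency range and assumes serial,
    non-overlapping accesses, so this is a crude estimate, not a measurement.
    """
    low, high = LATENCY_CYCLES[level]
    return cache_lines(packet_bytes) * (low + high) / 2


if __name__ == "__main__":
    # Assumption: a 1500-byte packet, roughly a full Ethernet payload.
    for level in LATENCY_CYCLES:
        print(level, touch_cost(1500, level))
```

Under these assumptions, a packet that is still in the LLC when the stack processes it is several times cheaper to read than one that has been evicted to DRAM, which is why the delay between DMA and processing matters.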


Memory flow in software

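[Figure: memory flow in software.
 User process: recv() and send().
 Kernel, receive side: NIC receive -> socket/protocol deliver -> copyout() -> recv().
 Kernel, transmit side: send() -> copyin() -> socket/protocol send -> NIC transmit.
 Both sides alloc() and free() buffers in the network memory allocator.
 NIC: DMA receive and DMA transmit.]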

• Socket API implies one software-driven copy to/from user memory
  - Historically, zero-copy VM tricks for the socket API have been ineffective
• Network buffers cycle through the slab allocator
  - Receive: allocate in NIC driver, free in socket layer
  - Transmit: allocate in socket layer, free in NIC driver
• DMA performs a second copy; this can affect cache/memory bandwidth
  - NB: what if the packet-buffer working set is larger than the cache?
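The buffer lifecycle described above can be sketched with a toy pool that records which layer allocates and which layer frees. This is a user-space Python model, not kernel code: `BufferPool`, `receive_path`, `transmit_path`, `working_set_fits`, the layer labels, and the 2048-byte buffer size are all hypothetical names and values chosen for illustration, and the "25M" LLC size is read as MiB by assumption.

```python
LLC_BYTES = 25 * 1024 * 1024  # "25M" LLC from the hardware slide, read as MiB (assumption)


class BufferPool:
    """Toy stand-in for the network memory allocator (hypothetical API)."""

    def __init__(self, nbufs, buf_size=2048):
        self.buf_size = buf_size
        self.free_list = [bytearray(buf_size) for _ in range(nbufs)]
        self.log = []  # (operation, layer) pairs, in call order

    def alloc(self, layer):
        buf = self.free_list.pop()  # raises IndexError if the pool is exhausted
        self.log.append(("alloc", layer))
        return buf

    def free(self, buf, layer):
        self.free_list.append(buf)
        self.log.append(("free", layer))


def _fill(buf, payload):
    """Copy payload into the start of buf without resizing it."""
    if len(payload) > len(buf):
        raise ValueError("payload larger than buffer")
    buf[:len(payload)] = payload


def receive_path(pool, wire_bytes):
    """Receive: the NIC driver allocates, the socket layer frees."""
    buf = pool.alloc("nic_driver")
    _fill(buf, wire_bytes)                      # stands in for DMA receive
    user_copy = bytes(buf[:len(wire_bytes)])    # stands in for copyout()
    pool.free(buf, "socket_layer")
    return user_copy


def transmit_path(pool, user_bytes):
    """Transmit: the socket layer allocates, the NIC driver frees."""
    buf = pool.alloc("socket_layer")
    _fill(buf, user_bytes)                      # stands in for copyin()
    wire = bytes(buf[:len(user_bytes)])         # stands in for DMA transmit
    pool.free(buf, "nic_driver")
    return wire


def working_set_fits(nbufs, buf_size, cache_bytes=LLC_BYTES):
    """True if all packet buffers together fit in the given cache size."""
    return nbufs * buf_size <= cache_bytes
```

Each path makes two copies of the data (one by software, one by DMA). `working_set_fits` addresses the NB: once the buffers in circulation exceed the cache, DMA'd data is likely to have left the cache before the stack reads it.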


The mbuf abstraction

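[Figure: the mbuf abstraction. A struct mbuf holds an mbuf header, a packet header, and data; m_data points at the current data and m_len gives its length, with padding around it. The data lives either inside the mbuf or in external storage such as a VM/buffer-cache page. Mbufs link into mbuf chains, and chains into mbuf packet queues, which serve as the socket buffer, the netisr work queue, the TCP reassembly queue, and the network interface queue.]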

• Unit of work allocation and distribution throughout the stack
• mbuf chains represent in-flight packets, streams, etc.
  - Operations: alloc, free, prepend, append, truncate, enqueue, dequeue
  - Internal or external data buffer (e.g., VM page)
  - Reflects bi-modal packet-size distribution (e.g., TCP ACKs vs data)
• Similar structures in other OSes
  - e.g., skbuff in Linux
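A minimal Python model of an mbuf with internal storage may help show how `m_data` and `m_len` make prepend, append, and truncate cheap: each operation moves the window rather than the payload. Only `m_data` and `m_len` come from the slide; the `Mbuf` class, its method names, the `m_next` link field, and the buffer sizes are assumptions of this sketch, and a real mbuf is a C structure with more fields.

```python
class Mbuf:
    """Toy model of one mbuf: fixed storage plus an (m_data, m_len) window."""

    def __init__(self, size=256, leading_space=64):
        self.storage = bytearray(size)  # internal data buffer
        self.m_data = leading_space     # offset of the current data in storage
        self.m_len = 0                  # number of bytes of current data
        self.m_next = None              # next mbuf in the chain, if any

    def data(self):
        """Return a copy of the current data."""
        return bytes(self.storage[self.m_data:self.m_data + self.m_len])

    def append(self, payload):
        """Add bytes after the current data, using trailing space."""
        end = self.m_data + self.m_len
        if len(payload) > len(self.storage) - end:
            raise ValueError("not enough trailing space")
        self.storage[end:end + len(payload)] = payload
        self.m_len += len(payload)

    def prepend(self, header):
        """Add a header before the current data, using leading space."""
        if len(header) > self.m_data:
            raise ValueError("not enough leading space")
        self.m_data -= len(header)
        self.storage[self.m_data:self.m_data + len(header)] = header
        self.m_len += len(header)

    def trim_front(self, nbytes):
        """Truncate from the front, e.g. to strip a header on input."""
        nbytes = min(nbytes, self.m_len)
        self.m_data += nbytes
        self.m_len -= nbytes


def chain_length(m):
    """Total data bytes across an mbuf chain."""
    total = 0
    while m is not None:
        total += m.m_len
        m = m.m_next
    return total


def chain_bytes(m):
    """Concatenate the current data of every mbuf in a chain."""
    parts = []
    while m is not None:
        parts.append(m.data())
        m = m.m_next
    return b"".join(parts)
```

Reserving leading space is the design choice that lets each lower layer prepend its header on the output path without copying the payload already in the buffer.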


Send/receive paths in the network stack (1/2)

[Figure: send/receive paths through the implementation (and sometimes protocol) layers. The input path runs upward from NIC receive to application deliver; the output path runs downward from application send to NIC transmit.]

Layer               Input path                   Output path
Application         recv() (deliver)             send() (send)
System call layer   recv()                       send()
Socket layer        soreceive(), sbappend()      sosend(), sbappend()
TCP layer           tcp_reass(), tcp_input()     tcp_send(), tcp_output()
IP layer            ip_input()                   ip_output()
Link layer          ether_input()                ether_output()
Device driver       em_intr()                    em_start(), em_entr()
NIC                 Receive                      Transmit
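The layering in the figure can be written down as data and walked in either direction. The sketch below only orders the function names that appear on the slide; the `LAYERS` table and the two helper functions are hypothetical, and the ordering of functions within a layer is read off the extracted figure, so treat it as an assumption.

```python
# (layer, input-path functions, output-path functions), listed top-down
# from the application side toward the NIC, using names from the slide.
LAYERS = [
    ("System call layer", ["recv()"], ["send()"]),
    ("Socket layer", ["soreceive()", "sbappend()"], ["sosend()", "sbappend()"]),
    ("TCP layer", ["tcp_reass()", "tcp_input()"], ["tcp_send()", "tcp_output()"]),
    ("IP layer", ["ip_input()"], ["ip_output()"]),
    ("Link layer", ["ether_input()"], ["ether_output()"]),
    ("Device driver", ["em_intr()"], ["em_start()", "em_entr()"]),
]


def output_path():
    """Functions in the order a send travels: application down to the NIC."""
    return [fn for _layer, _inp, out in LAYERS for fn in out]


def input_path():
    """Functions in the order a receive travels: NIC up to the application."""
    top_down = [fn for _layer, inp, _out in LAYERS for fn in inp]
    return list(reversed(top_down))


if __name__ == "__main__":
    print(" -> ".join(output_path()))
    print(" -> ".join(input_path()))
```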
