Hello World using the eBPF

eBPF is a in-kernel register-based virtual machine that is capable to executing native-fast JIT-compiled BPF programs inside the Linux kernel with access to a subset of kernel functions and memory.

History of eBPF

BPF (aka. BSD Packet Filter) was originally designed and developed to capture and filter network packets that matched specific rules¹. The process is done in kernel without copying the packet to the user space and then run filters. The BPF then was extended to more than packet filtering, as eBPF, to monitor, trace and monitor the behavior of the kernel²³, and has been exposed to user space.

eBPF provides a way to run mini programs on system and application events, such as IO. It makes the kernel fully programmable and the eBPF instructions are executed by the Linux kernel, which includes an interpreter and a JIT compiler.

Classical BPF arrived in Linux in 1997. A BPF JIT compiler was added in 2011, since Linux 3.0. BPF was first used outside of networking in 2012, when BPF filters was added for seccomp syscall policies. In 2013, the “extended BFP” patchset came out, and been merged, and today’s eBPF has little to do with Berkeley, packets, or filtering.

eBPF Tooling

BCC is a toolkit that makes use of eBPF for creating kernel tracing and manipulation programs, providing a C programming environment for writing kernel BPF code.

Note that BCC requires the linux-headers package available thus on WSL2 (as to Apr, 2021), developers need to manually build the kernel to make it work, with enabling the configuration CONFIG_IKHEADERS=m. The debugfs is also needed and can be mounted by

sudo mount -t debugfs debugfs /sys/kernel/debug

Or, by adding the configuration to /etcd/fstab:

debugfs    /sys/kernel/debug      debugfs  defaults  0 0

bpftrace is another BPF frontend that it ideal for powerful one-liners scripts, as well as ply.

Tracing using eBFP

Dynamic tracing was first created in the 1990s, was first developed for Linux and was finally added to Linux kernel in 2004. Sun developed the DTrace in the Solaris operating system and made it a well-known and in-demand features. Linux supports dynamic instrumentation for both kernel-level functions and user-level functions, via kprobe and uprobe respectively, as demonstrated in the following examples⁴:

probe	affect
`kprobe:vfs_read`	instrument at the beginning of `vfs_read()`
`kretprobe:vfs_read`	instrument at the end of `vfs_read()`
`uprobe:/bin/bash:readline`	instrument at the beginning of user-level function `readline()` in `/bin/bash`
`uretprobe:/bin/bash:readline`	instrument at the end of user-level function `readline()` in `/bin/bash`

Static tracing in Linux is supported by tracepoint (for kernel static instrumentation) and USDT (user-level static defined tracing) for user-level static instrumentation. Static tracing requires extra definition in the source code thus becomes a maintenance burden for the developers. The tracepoint and USDT can be used as the following examples⁴:

`tracepoint:syscalls:sys_enter_open`	instrument the `open(2)` system call
`usdt:/usr/sbin/mysqld:mysql:query_start`	instrument the `query_start` function in MySQL

The tracepoints and USDTs are only activated and running after the BPF program was attached and are deactivated once the BPF program was removed.

References

https://www.tcpdump.org/papers/bpf-usenix93.pdf ↩
A thorough introduction to eBPF, https://lwn.net/Articles/740157/ ↩
BPF: the universal in-kernel virtual machine, https://lwn.net/Articles/599755/ ↩
These examples are referred from the book BFP Performances Tools, by Brendan Gregg. ↩ ↩²

History of eBPF

eBPF Tooling

Tracing using eBFP

References

Comments: