What Is eBPF?
Liz Rice
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. What Is eBPF?,
the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the author and do not represent the
publisher’s views. While the publisher and the author have used good faith efforts
to ensure that the information and instructions contained in this work are accurate,
the publisher and the author disclaim all responsibility for errors or omissions,
including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this
work is at your own risk. If any code samples or other technology this work contains
or describes is subject to open source licenses or the intellectual property rights of
others, it is your responsibility to ensure that your use thereof complies with such
licenses and/or rights.
This work is part of a collaboration between O’Reilly and Isovalent. See our
statement of editorial independence.
978-1-492-09723-5
Table of Contents
1. Introduction
    Extended Berkeley Packet Filter
    eBPF-Based Tools
3. eBPF Programs
    Kernel and User Space Code
    Custom Programs Attached to Events
    eBPF Maps
    Opensnoop Example
4. eBPF Complexity
    Portability Across Kernels
    Linux Kernel Knowledge
    Coordinating Multiple eBPF Programs
6. eBPF Tools
    Networking
    Observability
    Security
7. Conclusion
CHAPTER 1
Introduction
In the last couple of years, eBPF has gone from relative obscurity to
one of the hottest technology areas in modern infrastructure
computing. Personally, I’ve been excited about the possibilities that eBPF
enables ever since seeing Thomas Graf speak about it in a Black Belt
session at DockerCon 17.¹ At the Cloud Native Computing Foundation
(CNCF), my colleagues on the Technical Oversight Committee
put eBPF forward as one of the areas to watch in our predictions
of the technologies that would take off in 2021. Over 2,500 signed
up for that year’s eBPF Summit virtual conference, and several of
the world’s most advanced software engineering companies came
together to create the eBPF Foundation. Clearly, there is a lot of
interest in this technology.
In this short report, I hope to give you some insight into why
people are so excited about eBPF and the capabilities it offers for
tooling in modern compute environments. You’ll get a mental model
for what eBPF is and why it’s so powerful. There are some code
examples to help make it more concrete (but you can skip over
these if you prefer). You’ll get an understanding of what’s involved
when building eBPF-enabled tools, and why eBPF has become so
seemingly ubiquitous in such a short period of time.
1 Thomas Graf, “Cilium: Network and Application Security with BPF and XDP”
(DockerCon 17, April 17–20).
Inevitably, in this short report there isn’t room to go into all the
details, but I’ll leave you with some pointers for more information if
you want to dive in more deeply.
eBPF-Based Tools
As you’ll see in this report, the ability to dynamically change the
behavior of the kernel is tremendously useful. Traditionally, if we
want to observe how our applications are behaving, we add code
into those apps to generate logs and traces. eBPF allows us to collect
customized information about how an app is behaving without having
to change the app in any way, by observing it from within the
kernel. We can build on this observability to create eBPF security
tools that detect or even prevent malicious activity from within the
kernel. And we can create powerful, high-performance networking
capabilities with eBPF, handling network packets within the kernel
and avoiding costly transitions to and from user space.
2 Steven McCanne and Van Jacobson, “The BSD Packet Filter: A New Architecture for
User-Level Packet Capture” (working paper, Lawrence Berkeley National Laboratory,
Berkeley, December 19, 1992).
The concept of observing applications from the kernel’s perspective
isn’t entirely new: it builds on older Linux features, such as perf,³
which also collects behavior and performance information from
within the kernel without having to modify the applications being
measured. But these tools define a scope for the kinds of data that
can be collected, and the formats in which the data is made
available. With eBPF, we have far more flexibility because we can write
entirely custom programs, allowing us to build a wide range of tools
for different purposes.
eBPF programming is incredibly powerful, but it’s also complex.
For most of us, the utility of eBPF is going to come not from
writing programs ourselves but from using tools created by others.
There are an increasing number of projects and vendors building on
the eBPF platform to create a new generation of tooling, covering
observability, security, networking, and more.
I’ll discuss some more of these higher-level tools later in this report,
but if you’re comfortable on the Linux command line and can’t wait
to see eBPF in action, a great place to start is the BCC project.
It includes a huge collection of tracing tools; even just glancing at
the list should give you some idea of the vast scope of operations
we can instrument with eBPF, including file operations, memory
usage, CPU stats, and even observing any bash command entered
anywhere in the system.
In the next chapter, we’ll look at why changing the kernel’s behavior
is useful, and why eBPF makes it vastly easier to do this than writing
kernel code directly.
CHAPTER 2
Changing the Kernel Is Hard
Since eBPF allows running custom code in the Linux kernel, let’s
make sure you’re up to speed on what the kernel does. Then we can
cover why eBPF changes the game when it comes to modifying how
the kernel behaves.
liz@liz-ebpf-demo-1:~$ strace -c cat liz.txt
hello
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- -------------
  0.00    0.000000           0         5           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0        21           close
  0.00    0.000000           0        20           fstat
  0.00    0.000000           0        23           mmap
  0.00    0.000000           0         4           mprotect
  0.00    0.000000           0         2           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         4           pread64
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         2         1 arch_prctl
  0.00    0.000000           0         1           fadvise64
  0.00    0.000000           0        19           openat
------ ----------- ----------- --------- --------- -------------
100.00    0.000000                   107         2 total
Because applications rely so heavily on the kernel, we can learn a
lot about how an application behaves if we can observe its
interactions with the kernel. For example, if you are able to
intercept the system call for opening files, you can see exactly which
files any application accesses. But how could you do that interception?
Let’s consider what would be involved if we wanted to modify the
kernel, adding new code to create some kind of output whenever
that system call is invoked.
1 “Linux 5.12 Coming in at Around 28.8 Million Lines,…” Phoronix (March 2021).
2 Yujuan Jiang et al., “Will My Patch Make It? And How Fast?,” (paper, 2013). According
to this research paper, 33% of patches are accepted, and most take 3–6 months.
3 Thankfully, security patches to existing functionality get made available more quickly.
Kernel Modules
If you don’t want to wait for years for your change to make it into
the kernel, there is another option. The Linux kernel was designed
to accept kernel modules, which can be loaded and unloaded on
demand. If you want to change or extend kernel behavior, writing
a module is certainly one way to do it. In our example of
instrumenting the system call for opening files, you could write a kernel
module to do this.
The biggest challenge here is that this is still full-on kernel
programming. Users have historically been very cautious about using kernel
modules for one simple reason: if kernel code crashes, it takes down
the machine and everything running on it. How can a user be
confident that a kernel module is safe to run?
Being “safe to run” doesn’t just mean not crashing—the user wants
to know that a kernel module is safe from a security perspective.
Does it include vulnerabilities that an attacker could exploit? Do
we trust the authors of the module not to put malicious code in it?
Because the kernel is privileged code, it has access to everything on
the machine, including all the data, so malicious code in the kernel
would be a serious cause for concern. This applies to kernel modules
too.
At a minimum, something in user space needs to load the program
into the kernel and attach it to the right event. There are utilities
such as bpftool to help with this, but these are low-level tools
that assume detailed knowledge of eBPF and are designed more for
eBPF specialists than for the average user. In most eBPF-based tools,
there is a user space application that takes care of loading the eBPF
program into the kernel, passes in any configuration parameters,
and displays information collected by the eBPF program in a user-
friendly way.
The user space part of an eBPF tool can, at least in theory, be written
in any language, though in practice there are libraries to support this
in a fairly small set of languages: C, Go, Rust, and Python among
them. This language choice is further complicated because not all
languages have libraries that support libbpf, which has become a
popular option for making eBPF programs portable across different
versions of the kernel. (We’ll discuss libbpf in Chapter 4.)
Figure 3-1. A user space application uses the bpf() system call to load
eBPF programs from an ELF file into the kernel
2 It’s also possible to skip the object file and load bytecode directly into the kernel using
the bpf() system call.
Tracepoints
You can also attach eBPF programs to tracepoints⁴ defined within
the kernel. Find the events on your machine by looking under
/sys/kernel/debug/tracing/events.
Perf Events
Perf⁵ is a subsystem for collecting performance data. You can hook
eBPF programs to all the places where perf data is collected, which
can be determined by running perf list on your machine.
eBPF Maps
The development of maps is one of the significant differences that
justify the “e” for “extended” in the eBPF acronym.
Maps are data structures that are defined alongside eBPF programs.
There are a variety of different types of maps, but they are all
essentially key–value stores. eBPF programs can read and write to
them, as can user space code. Common uses for maps include:
6 If you’re interested in seeing a concrete example of this, you might like to watch my talk
at eBPF Summit 2021 where I implement a very basic load balancer in a few minutes, as
an illustration of how we can use eBPF to change the way the kernel handles network
packets.
If both the kernel and user space code will access the same map,
they will need a common understanding of the data structures
stored in that map. This can be done by including header files that
define those data structures in both the user space and kernel code,
but if these aren’t written in the same language, the author(s) will
need to carefully create structure definitions that are byte-for-byte
compatible.
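One way to keep both sides in agreement, sketched here in plain C on the assumption that both halves can include the same header, is to use fixed-width types and let the compiler check the layout. The open_event structure and its fields are hypothetical, chosen only for illustration; they are not opensnoop's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record shared between kernel-side and user-side code.
 * Fixed-width types avoid surprises from differing int/long sizes. */
struct open_event {
    uint64_t ts_ns;    /* timestamp in nanoseconds */
    uint32_t pid;      /* process ID */
    int32_t  ret;      /* syscall return value */
    char     comm[16]; /* command name, NUL-padded */
};

/* Compile-time checks: if either side is built with a different layout,
 * the build fails instead of silently misreading map values. */
static_assert(offsetof(struct open_event, ts_ns) == 0,  "ts_ns offset");
static_assert(offsetof(struct open_event, pid)   == 8,  "pid offset");
static_assert(offsetof(struct open_event, ret)   == 12, "ret offset");
static_assert(offsetof(struct open_event, comm)  == 16, "comm offset");
static_assert(sizeof(struct open_event) == 32, "no hidden padding");
```

If the user space side is written in another language, its structure definition would need to reproduce the same field order, field widths, and 32-byte total size.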
We’ve discussed the main constituents of an eBPF tool: eBPF
programs that run in the kernel, user space code to load and interact
with those programs, and maps that allow programs to share data.
To make things concrete, let’s look at an example.
Opensnoop Example
For this example of an eBPF program, I’ve chosen opensnoop, a
utility that shows you what files any process opens. The original
version of this utility was one of many BPF tools that Brendan
Gregg wrote in the BCC project, which you can find on
GitHub. It was later rewritten for libbpf (which you’ll meet in
the next chapter), and in this example I’m using the newer version
under the libbpf-tools directory.
When you run opensnoop, the output you’ll see depends a lot on
what’s happening on the virtual machine at the time, but it should
look something like this:
PID    COMM  FD ERR PATH
93965  cat    3   0 /etc/ld.so.cache
93965  cat    3   0 /lib/x86_64-linux-gnu/libc.so.6
93965  cat    3   0 /usr/lib/locale/locale-archive
93965  cat    3   0 /usr/share/locale/locale.alias
...
Each line of output indicates that a process opened (or attempted to
open) a file. The columns show the process ID, the command being
run, the file descriptor, an indication of any error code, and the path
of the file being opened.
Opensnoop works by attaching eBPF programs to the open() and
openat() system calls that any application has to make to ask the
kernel to open a file. Let’s dig in to see how this is implemented.
For brevity, we won’t look at every line of the code, but I hope it’s
sufficient to give you an idea of how it works. (Feel free to skip to
the next chapter if you’re not interested in diving this deep!)
7 At the time of writing, this code uses a perf buffer for the events map. If you were
writing this code today for recent kernels, you would get better performance from a
ring buffer, which is a newer alternative.
There are two different system calls for opening files:⁸ openat()
and open(). They are identical except that openat() has an extra
argument for a directory file descriptor, and the path name for the
file to be opened is taken relative to that directory. Likewise, the
two functions in opensnoop are identical except for handling this
difference in the arguments.
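To make the difference between the two calls concrete from the application's side, here is a small user space sketch. The function name same_file_via_open_and_openat() is hypothetical; it opens one file by its full path with open(), opens it again with openat() relative to a directory file descriptor, and compares the results.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if open(full_path) and openat(dirfd, name) reach the same
 * file, 0 if they reach different files, and -1 on any error. */
int same_file_via_open_and_openat(const char *full_path,
                                  const char *dir,
                                  const char *name)
{
    struct stat a, b;
    int result = -1;

    assert(full_path != NULL && dir != NULL && name != NULL);

    /* open(): the path is resolved on its own. */
    int fd1 = open(full_path, O_RDONLY);

    /* openat(): the extra first argument is a directory file
     * descriptor, and a relative name is resolved against it. */
    int dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    int fd2 = (dirfd >= 0) ? openat(dirfd, name, O_RDONLY) : -1;

    /* Same device and inode numbers mean the same underlying file. */
    if (fd1 >= 0 && fd2 >= 0 && fstat(fd1, &a) == 0 && fstat(fd2, &b) == 0)
        result = (a.st_dev == b.st_dev && a.st_ino == b.st_ino);

    if (fd1 >= 0) close(fd1);
    if (fd2 >= 0) close(fd2);
    if (dirfd >= 0) close(dirfd);
    return result;
}
```

For example, same_file_via_open_and_openat("/dev/null", "/dev", "null") should return 1 on a typical Linux system. Passing the special value AT_FDCWD as the directory descriptor makes openat() resolve relative paths against the current working directory, just as open() does. The two eBPF entry functions in opensnoop mirror this split, one per system call.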
Both functions take a parameter that is a pointer
to a structure called trace_event_raw_sys_enter. You’d find the
definition for this structure in the vmlinux header file generated
for the particular kernel you’re running on. The art of writing
eBPF programs includes working out what structure each program
receives as its context, and how to access the information within it.
8 In some kernels you’ll also find openat2(), but this isn’t handled in this version of
opensnoop, at least at time of writing.
These two functions use a BPF helper function to retrieve the ID of
the process that’s calling this syscall:
u64 id = bpf_get_current_pid_tgid();
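The 64-bit value returned by this helper packs two IDs together: the thread group ID (what user space calls the process ID) in the upper 32 bits, and the kernel's per-thread pid in the lower 32 bits. The following user space sketch shows how such a value splits apart; the names task_ids and split_pid_tgid() are hypothetical, and the code only mimics the bit layout rather than calling the helper itself.

```c
#include <assert.h>
#include <stdint.h>

/* User-space illustration only: splits a value laid out the way
 * bpf_get_current_pid_tgid() packs it. */
struct task_ids {
    uint32_t tgid; /* upper 32 bits: what user space calls the process ID */
    uint32_t tid;  /* lower 32 bits: what the kernel calls the pid (thread ID) */
};

struct task_ids split_pid_tgid(uint64_t id)
{
    struct task_ids ids;

    ids.tgid = (uint32_t)(id >> 32);
    ids.tid  = (uint32_t)id;

    /* Sanity check: recombining the halves gives back the original. */
    assert((((uint64_t)ids.tgid << 32) | ids.tid) == id);
    return ids;
}
```

Inside an eBPF program the same two expressions (a shift and a truncating assignment) would be applied directly to id, and one of the resulting values can then serve as a map key.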
The code gets the filename and any flags that were passed to the
syscall, and puts them in a structure called args:
args.fname = (const char *)ctx->args[0];
args.flags = (int)ctx->args[1];
This structure is written into the start map using the current
process ID as the key:
bpf_map_update_elem(&start, &pid, &args, 0);
And that’s all that the eBPF programs do on entry to the syscall. But
there’s another pair of eBPF programs defined in opensnoop.bpf.c
that get triggered when the syscalls exit:
SEC("tracepoint/syscalls/sys_exit_open")
int tracepoint__syscalls__sys_exit_open
This program and its openat() twin share common code in the
function trace_exit(). Have you noticed that all the functions
called by eBPF programs are prefixed by static __always_inline?
That forces the compiler to put the instructions for these functions
inline, because in older kernels a BPF program is not allowed to
jump to a separate function. Newer kernels and versions of LLVM
can support noninlined function calls, but this is a safe way to
ensure the BPF verifier stays happy. (Nowadays there is also the
concept of a BPF tail call, where execution jumps from one BPF
program to another. You can read more about BPF function calls
and tail calls in the eBPF documentation.)
The trace_exit() function creates an empty event structure:
struct event event = {};
The event structure gets written into the events perf buffer map:
bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU,
&event, sizeof(event));
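The entry and exit halves cooperate through the start map: the entry side stores the arguments under the process ID, and the exit side presumably looks them up again, combines them with the return value, and removes the entry. The sketch below models that pattern in ordinary user space C. It is a toy under stated assumptions: the fixed-size array stands in for the BPF hash map, and on_sys_enter(), on_sys_exit(), MAX_INFLIGHT, and the structure fields are hypothetical names, not opensnoop's code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_INFLIGHT 64   /* hypothetical capacity of the toy map */

struct args_t { const char *fname; int flags; };
struct event  { uint32_t pid; int ret; int flags; char fname[64]; };

/* Toy stand-in for the "start" map: one slot per in-flight syscall. */
static struct { int used; uint32_t pid; struct args_t args; } start[MAX_INFLIGHT];

static int find_slot(uint32_t pid)
{
    for (int i = 0; i < MAX_INFLIGHT; i++)
        if (start[i].used && start[i].pid == pid)
            return i;
    return -1;
}

/* Entry side: remember the arguments, keyed by pid. Returns 0 on success. */
int on_sys_enter(uint32_t pid, const char *fname, int flags)
{
    int i = find_slot(pid);
    if (i < 0) {                       /* no entry yet: take a free slot */
        for (int j = 0; j < MAX_INFLIGHT; j++) {
            if (!start[j].used) { i = j; break; }
        }
    }
    if (i < 0)
        return -1;                     /* map full */
    start[i].used = 1;
    start[i].pid = pid;
    start[i].args.fname = fname;
    start[i].args.flags = flags;
    return 0;
}

/* Exit side: look up the saved arguments, build the event, delete the
 * entry. Returns 0 and fills *out if a matching entry existed. */
int on_sys_exit(uint32_t pid, int ret, struct event *out)
{
    assert(out != NULL);
    int i = find_slot(pid);
    if (i < 0)
        return -1;                     /* entry was never seen: nothing to report */
    memset(out, 0, sizeof(*out));
    out->pid = pid;
    out->ret = ret;
    out->flags = start[i].args.flags;
    strncpy(out->fname, start[i].args.fname, sizeof(out->fname) - 1);
    start[i].used = 0;                 /* the real code would delete the map entry */
    return 0;
}
```

In the kernel, the lookup and delete steps would use the bpf_map_lookup_elem() and bpf_map_delete_elem() helpers, and the finished event would go to the perf buffer rather than to a caller-supplied pointer.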
The user space code reads event information out of this map. Before
we get to that, let’s look briefly at the Makefile.
libbpf-tools Makefile
When you build eBPF code, you get an object file containing the
binary definitions of the eBPF programs and maps. You also need an
additional user space executable that will load those programs and
maps into the kernel, and act as the interface for the user.⁹ Let’s look
at the Makefile that builds opensnoop to see how it creates both the
eBPF object file and the executable.
Makefiles comprise a set of rules, and the syntax for these can be
a bit opaque, so if you’re not familiar with Makefiles and don’t
particularly care about the details, please do feel free to skip over
this section!
9 You could use a general-purpose tool like bpftool, which can read BPF object files and
perform operations on them, but that requires the user to know details about what to
load and what events to attach programs to. For most applications, it makes sense to
write a specific tool that simplifies this for the end user.
The opensnoop example that we’re looking at is one of a large set
of example tools that are all built using one Makefile that you’ll
find in the libbpf-tools directory. Not everything in this file is
particularly of interest, but there are a few rules I’d like to highlight.
The first is a rule that takes a bpf.c file and uses the clang compiler
to create a BPF target object file:
$(OUTPUT)/%.bpf.o: %.bpf.c $(LIBBPF_OBJ) $(wildcard %.h) $(AR..
    $(call msg,BPF,$@)
    $(Q)$(CLANG) $(CFLAGS) -target bpf -D__TARGET_ARCH_$(ARCH) \
        -I$(ARCH)/ $(INCLUDES) -c $(filter %.c,$^) -o $@ && \
    $(LLVM_STRIP) -g $@
Finally, there is a rule that uses cc to link the user space application
objects (in our case, opensnoop.o) into a set of executables:
$(APPS): %: $(OUTPUT)/%.o $(LIBBPF_OBJ) $(COMMON_OBJ) | $(OUT...
    $(call msg,BINARY,$@)
    $(Q)$(CC) $(CFLAGS) $^ $(LDFLAGS) -lelf -lz -o $@
Now that you have seen how the eBPF and user space programs are
generated separately, let’s look at the user space code.
The user space code can retrieve this information, format it and
write it out for the user to see.
As you’ve seen, opensnoop registers eBPF programs that are called
every time any application calls the open() or openat() system call.
These eBPF programs running in the kernel collect information
about the context of that system call—the executable name and
process ID—and about the file being opened. This information is
written into a map, from which user space can read it and display it
to the user.
You’ll find dozens more examples of eBPF tools like this in the
libbpf-tools directory, each of which typically instruments one
syscall, or a family of related syscalls like open() and openat().
System calls are a stable kernel interface, and they offer a very
powerful way to observe what’s happening on a (virtual) machine.
But don’t be fooled into thinking that eBPF programming begins
and ends at intercepting system calls. There are plenty of other
stable interfaces, including LSM and various points in the networking
stack, to which eBPF can be attached. If you’re willing to risk or
work around changes between kernel versions, the range of places
where you can attach eBPF programs is absolutely vast.
the compilation to complete before the tool starts. You also have to
hope that the kernel headers are present on the filesystem (and that’s
not always the case). Enter BPF CO-RE.
CO-RE
The CO-RE—compile once, run everywhere—approach consists of
a few elements:
BTF (BPF Type Format)
This is a format for expressing the layout of data structures and
function signatures. Modern Linux kernels support BTF, so that
you can generate a header file called vmlinux.h from a running
system, containing all the data structure information about a
kernel that a BPF program might need.
libbpf, the BPF library
On the one hand, libbpf provides functions for loading eBPF
programs and maps into the kernel. But it also plays an
important role in portability: it leans on BTF information to adjust
the eBPF code to compensate for any differences between the
data structures present when it was compiled, and what’s on the
destination machine.
Compiler support
The clang compiler was enhanced so that when it compiles
eBPF programs, it includes what are known as BTF relocations,
which are what libbpf uses to know what to adjust as it loads
BPF programs and maps into the kernel.
Optionally, a BPF skeleton
A skeleton can be autogenerated from a compiled BPF object
file using bpftool gen skeleton, containing handy functions
that user space code can call to manage the lifecycle of BPF
programs: loading them into the kernel, attaching them to events
and so on. These functions are higher-level abstractions that
can be more convenient for the developer than using libbpf
directly.
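The idea behind these relocations can be illustrated with a toy model in ordinary user space C. This is not libbpf's API, and every name in it (task_v1, task_v2, field_info, resolve_offset(), read_pid()) is hypothetical. Two versions of a structure place the pid field at different offsets; instead of baking in the offset known at compile time, the reader resolves the field by name against a description of the target's layout, which stands in for BTF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two hypothetical "kernel versions" of the same structure: the pid
 * field moved because a new field was added in front of it. */
struct task_v1 { uint32_t pid; char comm[16]; };
struct task_v2 { uint64_t start_time; uint32_t pid; char comm[16]; };

/* Toy stand-in for BTF: a table describing where each named field lives. */
struct field_info { const char *name; size_t offset; };

static const struct field_info btf_v1[] = {
    { "pid",  offsetof(struct task_v1, pid)  },
    { "comm", offsetof(struct task_v1, comm) },
};
static const struct field_info btf_v2[] = {
    { "start_time", offsetof(struct task_v2, start_time) },
    { "pid",        offsetof(struct task_v2, pid)        },
    { "comm",       offsetof(struct task_v2, comm)       },
};

/* "Relocation": resolve a field name to its offset on the target.
 * Returns -1 if the field does not exist there. */
long resolve_offset(const struct field_info *btf, size_t n, const char *field)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(btf[i].name, field) == 0)
            return (long)btf[i].offset;
    return -1;
}

/* Read the pid using an offset resolved at load time, not compile time. */
uint32_t read_pid(const void *task, long pid_offset)
{
    uint32_t pid;

    assert(task != NULL && pid_offset >= 0);
    memcpy(&pid, (const char *)task + pid_offset, sizeof(pid));
    return pid;
}
```

With this model, the same read_pid() code works against either layout once the offset has been resolved. In real CO-RE the equivalent adjustment is recorded by clang and applied by libbpf to the BPF instructions as the program is loaded, so there is no lookup on each access.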
2 Some projects take the approach of packaging the eBPF source plus the required
toolchain into a container image. This avoids the complexity of installing that toolchain
and any concomitant dependency management, but it still means that the compilation
step runs on the destination machine.
4 Rex Guo and Junyuan Zeng, “Phantom Attack: Evading System Call Monitoring,” (DEF
CON, August 5–8, 2021).
5 The Cilium documentation describes how eBPF programs attached to different
networking hooks are combined to achieve complex networking capabilities.
1 This is nearly always true, unless you are using a virtualization approach like Kata
containers, Firecracker or unikernels, where each “container” runs in its own virtual
machine.
Figure 5-1. All the containers on the same host share a single kernel
Now that you’ve learned about what eBPF is, and something of how
eBPF programs work, let’s turn to exploring some of the tools, built
on this technology, that you might make use of in a production
deployment today. We’ll consider some examples of eBPF-based
open source projects that provide capabilities in three important
areas: networking, observability, and security.
Networking
eBPF programs can be attached to network interfaces and to various
points in the kernel’s network stack. At each point, they can drop
packets, send them to different destinations, or even modify the
contents. This enables some very powerful capabilities. Let’s look
at a few networking features that are now commonly implemented
with eBPF.
Load Balancing
If you have any doubts about the scalability of eBPF for networking,
know that it is being used at massive scale at Facebook. The company
was an early adopter of BPF, and in 2018 it introduced Katran, an
open source layer 4 load balancer.
Another highly scaled example is Cloudflare’s Unimog edge load
balancer. By running within the kernel, eBPF programs can manipulate
network packets and forward them to an appropriate destination,
without each packet having to pass through the networking stack and
on to user space.
The Cilium project is better known as an eBPF Kubernetes
networking plug-in (as I’ll discuss in a moment) but it’s also in use in large
telecommunication and on-premises deployments as a standalone
load balancer. Again, the ability to process packets at an early stage
without them having to transition into user space makes this highly
performant.
Kubernetes Networking
CNCF project Cilium was the original eBPF-based CNI (Container
Network Interface) implementation. It was started by a group of kernel
maintainers working on eBPF who recognized the potential for its use in cloud
native networking. It’s now used as the default data plane for Google
Kubernetes Engine, Amazon EKS Anywhere, and Alibaba Cloud.
In a cloud native world, pods stop and start all the time, and each
pod gets assigned an IP address. Prior to eBPF-enabled networking,
each node had to keep updating a set of iptables rules for each
of these changes in order to route between pods; and managing
these iptables rules gets unwieldy at scale. As illustrated in
Figure 6-1, Cilium dramatically simplifies routing so that it’s essentially
a simple lookup table in eBPF, leading to measurable performance
improvements.
Another Kubernetes CNI that added an eBPF implementation,
alongside its traditional iptables version, is Project Calico.
Service Mesh
eBPF also makes real sense as the basis for a more efficient data
plane for service mesh. Many service mesh features operate at layer
7, the application layer, and use a proxy component such as Envoy
to act on behalf of an application. In Kubernetes, these proxies are
often deployed in a sidecar model, with one proxy container per
pod, so that the proxy has access to the pod’s network namespace.
As you saw in Chapter 5, eBPF allows a more efficient approach
than the sidecar model. Since the kernel has access to all the pod
namespaces, we can use eBPF to make connections between
applications in pods and a single proxy on the host, as shown in Figure 6-2.
Observability
As you’ve seen earlier in this report, eBPF programs can get
visibility into everything that’s happening on a machine. By collecting
data about events and passing it to user space, eBPF enables a range
of powerful observability tools that can show you how your
applications are performing and behaving, without having to make any
changes to instrument those apps. eBPF also enables observability
over the entire system, not just individual applications, so you can
understand the behavior of your host machines.
You’ve come across the BCC project earlier in this report, and over
several years Brendan Gregg has done pioneering work at Netflix to
show how these eBPF tools can be used to observe practically any
metrics you’re interested in, at scale and with high performance.
Kinvolk’s Inspektor Gadget takes some of these tools with their
origins in BCC into the world of Kubernetes, so that you can easily
observe specific workloads on the command line.
A new generation of projects and tools is building on this work to
provide GUI-based observability. CNCF project Pixie lets you run
prewritten or custom scripts and see metrics and logs through a
powerful and visually appealing UI. Because it’s based on eBPF, this
means you can automatically instrument all your applications and
get performance data without making any code changes or
configuration. Figure 6-3 shows just one example of the many visualizations
available in Pixie.
Another observability project called Parca focuses on continuous
profiling, using eBPF to efficiently sample metrics like CPU usage
that you can use to detect performance bottlenecks.
Figure 6-3. A Pixie flamegraph of everything running on a small
Kubernetes cluster
Security
There are powerful cloud native tools available that enhance
security by using eBPF to detect and even prevent malicious activity.
I’ve considered these in two groups: securing network activity and
securing the expected behavior of applications at runtime.
Network Security
Because eBPF allows inspecting and manipulating network packets,
it has many uses in network security. The basic principle is that if
a network packet is deemed to be malicious or problematic because
it does not meet some security validation criteria, it can simply be
dropped. eBPF is a highly efficient way to implement this because it
can hook into the relevant parts of the network stack in the kernel,
or even on the network interface card.¹ This means out-of-policy or
malicious packets can be dropped before incurring the processing
costs of being handled by the networking stack and passed to user
space.
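The shape of such a drop decision can be sketched in user space C. This is an illustration only, under stated assumptions: the verdict names and filter_frame() are hypothetical, the frame is assumed to be untagged Ethernet carrying IPv4, and the policy is a single blocked source address. A real XDP program would receive its packet through a kernel-provided context and return XDP_DROP or XDP_PASS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum verdict { VERDICT_PASS, VERDICT_DROP };   /* hypothetical names */

#define ETH_HDR_LEN   14       /* untagged Ethernet header */
#define IP4_MIN_HDR   20       /* IPv4 header without options */
#define ETHERTYPE_IP4 0x0800

/* Decide what to do with a raw Ethernet frame. Every read is preceded
 * by a bounds check, much as the BPF verifier requires of real programs. */
enum verdict filter_frame(const uint8_t *data, size_t len,
                          const uint8_t blocked_src[4])
{
    assert(data != NULL && blocked_src != NULL);

    if (len < ETH_HDR_LEN + IP4_MIN_HDR)
        return VERDICT_PASS;                    /* too short to be IPv4 */

    /* EtherType is the last two bytes of the Ethernet header. */
    uint16_t ethertype = (uint16_t)((data[12] << 8) | data[13]);
    if (ethertype != ETHERTYPE_IP4)
        return VERDICT_PASS;                    /* not IPv4: leave it alone */

    /* The IPv4 source address sits 12 bytes into the IP header. */
    const uint8_t *src = data + ETH_HDR_LEN + 12;
    for (int i = 0; i < 4; i++)
        if (src[i] != blocked_src[i])
            return VERDICT_PASS;

    return VERDICT_DROP;                        /* source is on the blocklist */
}
```

The decision needs only a handful of byte comparisons on the start of the frame, which is why making it before the rest of the networking stack runs is so cheap.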
One of the early uses of eBPF in production at scale was for DDoS
(distributed denial of service) protection at Cloudflare. A DDoS
attacker floods a target machine with many network messages, in
the hope that the target can’t process them quickly enough and
becomes so busy handling these messages that it can’t do useful
work. Cloudflare engineers use eBPF programs to examine each packet
as soon as it arrives, and quickly determine whether it is
part of such an attack, discarding it if so. The packet doesn’t have
to pass through the kernel’s networking stack so it takes far fewer
resources to process, and the target can cope with a much higher
rate of malicious traffic.
1 XDP or eXpress Data Path hooks are supported by some network interface cards and
drivers, allowing the eBPF program to be offloaded out of the kernel entirely.
eBPF programs have also been used as a dynamic mitigation against
“packet of death” kernel vulnerabilities.² An attacker crafts a
network packet in such a way that it exploits a bug in the kernel
that prevents it from processing that packet properly. Rather than
waiting for a kernel patch to roll out, the attack can be mitigated
by loading an eBPF program that looks for these specifically crafted
packets and drops them. The real beauty of this is that the eBPF
program can be loaded dynamically without having to change anything
on the machine.
In Kubernetes, network policy is a first-class resource, but it’s left
to the networking plug-in to enforce it. Some CNIs, including
Cilium and Calico, offer extended network policy capabilities for
more powerful rules, such as allowing or disallowing traffic to a
destination specified by fully qualified domain name rather than just
by IP address. There’s a good tool for exploring network policies and
their effects at app.networkpolicy.io, shown in Figure 6-5.
Standard Kubernetes network policy rules apply to traffic to and
from application pods, but since eBPF has visibility over all network
traffic, it can be used for host firewall capabilities too, restricting
traffic to and from a host (virtual) machine.³
eBPF can also be leveraged to provide transparent encryption,
whether through WireGuard or IPsec.⁴ Here, transparent means that
the application doesn’t need any modifications; in fact, the
application can be entirely unaware that its network traffic is encrypted.
2 Daniel Borkmann discussed this in his talk, “BPF as a Fundamentally Better Dataplane”
(eBPF Summit (virtual), 2020).
3 See Cilium’s Host Firewall documentation.
4 Tailscale has a comparison of these two encryption protocols.
Runtime Security
eBPF is also being used to build tools that detect when applications
behave in unexpected or malicious ways, and some of these tools
can also be used to prevent bad behavior. A few examples of
suspicious behavior might include accessing files unexpectedly, running
executable programs, or attempting to gain additional privileges.
In fact, you may well have used BPF-based security enforcement in
the form of seccomp, a Linux feature for limiting the set of syscalls
that any application can call.
The CNCF project Falco extended this idea of limiting the syscalls
that an application can make. Falco’s rule definitions are created
in YAML, which is easier for humans to read and interpret than
seccomp profiles. The default Falco driver is a kernel module, but
there is also an eBPF probe driver that attaches to “raw syscall”
events. It doesn’t prevent those syscalls being completed, but it can
generate logs or other notifications to alert operators to a potentially
malicious event.
As we saw in Chapter 3, eBPF programs can be attached to the
LSM interface to prevent malicious behaviors or to mitigate known
vulnerabilities. For example, Denis Efremov wrote an eBPF program
to prevent exec*() system calls being run if they are not passed
any arguments, in order to mitigate the high-severity PwnKit⁵
vulnerability. eBPF can also be used to mitigate speculative
execution “Spectre” attacks.⁶
Tracee is another open source project for runtime security using
eBPF. As well as syscall-based checks, it also uses the LSM interface.
This helps avoid the TOCTTOU (time-of-check to time-of-use) race
conditions that are possible when only checking syscalls. Tracee supports rules
defined in Open Policy Agent’s Rego language, and also allows for
plug-in rules defined in Go.
The Tetragon component of Cilium offers another powerful
approach, using eBPF to monitor the four golden signals of container
security observability: process execution, network sockets, file access,
and layer 7 network identity. This allows operators to see exactly
what was responsible for any malicious or suspicious event, down
to the executable name and user identity within a specific pod. For
example, if you were subject to a cryptocurrency mining attack,
you could see exactly what executable opened a network connection
to a mining pool, from what pod, and when. These forensics are
invaluable for understanding how the compromise happened, and they
make it easier to build security policies that prevent similar attacks
in future.
5 See Bharat Jogi’s blog, “PwnKit: Local Privilege Escalation Vulnerability” (Qualys,
January 25, 2022).
6 See Daniel Borkmann’s talk, “BPF and Spectre: Mitigating Transient Execution Attacks”
(eBPF Summit (virtual), August 18–19, 2021).
This doesn’t mean we should use eBPF for everything! It’s unlikely
to make sense to write business-specific applications in eBPF, any
more than we would typically write applications as kernel modules.
There may be some exceptions to this rule, perhaps for extremely
high performance requirements like high-frequency trading. In the
main, eBPF comes into its own for tooling that instruments other
applications, as we’ve seen in this chapter.
7 Natália Ivánkó and Jed Salazar, Security Observability with eBPF (O’Reilly, 2022).
CHAPTER 7
Conclusion
I hope this short report has given you an understanding of eBPF and
why it’s so powerful. What I really hope is that you’re ready to try
out some eBPF-based tools for yourself!
If you want to dive deeper on the technical side, a good place
to start is ebpf.io, where you’ll find more information about the
technology and the eBPF Foundation. For coding examples, I have
some resources in my ebpf-beginners repository on GitHub.
To learn about how others are leveraging eBPF tools, join events like
eBPF Summit and Cloud Native eBPF Day where users share their
successes and learnings. There is an active Slack channel that you
can reach from ebpf.io/slack. I hope to see you there!
About the Author
Liz Rice is the chief open source officer with cloud native
networking and security specialists Isovalent, creators of the Cilium
eBPF-based networking project. She was chair of the CNCF’s Technical
Oversight Committee from 2019–2022, and was co-chair of
KubeCon + CloudNativeCon in 2018. She is also the author of Container
Security, published by O’Reilly. She has a wealth of software
development, team, and product management experience from working
on network protocols and distributed systems, and in digital
technology sectors such as VOD, music, and VoIP. When not writing
code, or talking about it, Liz loves riding bikes in places with better
weather than her native London, and competing in virtual races on
Zwift.