"Learning eBPF" by Liz Rice
header file that defines some useful constants
eBPF is a mechanism that lets one dynamically load and run a piece of VERIFIED code inside the kernel, without changing the kernel source code or loading kernel modules.
This means that:
eBPF programs are event-driven: they run when the hook they are attached to is triggered.
Many eBPF frontends exist:
I chose BCC because I already know Python, and the examples in the book I am using are written with BCC.
Maps are data structures that share data among eBPF programs, or between eBPF programs and userland programs.
Since eBPF programs are forbidden from calling arbitrary kernel functions, the kernel provides several helper functions.
Helper functions can improve security and make the eBPF code kernel-version agnostic.
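from bcc import BPF

# eBPF program (restricted C); BCC compiles it when BPF() is constructed
prog = """
int hello(void *ctx) {
    bpf_trace_printk("Hello World!");
    return 0;
}
"""

b = BPF(text=prog)
# Resolve the kernel symbol name of the execve syscall on this machine
syscall = b.get_syscall_fnname("execve")
b.attach_kprobe(event=syscall, fn_name="hello")
# Print lines from the kernel trace pipe until interrupted
b.trace_print()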
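This demonstrates simple output with bpf_trace_printk.
Every program that calls bpf_trace_printk writes to the same trace pipe, so the output will likely be interleaved if several such eBPF programs run at the same time.
In the following examples I omit the parts that are the same as above.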
C part
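// Hash map; BCC defaults to u64 keys and u64 values
BPF_HASH(counter_table);

int hello(void *ctx) {
    // The lower 32 bits of the return value hold the UID
    u64 uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;
    u64 *p = counter_table.lookup(&uid);
    u64 counter = 0;

    // lookup() returns NULL when the key is not in the map yet
    if (p != 0) {
        counter = *p;
    }
    counter++;
    counter_table.update(&uid, &counter);
    return 0;
}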
Python part
from time import sleep

# Dump the whole map every two seconds
while True:
    sleep(2)
    s = ""
    for k, v in b["counter_table"].items():
        s += f"ID {k.value}: {v.value}\t"
    print(s)
C part
This example demonstrates BPF maps.
Notice that counter_table.lookup() is not valid C: BCC rewrites such map method calls into the corresponding BPF helper calls before compiling the code.
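The hello function above keeps only the lower 32 bits of bpf_get_current_uid_gid(), because that helper packs two 32-bit IDs into one 64-bit value (GID in the upper half, UID in the lower half). Here is a minimal pure-Python sketch of that arithmetic. The names pack_ids and unpack_ids are hypothetical and exist only for this illustration; they are not BCC or kernel functions.

```python
MASK32 = 0xFFFFFFFF


def pack_ids(upper, lower):
    """Pack two 32-bit IDs into one 64-bit integer (hypothetical helper)."""
    return ((upper & MASK32) << 32) | (lower & MASK32)


def unpack_ids(value):
    """Split a 64-bit integer into its (upper, lower) 32-bit halves."""
    return value >> 32, value & MASK32


# Made-up sample values: GID 1000 in the upper half, UID 501 in the lower half
packed = pack_ids(1000, 501)
gid, uid = unpack_ids(packed)
print(hex(packed), gid, uid)
```

The same shift-and-mask pattern applies to bpf_get_current_pid_tgid(), where the upper 32 bits are the process ID as user space sees it.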
Here I list some of the map types.
See /usr/include/linux/bpf.h for the full list.
Generic maps list:
What is PERCPU?
PERCPU marks the per-CPU variants of a map type, which is to say that the kernel uses a different block of memory for each CPU core’s version of that map.
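To make the per-CPU idea concrete, here is a small pure-Python model, not real BPF code. Each key owns one slot per CPU, a writer touches only the slot of the CPU it runs on, and the reader sums the slots. NUM_CPUS, increment, and total are hypothetical names chosen for this sketch.

```python
NUM_CPUS = 4  # assumed core count, for illustration only


def increment(percpu_map, key, cpu):
    """Bump the counter slot that belongs to the given CPU."""
    slots = percpu_map.setdefault(key, [0] * NUM_CPUS)
    slots[cpu] += 1


def total(percpu_map, key):
    """Aggregate all per-CPU slots of a key into a single value."""
    return sum(percpu_map.get(key, [0] * NUM_CPUS))


counts = {}
increment(counts, 501, cpu=0)
increment(counts, 501, cpu=0)
increment(counts, 501, cpu=3)
print(counts[501])         # one slot per CPU
print(total(counts, 501))  # aggregated view
```

Because no two CPUs ever write to the same slot, the update path needs no cross-CPU synchronization; the cost moves to the reader, which has to aggregate.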
What are LPM and TRIE?
LPM: longest prefix match.
TRIE: prefix tree.
The name "trie" comes from the middle syllable of "retrieval".
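As an illustration of what a longest prefix match over a prefix tree does, here is a small pure-Python sketch for IPv4 prefixes. It is a model of the lookup semantics only, not the kernel's implementation; LpmTrie and its methods are hypothetical names.

```python
import ipaddress


class LpmTrie:
    """Toy binary trie keyed by the bits of an IPv4 prefix."""

    def __init__(self):
        # Each node is a dict with optional "0"/"1" children and a "value"
        self.root = {}

    def insert(self, prefix, value):
        net = ipaddress.ip_network(prefix)
        bits = format(int(net.network_address), "032b")[:net.prefixlen]
        node = self.root
        for bit in bits:
            node = node.setdefault(bit, {})
        node["value"] = value

    def lookup(self, address):
        bits = format(int(ipaddress.ip_address(address)), "032b")
        node = self.root
        best = node.get("value")
        for bit in bits:
            node = node.get(bit)
            if node is None:
                break
            # A deeper node with a value is a longer, more specific match
            best = node.get("value", best)
        return best


trie = LpmTrie()
trie.insert("10.0.0.0/8", "coarse")
trie.insert("10.1.0.0/16", "finer")
trie.insert("10.1.2.0/24", "finest")
print(trie.lookup("10.1.2.3"))
print(trie.lookup("10.9.9.9"))
print(trie.lookup("192.168.0.1"))
```

The lookup walks the address bit by bit and remembers the last stored value it passed, so the most specific matching prefix wins.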
Non-generic maps list:
Note that BPF_HISTOGRAM is not a map type implemented in the kernel; it is implemented in BCC.
See this pull request that implements the BPF_HISTOGRAM
C part
BPF_PERF_OUTPUT(output);

// Layout of one event sent to user space
struct data_t {
    int pid;
    int uid;
    char command[16];
    char message[12];
};

int hello(void *ctx) {
    struct data_t data = {};
    char message[12] = "Hello World";

    // Upper 32 bits: process ID; lower 32 bits of uid_gid: user ID
    data.pid = bpf_get_current_pid_tgid() >> 32;
    data.uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;

    // Name of the executable running in the current process
    bpf_get_current_comm(&data.command, sizeof(data.command));
    bpf_probe_read_kernel(&data.message, sizeof(data.message), message);

    output.perf_submit(ctx, &data, sizeof(data));
    return 0;
}
Python part
# Callback invoked once for every event read from the perf buffer
def print_event(cpu, data, size):
    data = b["output"].event(data)
    print(f"{data.pid} {data.uid} {data.command.decode()} "
          f"{data.message.decode()}")

b["output"].open_perf_buffer(print_event)
while True:
    b.perf_buffer_poll()
BPF_PERF_OUTPUT(): creates a map for sending output to user space. It is a better choice than bpf_trace_printk.
perf_submit(): puts data into the map.
A better way still to output is to use BPF_RINGBUF_OUTPUT.
BPF_PERF_OUTPUT differs from BPF_PERF_ARRAY. They are not the same thing.
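In the Python part above, b["output"].event(data) turns the raw bytes of one record into an object whose fields mirror struct data_t. The following sketch shows the same idea with a hand-written ctypes structure and fabricated sample bytes, so it runs without BCC or root. DataT is a hypothetical name; as far as I understand, BCC derives an equivalent class from the C struct by itself.

```python
import ctypes


class DataT(ctypes.Structure):
    """Hand-written mirror of struct data_t from the C part."""
    _fields_ = [
        ("pid", ctypes.c_int),
        ("uid", ctypes.c_int),
        ("command", ctypes.c_char * 16),
        ("message", ctypes.c_char * 12),
    ]


# Fabricated sample standing in for one record from the perf buffer
raw = bytes(DataT(1234, 501, b"bash", b"Hello World"))

# Reinterpret the raw bytes as a structure, field by field
event = DataT.from_buffer_copy(raw)
print(event.pid, event.uid, event.command.decode(), event.message.decode())
```

The char arrays come back as bytes truncated at the first NUL, which is why the callback has to call decode() on them.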
At this point, we have some fundamental knowledge of eBPF, and of BCC specifically.
Variant 1
from bcc import BPF

prog = """
// Ordinary helper function, called directly from hello()
static void world() {
    bpf_trace_printk("world");
}

int hello(void* ctx) {
    bpf_trace_printk("hello");
    world();
    return 0;  // a non-void eBPF program must return a value
}
"""

b = BPF(text=prog)
syscall = b.get_syscall_fnname("execve")
b.attach_kprobe(event=syscall, fn_name="hello")
b.trace_print()
Variant 2
from bcc import BPF
import ctypes

prog = """
// Program array: maps an index to another eBPF program
BPF_PROG_ARRAY(funcs, 300);

int world(void* ctx) {
    bpf_trace_printk("world");
    return 0;
}

int hello(void* ctx) {
    bpf_trace_printk("hello");
    // Tail call into the program stored at index 1
    funcs.call(ctx, 1);
    return 0;
}
"""

b = BPF(text=prog)
syscall = b.get_syscall_fnname("execve")
b.attach_kprobe(event=syscall, fn_name="hello")

# Load world() as its own program and store its file descriptor at index 1
world_fn = b.load_func("world", BPF.KPROBE)
prog_array = b.get_table("funcs")
prog_array[ctypes.c_int(1)] = ctypes.c_int(world_fn.fd)

b.trace_print()
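The difference between the two variants is that funcs.call() in Variant 2 is a tail call: when the slot holds a program, that program replaces the caller and control never comes back; when the slot is empty, the caller simply continues. The pure-Python model below only imitates these semantics. tail_call, run, and the _TailCall exception are hypothetical names made up for the sketch and have nothing to do with how the kernel implements tail calls.

```python
class _TailCall(Exception):
    """Signals that execution should jump to another program."""

    def __init__(self, target):
        super().__init__()
        self.target = target


def tail_call(prog_array, index):
    target = prog_array.get(index)
    if target is not None:
        raise _TailCall(target)  # control never returns to the caller
    # Empty slot: fall through and let the caller continue


def run(prog, prog_array, log):
    """Run a program, following tail calls until one returns normally."""
    while True:
        try:
            return prog(prog_array, log)
        except _TailCall as jump:
            prog = jump.target


def world(prog_array, log):
    log.append("world")
    return 0


def hello(prog_array, log):
    log.append("hello")
    tail_call(prog_array, 1)
    log.append("slot empty, still in hello")
    return 0


with_target = []
run(hello, {1: world}, with_target)
print(with_target)

without_target = []
run(hello, {}, without_target)
print(without_target)
```

In the first run the line after tail_call() is never reached, which is the behavior that sets a tail call apart from the ordinary function call in Variant 1.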
Except where otherwise noted, this site is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License