www.huawei.com
HUAWEI TECHNOLOGIES CO., LTD.
Performance Monitoring and Analysis
Using perf+BPF
Wang Nan (王楠)
2016/08/17
HUAWEI TECHNOLOGIES CO., LTD. Huawei proprietary. No spread without permission. Page 2
kTap overview
Announced at LinuxCon Japan 2013
Inspired by DTrace and SystemTap
In-kernel Lua virtual machine
Lua Frontend
Source: http://events.linuxfoundation.org/sites/events/files/lcjpcojp13_zhangwei.pdf
kTap overview
Almost got into Linux 3.13
But rejected by Ingo Molnar
Main reason: the kernel already has a BPF virtual machine
Source: http://lwn.net/Articles/572788/
Source: http://lwn.net/Articles/572793/
kTap overview
We restarted in 2014
            kTap          perf + BPF
VM          Lua VM        BPF
Language    Lua           C
Frontend    Stand-alone   perf
Upstream    Not yet       Almost done
Contents
Motivation
Highlight features
perf+BPF
Backward ring buffer
The next big thing
Summary
Motivation: Tracing
What happens in this frame?
[Figure: Android Draw Profiler. Source: https://developer.android.com/tools/performance/profile-gpu-rendering/index.html]
[Figure: TraceCompass result between the playbackStart and playbackEnd markers.]
Wow! RenderThread takes most of the cycles!
* systrace can do a similar thing, but we don't want to be bound to Android.
Motivation: Tracing
Basic workflow
Tracing:
# perf record -e sched:sched_switch --exclude-perf -e raw_syscalls:* --exclude-perf -a sleep 10
Converting (from perf.data to CTF):
# perf data convert --to-ctf ./out.ctf
Converting (from CTF to R table, thanks to libbabeltrace):
# python3 convert.py ./out.ctf > ./out.data
# R
...
> read.table("./out.data", header=TRUE, ...)
Formats and tools:
Format      Tools
perf.data   perf report, perf script, flamegraph
CTF         babeltrace, TraceCompass, ad-hoc Python script
R table     ad-hoc R script
Motivation: Tracing
Requirements for a tracer
Fine-grained (syscalls, sched:sched_switch, samples with call stack)
Affordable data volume
These two requirements conflict.
Filtering: drop unneeded records
Aggregating: profiling during recording (e.g. building a histogram)
Use BPF for both
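To make "aggregating during recording" concrete, here is a minimal plain-C sketch of a power-of-two histogram: each sample increments one counter instead of producing a trace record. The names log2_hist, log2_slot and hist_record are hypothetical and not taken from perf or the kernel; a real BPF script would keep the counters in a BPF map rather than in a C struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_SLOTS 64

/* Hypothetical aggregate: slot i counts values in [2^i, 2^(i+1));
 * the value 0 is also counted in slot 0. */
struct log2_hist {
    uint64_t slots[HIST_SLOTS];
};

/* Index of the highest set bit of v (0 for v == 0 or v == 1). */
static unsigned int log2_slot(uint64_t v)
{
    unsigned int slot = 0;

    while (v > 1) {
        v >>= 1;
        slot++;
    }
    return slot;
}

/* Record one sample: constant memory, no per-event output. */
static void hist_record(struct log2_hist *h, uint64_t value)
{
    assert(h != NULL);
    h->slots[log2_slot(value)]++;
}
```

Feeding every latency sample through hist_record keeps the data volume fixed at 64 counters regardless of how many events fire, which is the trade-off the slide points at.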
BPF (Berkeley Packet Filter)
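BPF for event filtering
[Diagram: an event (tracepoints, kprobes, uprobes) is passed to the BPF VM. If the BPF program returns 0 the event is dropped; otherwise it becomes a perf event and is written to perf.data.]
Rules: if pid == 1234 …
BPF byte code: … cmp r0, 1234 …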
BPF (Berkeley Packet Filter)
Features
In-kernel just-in-time compiling
X86_64 and ARM64
In-kernel verifier
Memory access
Restricted API
Not Turing complete: programs are guaranteed to halt
BPF maps
Workflow
myscript.c -> (llvm) -> byte code -> (bpf(2)) -> loaded -> (ioctl) -> attached
Improved BPF support
perf supports BPF scripts
perf record -e myscript.c -a
/* filter.c */
/* Helper declarations (get_current_pid_tgid, trace_printk) are omitted here. */
#define SEC(NAME) __attribute__((section(NAME), used))
SEC("func=sys_write")
int func(void *ctx) {
    u64 pid = get_current_pid_tgid() & 0xffffffff;
    char fmt[] = "pid %d calls sys_write";
    if (pid == 1234)
        return 1;   /* keep the event */
    else
        trace_printk(fmt, sizeof(fmt), pid);
    return 0;       /* drop the event */
}
# perf record -e ./filter.c -a sleep 10
# perf script
rtkit-daemon 1234 [002] 122474.586545: ...
rtkit-daemon 1234 [004] 122475.345923: ...
...
Loading .c requires an external clang:
# cat ~/.perfconfig
[llvm]
clang-path=/path/to/clang
Pre-compile to .o for smartphones:
# clang -c myscript.c -target bpf -O2
...
# perf record -e myscript.o ...
Improved BPF support
What can we do in BPF scripts?
Fetch arguments
Read from PMU through perf event
Output to perf.data
Improved BPF support
Fetch arguments
Fetching function parameters:
SEC("func=sys_write fd")
int func(void *ctx, int err, int fd) {
    ...
}
Fetching struct member:
SEC("func=null_lseek file->f_mode offset orig")
int bpf_func__null_lseek(void *ctx, int err,
                         unsigned long f_mode,
                         unsigned long offset,
                         unsigned long orig) {
    ...
}
Prologue generation (what perf generates in front of the handler above, as pseudocode for x86_64):
void *file = ctx->di;
unsigned long offset = ctx->si;
unsigned long orig = ctx->dx;
unsigned long f_mode;
bpf_probe_read(&f_mode, sizeof(f_mode), file + offsetof(struct file, f_mode));
Improved BPF support
Read from PMU through perf event
struct bpf_map_def SEC("maps") pmu_map = {
.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
.key_size = sizeof(int),
.value_size = sizeof(int),
.max_entries = __NR_CPUS__,
};
...
val = perf_event_read(&pmu_map, get_smp_processor_id());
...
# perf record -a -i -e cycles/period=0x7fffffffffffffff,name=cyc/ \
    -e './test_bpf_map_2.c/map:pmu_map.event=cyc/'
Improved BPF support
Output to perf.data
struct bpf_map_def SEC("maps") __bpf_stdout__ = {
.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
.key_size = sizeof(int),
.value_size = sizeof(u32),
.max_entries = __NR_CPUS__,
};
...
char output_str[] = "Raise a BPF event!";
err = perf_event_output(ctx, &__bpf_stdout__,
get_smp_processor_id(),
&output_str, sizeof(output_str));
...
# perf trace -e nanosleep --ev test_bpf_stdout.c usleep 1
0.007 ( 0.007 ms): usleep/729 nanosleep(rqtp: 0x7ffc5bbc5fe0) ...
0.007 ( ): __bpf_stdout__:Raise a BPF event!..)
0.008 ( ): perf_bpf_probe:func_begin:(ffffffff81112460))
0.069 ( ): __bpf_stdout__:Raise a BPF event!..)
0.070 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
0.072 ( 0.072 ms): usleep/729 ... [continued]: nanosleep()) = 0
Example: Per-function IPC
Question: what's the IPC (instructions per cycle) of function X?
/* uprobe at function entry: read instructions and cycles */
SEC("exec=/path/to/exec;"
    "f_entry=functionX")
int f_entry(void *ctx) {
    ...
    cycles = bpf_perf_event_read(&cycles_pmu, ...);
    instructions = bpf_perf_event_read(&instructions_pmu, ...);
    bpf_map_update_elem(... &cycles, ...);
    bpf_map_update_elem(... &instructions, ...);
    ...
    return 0;
}
/* uprobe at function exit: read again, compute, output */
SEC("exec=/path/to/exec;"
    "f_return=functionX%return")
int f_return(void *ctx) {
    ...
    cycles = bpf_perf_event_read(&cycles_pmu, ...);
    instructions = bpf_perf_event_read(&instructions_pmu, ...);
    ...
    bpf_output_trace_data(&output, sizeof(output));
    ...
    return 0;
}
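The "compute" step boils down to two subtractions and one division. The sketch below shows that arithmetic in plain user-space C, assuming the probe emits raw counter values taken at entry and at return; BPF programs cannot use floating point, so the division is assumed to happen during post-processing. The names pmu_snapshot and ipc_between are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of counter readings taken by one probe hit. */
struct pmu_snapshot {
    uint64_t cycles;
    uint64_t instructions;
};

/* IPC over the interval between two snapshots.
 * Returns 0.0 when no cycles elapsed, to avoid dividing by zero. */
static double ipc_between(struct pmu_snapshot at_entry,
                          struct pmu_snapshot at_return)
{
    uint64_t d_cycles, d_instructions;

    /* Counters are assumed to be monotonic between the two reads. */
    assert(at_return.cycles >= at_entry.cycles);
    assert(at_return.instructions >= at_entry.instructions);

    d_cycles = at_return.cycles - at_entry.cycles;
    d_instructions = at_return.instructions - at_entry.instructions;

    return d_cycles ? (double)d_instructions / (double)d_cycles : 0.0;
}
```

This model ignores migration between CPUs and nested or recursive calls to functionX; both would need extra bookkeeping in a real script.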
Tracing rare outliers
How to trace the outlier?
Continuous perf record
Offline searching
[Figure: Android Draw Profiler. What happens in this frame? Source: https://developer.android.com/tools/performance/profile-gpu-rendering/index.html]
Tracing rare outliers
How to trace the outlier?
Dump the ring buffer to perf.data when an outlier is observed!
Problem
Reading from the perf ring buffer
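Tracing rare outliers
perf ring buffer
[Diagram: Event 1, Event 2 and Event 3 stored along the write direction, with the write pointer and the read pointer marked. Each event consists of a header (type, size, …) followed by its payload.]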
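Tracing rare outliers
perf ring buffer: overwrite
Where to start reading?
[Diagram: Event 4 wraps around the end of the buffer and overwrites the beginning of Event 1. Only the write pointer is known, and what remains of Event 1 no longer starts with a valid header.]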
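Tracing rare outliers
Backward perf ring buffer
[Diagram: Event 1, Event 2 and Event 3 are written backward, each new event placed in front of the previous one. The write pointer therefore always sits on the header of the newest event; reading starts there and proceeds opposite to the write direction, from newest to oldest.]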
Tracing rare outliers
Backward perf ring buffer
Events 4, 3 and 2 can be retrieved
[Diagram: Event 4 wraps around and overwrites part of Event 1. Reading from the write pointer yields Event 4, Event 3 and Event 2 intact; the remains of Event 1 are discarded.]
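The backward-writing idea can be modelled in a few lines of user-space C. This is a simplified sketch, not the kernel's implementation or the real perf mmap layout: the record header (rec_hdr), the 64-byte buffer size and all function names are assumptions made for illustration. Records are written in front of the previous one, and the reader walks from the write position until it meets unwritten space or a record that no longer fits in the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RB_SIZE 64u              /* hypothetical size; must be a power of two */
#define RB_MASK (RB_SIZE - 1u)

/* Hypothetical record header; size covers header plus payload. */
struct rec_hdr {
    uint16_t id;
    uint16_t size;
};

struct backward_rb {
    uint8_t  data[RB_SIZE];      /* must start zero-filled */
    uint64_t head;               /* decreases on each write; newest header */
};

static void rb_copy_in(struct backward_rb *rb, uint64_t pos,
                       const void *src, size_t len)
{
    const uint8_t *s = src;

    for (size_t i = 0; i < len; i++)
        rb->data[(pos + i) & RB_MASK] = s[i];
}

static void rb_copy_out(const struct backward_rb *rb, uint64_t pos,
                        void *dst, size_t len)
{
    uint8_t *d = dst;

    for (size_t i = 0; i < len; i++)
        d[i] = rb->data[(pos + i) & RB_MASK];
}

/* Place a record in front of the previous one, overwriting the oldest data. */
static void rb_write(struct backward_rb *rb, uint16_t id,
                     const void *payload, uint16_t payload_len)
{
    struct rec_hdr hdr;

    assert(payload_len <= RB_SIZE - sizeof(struct rec_hdr));
    hdr.id = id;
    hdr.size = (uint16_t)(sizeof(struct rec_hdr) + payload_len);

    rb->head -= hdr.size;        /* unsigned wrap-around is intended */
    rb_copy_in(rb, rb->head, &hdr, sizeof(hdr));
    rb_copy_in(rb, rb->head + sizeof(hdr), payload, payload_len);
}

/* Walk from the newest record towards older ones; returns how many ids
 * were stored. Stops at unwritten space or a partially overwritten record. */
static size_t rb_read_ids(const struct backward_rb *rb,
                          uint16_t *ids, size_t max_ids)
{
    size_t n = 0;
    uint64_t consumed = 0;

    while (n < max_ids && consumed + sizeof(struct rec_hdr) <= RB_SIZE) {
        struct rec_hdr hdr;

        rb_copy_out(rb, rb->head + consumed, &hdr, sizeof(hdr));
        if (hdr.size < sizeof(hdr) || consumed + hdr.size > RB_SIZE)
            break;
        ids[n++] = hdr.id;
        consumed += hdr.size;
    }
    return n;
}
```

With 20-byte records in this 64-byte buffer, writing events 1 to 4 and then reading returns ids 4, 3 and 2, matching the slide: the header of Event 1 is still readable, but its size shows that it extends past the end of the valid region, so it is dropped.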
Tracing rare outliers
perf supports the backward ring buffer:
perf record --overwrite, --switch-output, --tail-synthesize
perf runs in the background
Silent: perf itself costs no CPU or I/O until a dump is requested
Capture fine-grained traces close to outliers
# perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output --tail-synthesize \
    dd if=/dev/zero of=/dev/null &
[1] 30498
# kill -s SIGUSR2 30498
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2016081303042824 ]
# kill -s SIGUSR2 30498
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2016081303054379 ]
# ls -l ./perf.data.*
-rw------- 1 root root 38637 Aug 13 03:04 ./perf.data.2016081303042824
-rw------- 1 root root 38581 Aug 13 03:05 ./perf.data.2016081303054379
Tracing rare outliers
Tracing-based profiling
How to find the outlier?
Dump data after the outlier is observed
perf record --overwrite --switch-output
Send SIGUSR2 to perf
Tracing rare outliers
Tracing-based profiling
How to find the outlier?
Dump data after the outlier is observed
perf record --overwrite
Send SIGUSR2 to perf
Good enough?
How to detect the outlier?
External profiler
How to trigger data dumping?
Script or by hand
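The external approach can be sketched as a small user-space helper: decide whether a measured interval exceeded its budget and, if so, send SIGUSR2 to the running perf process. This is only an illustration under stated assumptions: is_outlier, trigger_dump and the notion of a per-frame time budget are hypothetical, and how the external profiler obtains the timestamps and the perf PID is left open.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical detector: an interval is an outlier when it takes longer
 * than the budget (all values in nanoseconds). */
static int is_outlier(uint64_t start_ns, uint64_t end_ns, uint64_t budget_ns)
{
    assert(end_ns >= start_ns);
    return end_ns - start_ns > budget_ns;
}

/* Ask a running `perf record --overwrite --switch-output` to dump its
 * ring buffer. Returns the result of kill(2): 0 on success, -1 on error. */
static int trigger_dump(pid_t perf_pid)
{
    return kill(perf_pid, SIGUSR2);
}
```

A caller would invoke trigger_dump(perf_pid) as soon as is_outlier() returns non-zero; the delay between observing the outlier and sending the signal determines how much of the relevant trace is still in the ring buffer.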
Tracing rare outliers
Tracing-based profiling
How to find the outlier?
Dump data after the outlier is observed
perf record --overwrite
Send SIGUSR2 to perf
Good enough?
How to detect the outlier?
BPF script
How to trigger data dumping?
BPF script
The next big thing
Next big thing: integrating clang and LLVM into perf
Eliminate the external clang dependency: good for Android
Tracing, actions and data processing in one script: not only for profiling, also dynamic tuning
Better frontend: support one-liners, easy to use
[Diagram: perf architecture in four steps. (1) perf, script.c, clang. (2) Adds LLVM, script.o, a BPF loader and obj. (3) Adds JIT and perf hooks. (4) Accepts script.x and one-liners, and adds other frontends.]
Summary
Perf and BPF integration
Compile BPF using external clang
Load and attach to kernel event
Many BPF features
Backward ring buffer
perf runs in the background
Dump trace when SIGUSR2 is received
Combine them together
Improve perf to better support tracing-based profiling
Capture rare outliers
All of the above are ready in Linux v4.8-rc1
Questions?
Thank you www.huawei.com