This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: stptracer-20060828 has released.
- From: Li Guanglei <guanglei at cn dot ibm dot com>
- To: Li Guanglei <guanglei at cn dot ibm dot com>
- Cc: Masami Hiramatsu <masami dot hiramatsu dot pt at hitachi dot com>, SystemTAP <systemtap at sources dot redhat dot com>, "Jose R. Santos" <jrs at us dot ibm dot com>, Satoshi Oshima <soshima at redhat dot com>, Hideo Aoki <haoki at redhat dot com>, Yumiko Sugita <yumiko dot sugita dot yf at hitachi dot com>
- Date: Thu, 31 Aug 2006 16:57:53 +0800
- Subject: Re: stptracer-20060828 has released.
- Organization: IBM CSTL
- References: <44F58A51.6060806@hitachi.com> <44F69E3C.10308@cn.ibm.com>
Li Guanglei wrote:
[...]
Here you can see that gBTI is around 23% faster than _stp_printf(). In
fact a long time ago I used oprofile to find out where _stp_printf()
cost cpu time, and it showed that a lot time is spent on _stp_vsnprintf().
Attached is the profiling data I got a minute ago by running
the tests I mentioned in my mail. I only kept the leading few lines of
the generated profiling file.
oprofile.all is generated by:
opreport -p STP_MODULE_DIR -lg > oprofile.all
oprofile.kernel is generated by:
opreport /boot/vmlinux-`uname -r`.debug -lg > oprofile.kernel
- Guanglei
CPU: ppc64 POWER5, speed 1502.5 MHz (estimated)
Counted CYCLES events (Processor Cycles using continuous sampling) with a unit mask of 0x00 (No unit mask) count 100000
samples % linenr info symbol name
102151 19.5493 op_model_power4.c:183 .hypervisor_bucket
43215 8.2703 (no location information) restore
34929 6.6846 kprobes.c:442 .kprobe_exceptions_notify
24473 4.6835 (no location information) .memcpy
19136 3.6622 sys.c:1351 .sys_getsid
16137 3.0882 (no location information) single_step_pSeries
14028 2.6846 spinlock.c:154 ._spin_lock
13476 2.5790 sys.c:133 .notifier_call_chain
12093 2.3143 (no location information) program_check_common
12072 2.3103 (no location information) syscall_error_cont
11440 2.1893 (no location information) program_check_pSeries
10630 2.0343 (no location information) .__copy_tofrom_user
8283 1.5852 spinlock.c:66 ._read_lock
8176 1.5647 (no location information) .ret_from_except_lite
8027 1.5362 buffer.c:1433 .__find_get_block
7561 1.4470 (no location information) single_step_common
7189 1.3758 (no location information) system_call_common
6229 1.1921 hash.h:33 .get_kprobe
5973 1.1431 (no location information) system_call_pSeries
5599 1.0715 (no location information) .save_nvgprs
5338 1.0216 traps.c:747 .program_check_exception
5307 1.0156 sys.c:220 .atomic_notifier_call_chain
5180 0.9913 buffer.c:1269 .(bool, long double,...)(long, short,...)
4925 0.9425 sstep.c:56 .emulate_step
4856 0.9293 locks.c:29 .__spin_yield
4427 0.8472 buffer.c:80 .unlock_buffer
3944 0.7548 (no location information) .ret_from_except
3129 0.5988 slab.c:3371 .kmem_cache_free
3035 0.5808 filemap.c:1978 .generic_file_buffered_write
2927 0.5602 traps.c:502 .single_step_exception
2839 0.5433 (no location information) system_call
2827 0.5410 (no location information) syscall_exit
2765 0.5292 pid.c:209 .find_pid
2466 0.4719 kprobes.c:121 .set_current_kprobe
2370 0.4536 slab.c:3145 .kmem_cache_alloc
1931 0.3695 (no location information) .memset
1872 0.3583 wait.c:241 .bit_waitqueue
1825 0.3493 mutex.c:193 .__mutex_lock_slowpath
1804 0.3452 bitops.c:19 .find_next_bit
1797 0.3439 fs-writeback.c:56 .__mark_inode_dirty
1757 0.3362 spinlock.c:73 ._spin_lock_irqsave
1666 0.3188 filemap.c:507 .unlock_page
1616 0.3093 op_model_power4.c:191 .kernel_unknown_bucket
1542 0.2951 page_alloc.c:885 .get_page_from_freelist
CPU: ppc64 POWER5, speed 1502.5 MHz (estimated)
Counted CYCLES events (Processor Cycles using continuous sampling) with a unit mask of 0x00 (No unit mask) count 100000
samples % linenr info image name app name symbol name
738217 44.0662 vsprintf.c:122 lgl.ko lgl ._stp_vsnprintf
133887 7.9921 vsprintf.c:19 lgl.ko lgl .skip_atoi
102151 6.0977 op_model_power4.c:183 vmlinux vmlinux .hypervisor_bucket
67766 4.0451 (no location information) jbd jbd (no symbols)
43215 2.5796 (no location information) vmlinux vmlinux restore
39527 2.3595 lgl.c:7416 lgl.ko lgl .enter_probe_1221
37232 2.2225 (no location information) ext3 ext3 (no symbols)
36332 2.1688 print.c:91 lgl.ko lgl ._stp_print_flush
35788 2.1363 lgl.c:977 lgl.ko lgl .common_probe_prologue
34929 2.0850 kprobes.c:442 vmlinux vmlinux .kprobe_exceptions_notify
24473 1.4609 (no location information) vmlinux vmlinux .memcpy
24229 1.4463 string.c:57 lgl.ko lgl ._stp_sprintf
19136 1.1423 sys.c:1351 vmlinux vmlinux .sys_getsid
17850 1.0655 lgl.c:999 lgl.ko lgl .common_probe_epilogue
16137 0.9633 (no location information) vmlinux vmlinux single_step_pSeries
14028 0.8374 spinlock.c:154 vmlinux vmlinux ._spin_lock
13476 0.8044 sys.c:133 vmlinux vmlinux .notifier_call_chain
12093 0.7219 (no location information) vmlinux vmlinux program_check_common
12072 0.7206 (no location information) vmlinux vmlinux syscall_error_cont
11440 0.6829 (no location information) vmlinux vmlinux program_check_pSeries
10630 0.6345 (no location information) vmlinux vmlinux .__copy_tofrom_user
10415 0.6217 (no location information) oprofiled oprofiled (no symbols)
8283 0.4944 spinlock.c:66 vmlinux vmlinux ._read_lock
8176 0.4880 (no location information) vmlinux vmlinux .ret_from_except_lite
8027 0.4792 buffer.c:1433 vmlinux vmlinux .__find_get_block
7561 0.4513 (no location information) vmlinux vmlinux single_step_common