This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Test suite results for ARM with uprobes


Hi Wade,

On Tue, 2011-12-06 at 12:45 -0700, Wade Farnsworth wrote:
> I ran the systemtap test suite on a beagleboard with uprobes support 
> enabled.

Thanks!

>   However, I'm running into several errors.
> 
> My configuration is as follows:
> * systemtap git rev f52d32a9f57d228627ee08e39f0bbcf3f3faae20.
> * kernel 2.6.37.6
> * gcc is 4.5.1

I don't yet have a setup that includes arm uprobes, sorry. But I do have
results for the non-uprobe side of things based on almost the same git
checkout, kernel 2.6.40.4-6.fc15.armv7hl.tegra and gcc (GCC) 4.6.1
20110908 (Red Hat 4.6.1-9).

> The full logs may be found at:
> 
> http://dl.dropbox.com/u/40714612/stap-logs-beagleboard-20111206.tar.gz
> 
> I realize this is quite a large list to work through, but I would be 
> very appreciative if anyone could shed any light on them.  If there's 
> anything obvious that I'm missing I'd be glad to hear it.  Otherwise, 
> I'll continue digging at them.

My results can be found at:
http://web.elastic.org/~dejazilla/viewsummary.php?summary=%3D%27%3C20111202161840.2C07549CD7%40springer.wildebeest.org%3E%27
Let's compare at least the tests that don't depend on uprobes.
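
For a side-by-side comparison, the two DejaGnu summaries can be diffed mechanically. The following is a minimal Python sketch, not part of the systemtap testsuite: the names parse_sum and compare are made up for illustration, and stripping the trailing parenthesised detail from a test name is a heuristic that may merge distinct subtests.

```python
import re

# DejaGnu result lines look like "FAIL: name (details)".
RESULT_RE = re.compile(
    r"^(PASS|FAIL|XPASS|XFAIL|KPASS|KFAIL|UNTESTED|UNSUPPORTED|UNRESOLVED): (.+)$"
)


def parse_sum(text):
    """Map each test name to the last status reported for it."""
    results = {}
    for line in text.splitlines():
        m = RESULT_RE.match(line.strip())
        if m:
            # Heuristic: drop a trailing " (detail)" so that
            # "debugpath-good (eof)" and "debugpath-good" compare equal.
            name = m.group(2).split(" (", 1)[0]
            results[name] = m.group(1)
    return results


def compare(ours, theirs):
    """Return {name: (our_status, their_status)} for tests run on both
    machines whose outcomes differ."""
    common = sorted(ours.keys() & theirs.keys())
    return {n: (ours[n], theirs[n]) for n in common if ours[n] != theirs[n]}
```

Tests that only one side ran (for example everything that needs uprobes) are left out of the result, so what remains is the list of outcomes that actually differ between the two boards.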

> First, four tests are causing hangs, panics or other kernel errors.

In general I found that arm kernels pre-3.0 (2.6.40 in fedora speak)
were somewhat unstable. There were lots of kprobe cleanups in 3.0.

> I 
> had to disable these tests in order to proceed further:
> 
> * systemtap.base/pr10854.exp results in the following:
> INIT: Id "S" respawning too fast: disabled for 5 minutes
> Kernel panic - not syncing: Attempted to kill init!

This one takes 60 seconds, but does PASS for me.

> * systemtap.base/itrace.exp will hang in run_ls_1_sec at the line "catch 
> {wait -i $exe_id}".  It looks like the ls process never exits, so we 
> just spin on this line.  If I comment out this line, I hit the kernel 
> BUG in linux/kernel/exit.c:forget_original_parent() followed by a NULL 
> pointer dereference in do_exit().

This one is UNTESTED for me because no kernel utrace support was found.

> * systemtap.examples/check.exp gets an "Unable to handle kernel paging 
> request at virtual address" error

This is a lot of tests. It is a run of all the example scripts we ship
with systemtap. It takes 52 minutes to run this whole test on my setup.
Although there are 24 UNTESTED runs in this test, all other 162 PASS.
There are no FAILs here.

> * systemtap.unprivileged/unprivileged_myproc.exp hits the BUG at 
> runtime/uprobes2/uprobes.c:uprobe_free_task()

I am unable to run this test because I don't have uprobes set up here
yet.

> I'm guessing that there may be some bugs in the ARM uprobes 
> implementation, but I haven't yet tracked these down any further.
> 
> Once these tests have been disabled, the rest of the test suite will 
> complete.  Here are some of the failures that I'm not sure about and 
> some relevant notes:
> 
> FAIL: cast-scope-m32-O
> FAIL: cast-scope-m32-O2
> [WARNING: Can't parse SDT_V3 operand '[fp,': identifier '$arg1' at 
> /home/root/stuff/systemtap/testsuite/systemtap.base/cast-scope.stp:15:32]
> FAIL: cxxclass-m32
> FAIL: cxxclass-m32-O
> FAIL: cxxclass-m32-O2
> [Got "semantic error: unable to find local 'arg1' near pc 0x842c  in 
> <unknown> 
> /home/root/stuff/systemtap/testsuite/systemtap.base/cxxclass.cxx ( 
> (alternatives: $i $inst): identifier '$arg1' at 
> /home/root/stuff/systemtap/testsuite/systemtap.base/cxxclass.stp:13:24"]

Although I am unable to actually run these tests, this looks like a
genuine bug in the SDT probes parser. I filed a bug report:
http://sourceware.org/bugzilla/show_bug.cgi?id=13474

> FAIL: debugpath-good (eof) [This one puzzles me.  I've built with 
> CONFIG_DEBUG_INFO enabled and the build directory exists in 
> /lib/modules/`uname -r`/build.  Is there something else that I need to 
> do to get systemtap access to the debug info?]

It fails because:
spawn env SYSTEMTAP_DEBUGINFO_PATH=/lib/modules/2.6.37.6-yocto-standard+/build stap -e probe kernel.function("vfs_read") {} -wp2
semantic error: missing arm kernel/module debuginfo under '/lib/modules/2.6.37.6-yocto-standard+/build' while resolving probe point kernel.function("vfs_read")
Pass 2: analysis failed.  Try again with another '--vp 01' option.
FAIL: debugpath-good (eof)

It PASSes for me with:
spawn env SYSTEMTAP_DEBUGINFO_PATH=/usr/lib/debug stap -e probe kernel.function("vfs_read") {} -wp2
# probes
kernel.function("vfs_read@fs/read_write.c:306") /* pc=_stext+0x1445e0 */ /* <- kernel.function("vfs_read") */
PASS: debugpath-good

The debugpath.exp testcase does:

# Guess where debuginfo is installed
if [file isdirectory /usr/lib/debug] {
  set debuginfo_path "/usr/lib/debug"
} elseif [file isdirectory /lib/modules/$uname/build] {
  set debuginfo_path "/lib/modules/$uname/build"
} else {
  set debuginfo_path "/lib/modules/$uname"
}

So I guess it is guessing wrong on your setup? It only checks that a
directory exists, not that there is any debuginfo in it.
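
One possible direction is to prefer the first candidate directory that actually holds something usable. The Python sketch below only models that idea and is not a patch for debugpath.exp: guess_debuginfo_path and contains_vmlinux are hypothetical names, and treating the presence of a file called vmlinux as a sign of usable debuginfo is an unverified assumption.

```python
import os


def contains_vmlinux(directory):
    """Assumed signal for usable kernel debuginfo: a file named vmlinux
    somewhere below the directory."""
    for _root, _dirs, files in os.walk(directory):
        if "vmlinux" in files:
            return True
    return False


def guess_debuginfo_path(uname, candidates=None, has_debuginfo=contains_vmlinux):
    """Pick a debuginfo directory, trying the same candidates in the same
    order as the testcase."""
    if candidates is None:
        candidates = [
            "/usr/lib/debug",
            f"/lib/modules/{uname}/build",
            f"/lib/modules/{uname}",
        ]
    existing = [d for d in candidates if os.path.isdir(d)]
    # Prefer a directory that passes the debuginfo check.
    for d in existing:
        if has_debuginfo(d):
            return d
    # Otherwise fall back to the current behaviour: first existing
    # directory, or the last candidate if none exists.
    return existing[0] if existing else candidates[-1]
```

When no candidate passes the check, the result is the same as with the existing Tcl logic, so the change would only affect setups where a later candidate holds the debuginfo.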

> FAIL: flightrec2 (log file size (5 != 3 + 3)) [stat: cannot stat 
> `flightlog.out.1': No such file or directory]

Also FAILs for me:
FAIL: flightrec2 (log file size (5 != 3 + 3))

> FAIL: global_end (11) [Looks like the queue stats being taken in 
> global_end2.stp is not occurring]

Also FAILs for me:
FAIL: global_end (11)

> FAIL: gtod (0) [events marked "kern" are occurring before events marked 
> "appl"]

Also FAILs for me:
FAIL: gtod (0)
I suspect this testcase is a little flaky; I have seen it fail
occasionally on other setups too.

> FAIL: inlinedvars-m32-O
> FAIL: inlinedvars-m32-O2
> [line 1: expected "call (22,84)"
> Got "  (84,22)"]

That is interesting, it switches the numbers around?
It obviously doesn't run for me. It would be interesting to see the
debuginfo debug-dump of the created binary, the disassembly of function
"m" and the generated stap script C source code. Could you create a bug
report with that info in it?

> FAIL: library sdt_misc * (0 != 15)
> FAIL: library sdt_misc *libsdt* (0 != 15)
> FAIL: library sdt_misc ./libsdt.so (0 != 15)
> FAIL: library printf --ldd (0) (0 != 4)
> [Looks like systemtap needs to be made aware of the ARM library loader]

Yes, I think so.

> FAIL: OVERLOAD2 no expected error

Same here:
FAIL: OVERLOAD2 no expected error

> FAIL: probefunc:.statement.(0xaddr).absolute shutdown (eof) [systemtap 
> appears to be unable to translate the address back into the 
> corresponding symbol.  Could be related to the debugpath failure?]

Indeed, it does not seem to translate the address back into the
corresponding symbol, but I don't think that it is because of the
debugpath failure. I don't immediately know what it is though. This and
all other probefunc tests PASS for me BTW (the answer is
scheduler_tick).

> FAIL: systemtap.base/process_by_cmd.stp -c ./process_by_cmd [Oddly, 
> systemtap is picking up a function return before the process starts. 
> Possibly it's coming from libc?]

I am unable to run this test.

> FAIL: sdt -O2  uprobe
> FAIL: sdt c89  uprobe
> FAIL: sdt c99  uprobe
> FAIL: sdt c99 -pedantic uprobe
> FAIL: sdt gnu99  uprobe
> FAIL: sdt gnu99 -pedantic uprobe
> FAIL: sdt c++98  uprobe
> FAIL: sdt c++98 -pedantic uprobe
> FAIL: sdt gnu++98  uprobe
> FAIL: sdt gnu++98 -pedantic uprobe
> FAIL: sdt c++0x  uprobe
> FAIL: sdt c++0x -pedantic uprobe
> FAIL: sdt gnu++0x  uprobe
> FAIL: sdt gnu++0x -pedantic uprobe
> [semantic error: unable to find local 'arg1' near pc 0x8408  in  call1 
> /home/root/stuff/systemtap/testsuite/systemtap.base/sdt.c ( 
> (alternatives: $a): identifier '$arg1' at 
> /home/root/stuff/systemtap/testsuite/systemtap.base/sdt.stp:8:18.]
> 
> FAIL: sdt_va_args base
> FAIL: sdt_va_args c89
> FAIL: sdt_va_args c99
> FAIL: sdt_va_args gnu99
> FAIL: sdt_va_args c++98
> FAIL: sdt_va_args gnu++98
> FAIL: sdt_va_args c++0x
> FAIL: sdt_va_args gnu++0x
> [similar to the sdt failures above]

I am unable to run these tests, but it would be interesting to see
debuginfo dump, disassembly and generated stap script C source for
these.

> FAIL: compiling sdt.c c89 -pedantic uprobe 
> [/home/root/stuff/systemtap/testsuite/systemtap.base/sdt.c:67:3: error: 
> string length '518' is greater than the length '509' ISO C90 compilers 
> are required to support]

Known issue with older GCC. Fixed in newer GCC versions.

> ERROR: tcl error sourcing 
> /home/root/stuff/systemtap/testsuite/systemtap.base/sdt_misc.exp [ERROR: 
> child process exited abnormally]

I cannot run this test, and I don't have a clue what is going wrong here.

> FAIL: stmt_rel line numbers [semantic error: multiple addresses for...]

Bleah, that keeps popping up. It is very dependent on the compiler
generated code. Wish we could do something better here. It happens to
PASS on my kernel build though.

> FAIL: 32_BIT_UTRACE_SYSCALL_ARGS startup (eof) [Warning: child process 
> exited with signal 4 (Illegal instruction)]

I don't have a UTRACE-enabled kernel here.

> FAIL: vta-test-m32-O
> FAIL: vta-test-m32-O2
> [semantic error: failed to retrieve location attribute for local 'a' 
> (dieoffset: 0x181): identifier '$a' at 
> /home/root/stuff/systemtap/testsuite/systemtap.base/vta-test.stp:2:27]

I cannot run these tests. It would be interesting to see the debuginfo
dump for the generated binary.

> FAIL: dtrace_clone2 compilation
> FAIL: dtrace_clone4 compilation
> [WARNING: Can't parse SDT_V3 operand 'r3': identifier '$arg1' at 
> /home/root/stuff/systemtap/testsuite/systemtap.clone/dtrace_clone.stp:6:47]
>
> FAIL: dtrace_fork_exec2 compilation
> FAIL: dtrace_fork_exec4 compilation
> FAIL: dtrace_vfork_exec2 compilation
> FAIL: dtrace_vfork_exec4 compilation
> [Fail similarly to dtrace_clone]

Same as for the other SDT parser bug above. I filed:
http://sourceware.org/bugzilla/show_bug.cgi?id=13475

> FAIL: backtrace of yyy_func2 (0)
> FAIL: print_stack of yyy_func2 (0)
> FAIL: backtrace of yyy_func3 (0)
> FAIL: print_stack of yyy_func3 (0)
> FAIL: backtrace of yyy_func4 (0)
> FAIL: print_stack of yyy_func4 (0)
> FAIL: print_stack didn't find systemtap_test_module1 (0)
> FAIL: print_stack didn't find [kernel] (0)
> FAIL: function arguments: unexpected timeout
> FAIL: all pid tests - unexpected EOF
> FAIL: function arguments -- numeric: compilation failed
> FAIL: function arguments -- numeric --kelf --ignore-dwarf: compilation 
> failed
> [semantic error: no match while resolving probe point 
> module("systemtap-test-module2").function("yyy_int")]

These all PASS for me (this was what I worked on last week).
I don't know why, but I know Will was seeing something similar:
http://www.sourceware.org/bugzilla/show_bug.cgi?id=13022
For some reason, it seems fine on my setup, which is why I haven't
really looked into this yet.

> FAIL: usymbols m32
> FAIL: usymbols m32-O
> FAIL: usymbols m32-O2
> [line 2: expected "handler: lib_handler (.+/libusymbols-m32.so)", Got 
> "handler: 0x400b450c (/home/root/stuff/test/libusymbols-m32.so)"]

Symbol lookup again, but this time user space. I have these UNTESTED.

> FAIL: buildok/pretty.stp [Looks like I need to rerun this with the 
> systemtap debug info installed]

This one passes for me:
PASS: buildok/pretty.stp
But only because the testcase skips UTRACE dependent tests.

It looks like the testcase might be using the wrong stap:
WARNING: cannot find module /usr/bin/stap debuginfo: No DWARF
information found
I don't have systemtap itself installed on my machine.

Yep, it probably does; buildok/pretty.stp has:
  probe process("stap").function("parse_cmdline") {
...
Hmm, I need to think about how to fix that.

> FAIL: buildok/process_test.stp [semantic error: unable to find local 
> 'signr' near pc 0xc005f5b4  in  handle_signal...]

Same here.
Looks like tapset/signal.stp handle_signal is wrong for ARM.

> FAIL: buildok/scheduler-all-probes.stp [semantic error: no match while 
> resolving probe point kernel.function("__switch_to")]

Same here.
Looks like there is no __switch_to() in ARM kernels.
Note that tapset/scheduler.stp has a couple of:
%( arch != "x86_64" && arch != "ia64" %?
        kernel.function("__switch_to")
%:
        kernel.function("context_switch")
%)
So we need to adjust that for ARM.

> FAIL: buildok/seventeen.stp [semantic error: unable to find local 
> 'nfs_program' near pc 0xc01bc714  in  nfs_fsync_dir]

This one does PASS for me. We should probably compare generated
debuginfo for our kernels.

> FAIL: buildok/syscalls-arch-detailed.stp
> [semantic error: probe point mismatch at position 1]

Same here. We are missing syscall.altstack on ARM it seems.

> FAIL: semok/config_number.stp [semantic error: probe point mismatch at 
> position 0]

This one PASSes for me.
What is CONFIG_NR_CPUS set to for your kernel?
It is set to 2 for me.

> FAIL: semok/mangled.stp
> FAIL: semok/pretty.stp
> [similar to buildok/pretty.stp]
>
> FAIL: semok/twentyseven.stp [semantic error: no match while resolving 
> probe point module("no_such_module").function("no_such_function")]
> 
> FAIL: systemtap.stress/current.stp compilation [similar to 
> buildok/process_test.stp]
> 
> FAIL: 32-bit access nd_syscall
> FAIL: 32-bit acct nd_syscall
> FAIL: 32-bit alarm nd_syscall
> FAIL: 32-bit chmod nd_syscall
> FAIL: 32-bit clock nd_syscall
> FAIL: 32-bit dir nd_syscall
> FAIL: 32-bit forkwait nd_syscall
> FAIL: 32-bit futimes nd_syscall
> FAIL: 32-bit itimer nd_syscall
> FAIL: 32-bit link nd_syscall
> FAIL: 32-bit mmap nd_syscall
> FAIL: 32-bit mount nd_syscall
> FAIL: 32-bit net1 nd_syscall
> FAIL: 32-bit openclose nd_syscall
> FAIL: 32-bit readwrite nd_syscall
> FAIL: 32-bit rt_signal nd_syscall
> FAIL: 32-bit select nd_syscall
> FAIL: 32-bit sendfile nd_syscall
> FAIL: 32-bit signal nd_syscall
> FAIL: 32-bit stat nd_syscall
> FAIL: 32-bit statfs nd_syscall
> FAIL: 32-bit swap nd_syscall
> FAIL: 32-bit sync nd_syscall
> FAIL: 32-bit timer nd_syscall
> FAIL: 32-bit trunc nd_syscall
> FAIL: 32-bit uid nd_syscall
> FAIL: 32-bit umask nd_syscall
> FAIL: 32-bit unlink nd_syscall

We are only able to get the arguments of these syscalls when they are
in registers; whenever an argument spills onto the stack we ERROR out.
This is the code in tapset/arm/registers.stp _stp_arg().
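
To make the register/stack split concrete, here is a small Python model of where the ARM procedure call standard places word-sized arguments at function entry: the first four go in r0 to r3 and the rest on the stack. It is a simplified sketch, not the tapset code; arg_location is a made-up name, and 64-bit arguments (which take aligned register pairs) are ignored.

```python
NUM_ARG_REGS = 4  # r0..r3 carry the first four word-sized arguments
WORD_SIZE = 4     # bytes per argument slot on 32-bit ARM


def arg_location(n):
    """Return where the n-th (1-based) word-sized argument lives at
    function entry, before the prologue has adjusted sp."""
    if n < 1:
        raise ValueError("argument numbers start at 1")
    if n <= NUM_ARG_REGS:
        return ("reg", f"r{n - 1}")
    # Arguments five and up sit on the stack, lowest address first.
    return ("stack", (n - NUM_ARG_REGS - 1) * WORD_SIZE)
```

Under these assumptions a fifth or sixth argument would be read from sp+0 or sp+4 at function entry, which is the case that currently ends in an ERROR instead.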

> FAIL: 32-bit alarm syscall
> FAIL: 32-bit stat syscall

Both also fail for me, but I haven't investigated yet.

Cheers,

Mark

