This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Systemtap do_filp_open failure on a few linux packages


Dear all,

I have experienced a very strange issue related to systemtap.
Any insights or help you might be able to provide to help me debug this
further would be most appreciated.

I am developing a linux distribution called KaarPux:
http://kaarpux.kaarposoft.dk/

Using a few scripts, some 600+ linux packages are built and installed.
Generally, this works like a charm.

In order to automatically collect package dependencies, I have created
a small systemtap script to show files opened for reading:
http://sourceforge.net/p/kaarpux/code/ci/be342bf5667253421f562b7bc29bab8e0a2560aa/tree/master/chroot_scripts/kx_open.stp

This script is basically a probe on kernel.function("do_filp_open").return

The script is compiled with
http://sourceforge.net/p/kaarpux/code/ci/be342bf5667253421f562b7bc29bab8e0a2560aa/tree/master/chroot_scripts/install_kx_open_stp.sh

The script is executed with the functions in
http://sourceforge.net/p/kaarpux/code/ci/be342bf5667253421f562b7bc29bab8e0a2560aa/tree/master/shinc/linux_functions.shinc

So, basically, it is a "staprun -o $PIPE -c script_to_build_package"
writing into a $PIPE created previously.
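
As a rough sketch of that pattern (this is not the actual KaarPux code from
linux_functions.shinc; the function name capture_via_fifo and the variable
KX_PIPE are made up for illustration), the FIFO is created first, a reader is
started on it, and only then is the traced command run:

```shell
#!/bin/sh
# Illustrative sketch only: capture_via_fifo and KX_PIPE are hypothetical
# names, not taken from the KaarPux scripts.

# capture_via_fifo LOGFILE CMD [ARGS...]
# Runs CMD with KX_PIPE pointing at a fresh FIFO and copies everything
# written to that FIFO into LOGFILE. Returns the exit status of CMD.
# Note: CMD must open the FIFO for writing at least once, otherwise the
# reader never sees end-of-file and the final wait blocks.
capture_via_fifo() {
    log=$1
    shift
    dir=$(mktemp -d) || return 1
    KX_PIPE="$dir/probe.fifo"
    export KX_PIPE
    mkfifo "$KX_PIPE" || { rm -rf "$dir"; return 1; }

    # Background reader: blocks until a writer opens the FIFO and
    # finishes when the last writer closes it.
    cat "$KX_PIPE" > "$log" &
    reader=$!

    # Run the traced command; it is expected to write to "$KX_PIPE".
    "$@"
    status=$?

    # Let the reader drain the FIFO before cleaning up.
    wait "$reader"
    rm -rf "$dir"
    return "$status"
}
```

With staprun this could be called roughly as follows, where kx_open.ko is
only an assumed name for the compiled module:

capture_via_fifo opens.log sh -c 'staprun -o "$KX_PIPE" -c ./script_to_build_package kx_open.ko'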

If I try to build all the 600+ packages with this probe enabled,
it ALMOST works.

For most of the 600+ packages, building is successful, and the probe
returns what seem to be reasonable results.

However, for a few packages, building fails:
- firefox
- thunderbird
- libreoffice
- ghc-binary
- ghc

I am a bit puzzled.
If I had made some stupid beginner's mistake, I would have expected all,
most, or at least a significant number of package builds to fail.
But only those 5 out of 600+ fail...

I have experienced similar problems for the last 6 to 12 months with
different kernel versions, systemtap versions, qemu versions, and KaarPux versions.
So it does not seem to be a glitch with the current version combination.

BTW, I also experienced similar problems with an earlier script:
http://sourceforge.net/p/kaarpux/code/ci/e80f14f67fc7688a4d85661befb2b96a565b206a/tree/master/chroot_scripts/kx_open.stp

I never bothered to debug it back then, but now I have tried to dig further...

Currently I have:
linux: 3.9.3
systemtap: 2.2.1
firefox: 21.0
thunderbird: 17.0.6
ghc-binary: 7.4.1
ghc: 7.4.1

Host: i7-3970X on P9X79 WS
Virtual Machine: qemu kvm 1.5.0

When building firefox with and without systemtap,
I get 36000+ identical lines in the log (except for some build identifiers),
then with systemtap:

---------- [BEGIN] ----------

Executing: c++ -o plugin-container -Wall -Wpointer-arith -Woverloaded-virtual -Werror=return-type -Wtype-limits -Wempty-body -Wno-invalid-offsetof -Wcast-align -fno-exceptions -fno-strict-aliasing -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -std=gnu++0x -pthread -pipe -DNDEBUG -DTRIMMED -g -Os -freorder-blocks -fomit-frame-pointer /home/kaarpux/kaarpux/linux/build/opt/firefox-21.0/mozilla-release/obj-x86_64-unknown-linux-gnu/ipc/app/tmpgyKSzm.list -lpthread -Wl,-z,noexecstack -Wl,--build-id -Wl,-rpath-link,/home/kaarpux/kaarpux/linux/build/opt/firefox-21.0/mozilla-release/obj-x86_64-unknown-linux-gnu/dist/bin -Wl,-rpath-link,/opt/kaarpux/firefox-21.0/lib -L../../dist/bin -L../../dist/lib -ldl -L/home/kaarpux/kaarpux/linux/build/opt/firefox-21.0/mozilla-release/obj-x86_64-unknown-linux-gnu/dist/bin -lxpcom -lmozalloc -lxul -L//lib -lplds4 -lplc4 -lnspr4 -lpthread -ldl -Wl,--whole-archive ../../dist/lib/libmozglue.a ../../dist/lib/libmemory.a -Wl,--no-whole-archive -rdynamic -ldl
/home/kaarpux/kaarpux/linux/build/opt/firefox-21.0/mozilla-release/obj-x86_64-unknown-linux-gnu/ipc/app/tmpgyKSzm.list:
    INPUT("MozillaRuntimeMain.o")

/bin/ld: warning: libhunspell-1.3.so.0, needed by ../../dist/bin/libxul.so, not found (try using -rpath or -rpath-link)
../../dist/bin/libxul.so: undefined reference to `Hunspell::spell(char const*, int*, char**)'
../../dist/bin/libxul.so: undefined reference to `Hunspell::Hunspell(char const*, char const*, char const*)'
../../dist/bin/libxul.so: undefined reference to `Hunspell::suggest(char***, char const*)'
../../dist/bin/libxul.so: undefined reference to `Hunspell::get_dic_encoding()'
../../dist/bin/libxul.so: undefined reference to `Hunspell::~Hunspell()'
collect2: error: ld returned 1 exit status

---------- [END] ----------

But libhunspell-1.3.so.0 IS indeed there.
If I retry building WITH systemtap, I get the same result again and again.
Then if I rebuild WITHOUT systemtap, everything is fine.

For thunderbird, the experience is similar.

For ghc-binary I get

---------- [BEGIN] ----------

configure GHC_BINARY

checking for path to top of build tree... utils/ghc-pwd/dist-install/build/tmp/ghc-pwd: error while loading shared libraries: libgmp.so.3: cannot open shared object file: No such file or directory
configure: error: cannot determine current directory
Warning: child process exited with status 1

---------- [END] ----------

I was thinking this might have something to do with symbolic links, as
http://sourceforge.net/p/kaarpux/code/ci/be342bf5667253421f562b7bc29bab8e0a2560aa/tree/master/packages/g/ghc-binary.yaml
creates two symlinks before configure.

However, again:
If I retry building WITH systemtap, I get the same result again and again.
Then if I rebuild WITHOUT systemtap, everything is fine.

(and there must be many packages with double symlinks anyway...)

For ghc I get

---------- [BEGIN] ----------

configure: WARNING: unrecognized options: --disable-dependency-tracking
checking for gfind... no
checking for find... /bin/find
checking for sort... /bin/sort
checking version of ghc... unknown
configure: error: Cannot determine the version of /home/kaarpux/kaarpux/linux/build/opt/ghc-binary-7.4.1/bin/ghc. Is it really GHC?
Warning: child process exited with status 1

---------- [END] ----------

And again:
If I retry building WITH systemtap, I get the same result again and again.
Then if I rebuild WITHOUT systemtap, everything is fine.


So, now I am stuck.

I could understand if the output of my probe were not what I expected.
Fine.
But how could a simple probe like this make building a package fail ???

Any input, help or comments would be most appreciated.

/Henrik

