This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

possible problems with gprof clocktick resolution when analyzing the program/library time ratio using gprof


I have a problem with the total time reported by gmon which I think
may be related to the resolution of gmon.out (10ms).  The case is as
below:

I am undertaking a project which involves implementing a parallel
program with different underlying parallel libraries and in each
implementation, I would like to see how much time goes to the program
itself and how much time goes to the library.

Then I compile both the library with
-g -pg
and I static link when I compile the library as well as linking the
client program using -static

This causes each process to generate its own gmon.out file.  I have
arranged the each process to chdir to its own directory at the
beginning of the run so that at the end, they will be able to write
their own gmon.out file without overwriting each others' file.

I made a flat file output by:

gprof -b -p <program> gmon.out

Then I ran nm on all object files of the client program to extract the
list of client functions, which I use to identify the client programs'
function and I add up the self-time of these functions as the
program's time.  I also add up the self-time of the remaining
functions as other_time.

However the mystery is that the total self-time of all functions are
10 times the actual time my client program took!.  For example, when I
run with 32 nodes, it took 8 seconds.  However gprof says that 30
seconds were spent by the program and 40s spent on elsewhere!

Then when I look at the output of gmon flat profile, I saw some
function being called in the order of 100k times within 8s, that each
call only take very very little time.  So do you think it is the
resolution problem which makes gprof overestimate the call duration of
each call, thus overestimate time taken?  OR can you think of any
other explanations?

Will the ratio calculated be reliable? How can I fix this problem?

Alternatively, do you think there are some problems in multiplication
of time therefore the reported time is always 10x the actual time?

Here is the output of the gmon flat profile:

Each sample counts as 0.01 seconds.
 %   cumulative   self              self     total
time   seconds   seconds    calls   s/call   s/call  name
16.98     14.83    14.83   102672     0.00     0.00  prims_in_box2
14.34     27.35    12.52                             select
13.44     39.08    11.73                             _IO_vfscanf
 8.40     46.41      7.33                             __strtod_internal
 5.26     51.00     4.59                             mprotect
 4.87     55.25     4.25                             sendmsg
 3.79     58.55     3.31   257767     0.00     0.00  VecMatMult
 2.74     60.94     2.39                             _int_malloc
 2.05     62.73     1.79      165     0.01     0.02  Vdc_arrange_view_diff
 1.53     64.07     1.34   132414     0.00     0.00  PolyPeIntersect
 1.43     65.32     1.25                             str_to_mpn
 1.31     66.46     1.14                             recv
 1.13     67.45     0.99    16122     0.00     0.00  Vdc_diff_apply
 1.07     68.38     0.93    92396     0.00     0.00  TriElementBoundBox
 0.93     69.19     0.81    51336     0.00     0.00  subdiv_bintree
 0.92     69.99     0.80    27677     0.00     0.00  push_down_grid
 0.73     70.63     0.64                             __mpn_lshift
 0.70     71.24     0.62                             strlen
 0.69     71.84     0.60     7404     0.00     0.00  TriTransform
 0.68     72.44     0.60                             sigprocmask
 0.66     73.02     0.58                             __strtol_internal
 0.65     73.59     0.57    51750     0.00     0.00  bintree_lookup
 0.62     74.13     0.54                             round_and_return
 0.61     74.66     0.53                             _IO_sputbackc
 0.52     75.11     0.45                             read
 0.48     75.53     0.42    22018     0.00     0.00  sigio_handler
 0.48     75.95      0.42     7404     0.00     0.00  TriDataNormalize
 0.47     76.36     0.41   121097     0.00     0.00  step_grid
 0.46     76.76     0.40                             __profile_frequency
 0.44     77.14     0.38                             __mpn_mul
 0.42     77.51     0.37        1     0.37     1.10  CloseFrameBuffer
 0.42     77.88     0.37   204671     0.00     0.00  ObjectMalloc
 0.41     78.24     0.36                             __mpn_construct_double
 0.41     78.59     0.36     7404     0.00     0.00  TriRead
 0.40     78.94     0.35   123750     0.00     0.00  lookup_emptycells
 0.39     79.28     0.34   108351     0.00     0.00  VecNorm
 0.37     79.60      0.32                             sqrt
 0.33     79.89     0.29      256     0.00     0.00  RunLengthEncode
 0.33     80.18     0.29                             gettimeofday
 0.32     80.46     0.28                             hasmntopt
 0.32     80.74     0.28                             __mpn_mul_1
 0.30     81.00     0.26     8147     0.00     0.00  init_ray
 0.27     81.24     0.24                             vfprintf
 0.26     81.47      0.23    54052     0.00     0.00  NormalizeBoundBox
 0.25     81.69     0.22                             __mpn_divrem
 0.25     81.91     0.22                             send
 0.25     82.13     0.22     7404     0.00     0.00  TriBoundBox
 0.25     82.34     0.22                             malloc
 0.23     82.54     0.20    54941     0.00     0.00  lookup_hashtable
 0.23     82.74     0.20        1     0.20     6.65  ReadGeoFile
 0.22     82.94     0.20                             memcpy
 0.19     83.11     0.17                             __errno_location
 0.19     83.27     0.17    40485     0.00     0.00  next_nonempty_voxel
 0.18     83.43     0.16   250449     0.00     0.00  LocalMalloc
 0.16     83.57     0.14    16303     0.00     0.00  IntersectHuniformPrimlist
 0.15     83.70     0.13      414     0.00     0.04  create_bintree
 0.15     83.83     0.13        1     0.13     0.13  OpenFrameBuffer
 0.14     83.95     0.12                             flockfile
 0.14     84.07     0.12    33953     0.00     0.00  next_nonempty_leaf
 0.13      84.18     0.11      414     0.00     0.00  deleteBinTree
 0.13     84.29     0.11                             fscanf
 0.13     84.40     0.11                             isnan
 0.13     84.51     0.11                             sYSMALLOc
 0.11     84.61     0.10     7135     0.00     0.00  TriPeIntersect
 0.11     84.71     0.10      375     0.00     0.00  Vdc_create_diff
 0.11     84.81     0.10        1     0.10    17.17  create_grid
 0.10      84.90     0.09        1     0.09     0.09  Vdc_page_init
 0.09     84.98     0.08    31286     0.00     0.00  insert_in_hashtable
 0.09     85.06     0.08     5502     0.00     0.00  Vdc_acquire_view
 0.09      85.14     0.08        1     0.08     2.75  NormalizeGeo
 0.08     85.21     0.07        1     0.07     0.07  MakeElementArray
 0.08     85.28     0.07                             strtof
 0.08     85.35     0.07     5233     0.00     0.00  AddPixelColor
 0.07     85.42     0.07   102762     0.00     0.00  next_voxel
 0.07     85.48     0.06     4098     0.00     0.00  shm_resize
 0.07     85.54     0.06     1457     0.00     0.00  Shade
 0.07     85.60     0.06                             _IO_new_file_xsputn
 0.07     85.66     0.06                             vfscanf
 0.06     85.71     0.05     1024     0.00     0.00  PutJob
 0.06     85.76     0.05        1     0.05     7.08  RayTrace
 0.06     85.81     0.05                             __libc_csu_fini
 0.06     85.86     0.05                             __mpn_rshift
 0.06     85.91     0.05                             vfork
 0.06     85.96     0.05                             write
 0.05     86.00     0.05     5233     0.00     0.00  PushRayTreeStack
 0.05     86.04     0.04                             _IO_default_seekoff
 0.05     86.08     0.04                             __ieee754_fmod
 0.05     86.12     0.04                             __mpn_construct_float
 0.05     86.16     0.04                             brk
 0.05      86.20     0.04                             buffered_vfprintf
 0.04     86.24     0.04     5502     0.00     0.00  Vdc_release_view
 0.04     86.27     0.04                             funlockfile
 0.04     86.31     0.04                             setitimer
 0.04     86.34     0.04    31286     0.00     0.00  mark_nonempty
 0.03     86.37     0.03     5502     0.00     0.00  Vdc_create_view
 0.03     86.40     0.03                             asprintf
 0.03     86.43     0.03                             mmap
 0.03     86.46     0.03                             socket
 0.03     86.49     0.03     3776     0.00     0.00   ConvertPrimRayJobToRayMsg
 0.03     86.52     0.03     2914     0.00     0.00  HuniformShadowIntersect
 0.03     86.55     0.03      352     0.00     0.00  Vdc_merge_diff
 0.03     86.58     0.03                             fmod
 0.03     86.61     0.03     7422     0.00     0.00  Vdc_page_req_handler
 0.03     86.63     0.03                             _IO_default_xsputn
 0.02     86.65     0.02     9615     0.00     0.00  DEBUG
 0.02     86.67     0.02     9615     0.00     0.00  DebugIsEnabled
 0.02     86.69     0.02     5233     0.00     0.00  TraverseHierarchyUniform
 0.02     86.71     0.02      377     0.00     0.00  diff_initial
 0.02     86.73     0.02        1     0.02     0.17  InitWorkPool
 0.02     86.75     0.02                             LocalFree
 0.02     86.77     0.02                             Rotate
 0.02     86.79     0.02                             Vdc_simple_ack
 0.02     86.81     0.02                             __find_specmb
 0.02     86.83     0.02                             pipe
 0.02     86.85     0.02                             read_int
 0.02     86.87     0.02                             strcmp
 0.02     86.89     0.02                             mremap_chunk
 0.02     86.90     0.02                             perror
 0.02     86.92     0.02                             strerror_r
 0.02     86.93     0.02      375     0.00     0.00  Vdc_merge_dirty_page
 0.02     86.95     0.02       31     0.00     0.00  Vdc_accept
 0.02     86.96     0.02                             pow
 0.01     86.97     0.01    35824     0.00     0.00  ma_rayinfo
 0.01     86.98     0.01    20464     0.00     0.00  mark_empty
 0.01     86.99     0.01    18335     0.00     0.00  free_rayinfo
 0.01     87.00     0.01    17756     0.00     0.00  Vdc_buff_alloc
 0.01     87.01     0.01    10466     0.00     0.00  CopyRayMsg
 0.01     87.02     0.01     7552     0.00     0.00  frand
 0.01     87.03     0.01     4098     0.00     0.00  Vdc_malloc
 0.01     87.04     0.01     4098     0.00     0.00  shm_malloc
 0.01     87.05     0.01     1457     0.00     0.00  SpecularDirection
 0.01     87.06     0.01      450     0.00     0.00   PolyElementBoundBox
 0.01     87.07     0.01      375     0.00     0.00  Vdc_make_twin
 0.01     87.08     0.01      238     0.00     0.00  GetJob
 0.01     87.09     0.01      237     0.00     0.00  GetJobs
 0.01     87.10     0.01      225     0.00     0.00  PolyDataNormalize
 0.01     87.11     0.01      225     0.00     0.00  PolyTransform
 0.01     87.12     0.01      143     0.00     0.00  Vdc_gather_view_diff
 0.01     87.13     0.01       62     0.00     0.00  Vdc_barrier_req_handler
 0.01     87.14     0.01        2     0.01     0.16  Vdc_barrier
 0.01     87.15     0.01        1     0.01     0.01  DebugInit
 0.01     87.16     0.01                             _IO_file_close
 0.01     87.17     0.01                             _IO_file_read
 0.01     87.18     0.01                             __ieee754_pow
 0.01     87.19     0.01                             __new_exitfn
 0.01     87.20     0.01                             __res_vinit
 0.01     87.21     0.01                             __uflow
 0.01     87.22     0.01                             _dl_start
 0.01     87.23     0.01                             mempcpy
 0.01     87.24     0.01                             munmap
 0.01     87.25     0.01                             open
 0.01     87.26     0.01                             recvfrom
 0.01     87.27     0.01        1     0.01     0.01  InitRayTreeStack
 0.01     87.27     0.01        1     0.01    32.82  main
 0.01     87.28     0.01                             TriIntersect
 0.01     87.28     0.01                             Vdc_view_req_dup_handler
 0.01     87.29     0.01                             _IO_default_uflow
 0.01     87.29     0.01                             finite
 0.01     87.30     0.01                             sigaction
 0.01     87.30     0.01                             strcasecmp
 0.00     87.30     0.00    18335     0.00     0.00  pop_up_a_grid
 0.00     87.30     0.00    16294     0.00     0.00  reset_rayinfo
 0.00     87.30     0.00     9009     0.00     0.00  PopRayTreeStack
 0.00     87.30     0.00     4012     0.00     0.00  GetRayJobFromBundle
 0.00     87.30     0.00     3776     0.00     0.00  TransformViewRay
 0.00     87.30     0.00     1457     0.00     0.00  PolyNormal
 0.00     87.30     0.00      567     0.00     0.00  segv_handler
 0.00     87.30     0.00      414     0.00     0.00  init_bintree
 0.00     87.30     0.00      375     0.00     0.00  Vdc_free_twin
 0.00     87.30     0.00      375     0.00     0.00  get_view_ptr
 0.00     87.30     0.00      374     0.00     0.00  Vdc_insert_diff
 0.00     87.30     0.00      225     0.00     0.00  PolyBoundBox
 0.00     87.30     0.00      225     0.00     0.00  PolyRead
 0.00     87.30     0.00      204     0.00     0.00  Vdc_view_req_handler
 0.00      87.30     0.00      177     0.00     0.00  alloc_view_buffer
 0.00     87.30     0.00       73     0.00     0.00  buff_expand
 0.00     87.30     0.00       73     0.00     0.00  buff_initialize
 0.00     87.30     0.00       62     0.00     0.00  get_socket_fd
 0.00     87.30     0.00       34     0.00     0.00  Vdc_err
 0.00     87.30     0.00       32     0.00     0.01  Vdc_acquire_Rview
 0.00     87.30     0.00       32     0.00     0.00  Vdc_release_Rview
 0.00     87.30     0.00       32     0.00     0.00  add_host
 0.00     87.30     0.00       31     0.00     0.00  Vdc_accept_initialize
 0.00     87.30     0.00        31     0.00     0.00  Vdc_connect
 0.00     87.30     0.00       31     0.00     0.00  get_process_commandline
 0.00     87.30     0.00       31     0.00     0.00  start_process
 0.00     87.30     0.00        31     0.00     0.00  wait_until_remote_started
 0.00     87.30     0.00       19     0.00     0.00  LookupCommand
 0.00     87.30     0.00        6     0.00     0.00  timer_out
 0.00     87.30     0.00         5     0.00     0.00  MatrixIdentity
 0.00     87.30     0.00        5     0.00     0.00  MatrixMult
 0.00     87.30     0.00        5     0.00     0.00  VerifyColorRange
 0.00     87.30     0.00        4     0.00     0.00  MatrixCopy
 0.00     87.30     0.00        2     0.00     0.00  MatrixInverse
 0.00     87.30     0.00        2     0.00     0.00  Translate
 0.00     87.30     0.00        1     0.00    17.24  BuildHierarchy_Uniform
 0.00     87.30     0.00        1     0.00     0.00  CreateViewMatrix
 0.00     87.30     0.00        1     0.00     0.00  Huniform_defaults
 0.00     87.30     0.00        1     0.00     0.00  InitDisplay
 0.00     87.30     0.00        1     0.00     0.00  InitEnv
 0.00     87.30     0.00        1     0.00     0.00  MatrixTranspose
 0.00     87.30     0.00        1     0.00     0.00  PrintStatistics
 0.00     87.30     0.00        1     0.00     0.00  ReadEnvFile
 0.00     87.30     0.00        1     0.00     0.00  Scale
 0.00     87.30     0.00        1     0.00     0.00  Vdc_buff_init
 0.00     87.30     0.00        1     0.00     0.00  Vdc_distribute
 0.00     87.30     0.00        1     0.00     0.00  Vdc_exit
 0.00     87.30     0.00        1     0.00     0.00  Vdc_free
 0.00     87.30     0.00        1     0.00     0.00  Vdc_segv_init
 0.00     87.30     0.00        1     0.00     0.00  Vdc_sigio_init
 0.00     87.30     0.00        1     0.00     0.11  Vdc_startup
 0.00     87.30     0.00        1     0.00     0.00  Vdc_timer_init
 0.00      87.30     0.00        1     0.00     0.00  Vdc_view_init
 0.00     87.30     0.00        1     0.00     0.00  ViewRotate
 0.00     87.30     0.00        1     0.00     0.00  analyze_args
 0.00     87.30     0.00        1     0.00     0.00  display_stat
 0.00     87.30     0.00        1     0.00     0.00  exit_handler
 0.00     87.30     0.00        1     0.00     0.00  init_mask_arrays
 0.00     87.30     0.00        1     0.00     0.00  init_masks
 0.00     87.30     0.00        1     0.00     0.09  init_memory
 0.00     87.30     0.00        1     0.00     0.02  init_network
 0.00     87.30     0.00        1     0.00     0.00  init_world_grid
 0.00     87.30     0.00        1     0.00     0.00  init_world_voxel
 0.00     87.30     0.00        1     0.00     0.00  ma_print
 0.00     87.30     0.00        1     0.00     0.00  powint
 0.00     87.30      0.00        1     0.00     0.00  read_infofile
 0.00     87.30     0.00        1     0.00     0.00  shm_free
 0.00     87.30     0.00        1     0.00     0.00  table_init
 0.00     87.30     0.00        1     0.00     0.00  test_localhost


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]