This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Fear and loathing in lookup_block_symbol


This is a little long and involved, but there is a big problem in one of 
the lowest level symbol lookup functions, so (someone) please bear with 
it all.

Daniel put together some great C++ symbol lookups last summer
	http://sources.redhat.com/ml/gdb-patches/2000-08/msg00199.html

Real problems - I'd just started digging into #3 on his set of patches 
myself.
Elena looks at the patch a little while longer, raises some issues (like 
why the namespace check is no longer done in lookup_block_symbol - good 
catch Elena):
	http://sources.redhat.com/ml/gdb-patches/2000-09/msg00237.html

There is some back and forth discussion.  Elena commits some of the 
changes (all of them?  I haven't read all the archives that thoroughly, 
and it's not relevant)
	http://sources.redhat.com/ml/gdb-cvs/2000-10/msg00014.html

Incidentally, Daniel (tisk tisk :) had assured her that the namespace 
check wasn't necessary before the commit happened.  There is some more 
back-and-forth on these patches -- Daniel wasn't especially thorough in 
updating all callers of lookup_block_symbol, but Peter Schauer jumps in 
and fixes several of these.  Daniel says he'll just yank the whole lot 
of them and submits a patch to revert it all, but I don't think that 
ever happened (again, my goal isn't accurate historical 
representation -- there's a bug to be fixed here. :-)

Side bar - we see gdb/15 pop up and Michael Chastain chase it down
	http://sources.redhat.com/ml/gdb-patches/2001-01/msg00353.html
It's because the namespace check is no longer being done.  Michael is 
mystified
as to how this happened.

OK OK, I'll get to the real bug.  The point is that these patches have a 
long and seedy history. :-)  And at the same time, I want to throw out 
props to Daniel -- despite everything, he is getting at real, BIG, gdb 
inefficiencies, and he's trying to fix them.  A little more follow-up 
and completeness would have been good, that's all.

In the patch that Elena committed (URL above), which makes C++ symbol 
lookup use the binary search method has a mistake.  lookup_block_symbol 
has two search methods.  One is a binary search; it gets "close" to the 
symbol (alphabetically 'below' the correct entry) and then searches over 
the next few symbols.  The other method is a carefully written linear 
search.  In the binary search method, this linear-stepping second half 
is necessary because we might have multiple symbol names in different 
namespaces (e.g. function vs struct namespaces).

Daniel's patch inadvertently changes this second half of the binary 
search method -- instead of stepping over a few symbols, gdb now binary 
searches to the middle of the block symbols, then linearly steps over 
all symbols until it gets to the end of the block.  This is very 
expensive.  Specifically, I'm talking about this patch here:

@@ -1247,19 +1282,8 @@ lookup_block_symbol (register const stru
        while (bot < top)
         {
           sym = BLOCK_SYM (block, bot);
-         inc = SYMBOL_NAME (sym)[0] - name[0];
-         if (inc == 0)
-           {
-             inc = STRCMP (SYMBOL_NAME (sym), name);
-           }
-         if (inc == 0 && SYMBOL_NAMESPACE (sym) == namespace)
-           {
-             return (sym);
-           }
-         if (inc > 0)
-           {
-             break;
-           }
+         if (SYMBOL_MATCHES_NAME (sym, name))
+           return sym;
           bot++;
         }
      }

The old code is straightforward - the binary search has put us at the 
beginning of matching symbols (the 'bot' index). If the symbol names 
match, and the namespaces match, we return the sym.  If we've stepped 
beyond the block symbols that could match our symbol name, we break out 
of the loop.  Daniel's change loses this critical "break out of the 
loop" part, and iterates over all the block symbols from the 
binary-search start point until the end.

We can't copy the old code's style exactly.  SYMBOL_MATCHES_NAME calls 
strcmp_iw() on the demangled names (to ignore whitespace), but strcmp_iw 
is poorly named; it doesn't give you a less-than greater-than return 
value like strcmp().  It'd be better termed streq_iw() or something 
unlike strcmp().  But we can make vast improvements without digging into 
this problem by changing the code to something like:

       while (bot < top)
         {
           sym = BLOCK_SYM (block, bot);
           if (SYMBOL_NAMESPACE (sym) == namespace &&
               SYMBOL_MATCHES_NAME (sym, name))
             {
               return sym;
             }
           if (SYMBOL_SOURCE_NAME (sym)[0] > name[0])
             {
               break;
             }
           bot++;
         }

The search is not optimal (we'll probably end up stepping over more 
symbols than is necessary), but it's a huge improvement over the current 
one.

A specific example:  One of our demo programs that we ship with the 
development tools is a little notepad GUI app.  It has these opaque 
structure pointers that I mentioned last week, which gdb will never find 
a definition of.  These opaque structure pointers are often in programs 
as local variables, so stepping with a Locals window open highlights any 
delay here.  With the old iterate-over-half-the-symbols approach, "maint 
time 1" reported 1.6 - 1.7 seconds to look grope through all the symbols 
looking for this structure.  With this break-out clause added, "maint 
time 1" reports 0.0 seconds (it's below the measurement threshold).    
(and before that second call to check_typedef() was removed, doing "info 
locals" would take around 3.5 seconds.)

If someone will agree that this approach is reasonable, I'll check in a 
patch.  I've already run the above patch through the testsuite, it adds 
no new regressions.

Jason

PS-  This patch was brought to you by Tom Tromey's gdb-profile patch.  
Yes, gdb-profile, it does the things a profiling patch ought to do.  I 
fired up his patch on gdb, found this naughty little 
lookup_block_symbol() function in about one minute, and started digging 
in.  Very helpful.  Don't you wish YOU could use Tom Tromey's 
gdb-profile patch?


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]