This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFA [PATCH] Implement 'catch syscall' for gdbserver


On Thursday, September 19 2013, Philippe Waroquiers wrote:

> On Thu, 2013-09-19 at 02:29 -0300, Sergio Durigan Junior wrote:

>> > 1. GDB native has a bug in the detection  of "syscall entry/return",
>> >    when the catch syscall is disabled when inferior is stopped
>> >    on a syscall entry, and then catch syscall is re-enabled later:
>> Hm, I remember seeing and trying to fix this bug a long time ago.  Not
>> sure if I managed to do it or not (guess not, since I didn't send the
>> patch, but I remember having something close to the fix).  My initial
>> idea was to record the event (syscall entry or return) inside GDB and
>> make it decide based on it.
>> 
>> Anyway, I agree it's a bug and that it deserves a bug report :-).  Could
>> you please file it?
> I stared at the problem when I detected it,
> but could not see any easy fix which would work properly
> in all circumstances e.g. enable/disable catch at various
> moments, including interrupting a process blocked in a syscall,
> enable catch syscall and have the syscall restarted
> (did not check the behaviour of ptrace, I guess we will directly
> have a syscall return event, and so GDB has to guess this is a return
> without having seen the entry).

Yes, I remember it wasn't an easy problem to solve, indeed.

> I will file a bug report (will analyse first the various problematic
> cases and maybe do a test to reproduce them).

Thanks for that, much appreciated.  Please add me in the Cc list of the
bug if you remember.

>
>> > RCS file: /cvs/src/src/gdb/remote.c,v
>> > +/* If 'QCatchSyscalls' is supported, tell the remote stub
>> > +   to report syscalls to GDB.  */
>> 
>> I generally prefer to explicitly say which method this function is
>> implementing.  Like, in this case, mention that it is implementing
>> "target_set_syscall_catchpoint".  It makes it easier to find the
>> definitions of each argument.
> No problem to do that, however it looks like all other functions
> in the same file are not commented like that.
> So, doing this would make this new function commented differently
> of all others.
> Maybe better to keep it consistent ?
> (and in another patch add for all functions a
>   "implements target_xxxxxxxx") ?

Yes, this is a recorrent problem in GDB.  What I prefer to do is to
ignore the wrong functions and make my functions in "The Right Way", but
I do realize it's a battle between consistency and correctness, so I
will not insist on it.

>> > +
>> > +static int
>> > +remote_set_syscall_catchpoint (int pid, int needed, int any_count,
>> > +			       int table_size, int *table)
>> > +{
>> > +  if (remote_protocol_packets[PACKET_QCatchSyscalls].support != PACKET_DISABLE)
>> > +    {
>> > +      char *catch_packet, *p;
>> > +      enum packet_result result;
>> > +      int n_sysno = 0;
>> > +
>> > +      if (needed && !any_count)
>> > +	{
>> > +	  int i;
>> > +
>> > +	  for (i = 0; i < table_size; i++)
>> > +	    if (table[i])
>> > +	      n_sysno++;
>> 
>> When would table[i] be 0 in this case?  I guess you could use table_size
>> directly without worrying about n_sysno.
> No, the 0 condition is to be checked.
> See following comment in target.h: 
>    TABLE is an array of ints, indexed by syscall number.  An element in
>    this array is nonzero if that syscall should be caught.  This argument
>    only matters if ANY_COUNT is zero.
>
> I added a comment:
>       /* Count how many syscalls are to be caught (table[sysno] != 0).  */

You are right, of course.  Sorry about the confusion, this interface
messed with my brain.

>> > +	}
>> > +
>> > +      if (remote_debug)
>> > +	fprintf_unfiltered (gdb_stdlog,
>> > +			    "remote_set_syscall_catchpoint "
>> > +			    "pid %d needed %d any_count %d n_sysno %d\n",
>> > +			    pid, needed, any_count, n_sysno);
>> > +      if (needed)
>> > +	{
>> > +	  /* Prepare a packet with the sysno list, assuming
>> > +	     max 8+1 characters for a sysno. If the resulting
>> > +	     packet size is too big, fallback on the non
>> > +	     selective packet.  */
>> > +	  const char *q1 = "QCatchSyscalls:1";
>> 
>> Really nitpicking but you could avoid calling strlen for q1 and replace
>> it by a size_t variable holding the sizeof (q1) - 1.  But this is just
>> a minor thing.
>
> I cannot use sizeof(q1) but I could possibly do
>    const size_t q1len = strlen (q1);
> and use q1len after ?
> (that would however only spare a (very unlikely) second call
> to strlen, probably optimised by the compiler in any case).

As I said, just nitpicking :-).  I prefer to use sizeof and store the
result in a variable, but that's a matter of style.  I guess it's fine
as is.

>
>> > +	      p = catch_packet;
>> > +	      p += strlen(p);
>> > +	      for (i = 0; i < table_size; i++)
>> > +		{
>> > +		  if (table[i])
>> 
>> Again, I don't see the reason of this check.
> See above.
> Added a comment:
>        /* Add in catch_packet each syscall to be caught (table[i] != 0).  */

Thanks again.

>> 
>> > +		    {
>> > +		      xsnprintf(p, catch_packet + maxpktsz - p,
>> > +				";%x", i);
>> > +		      p += strlen(p);
>> > +		    }
>> > +		}
>> > +	    }
>> > +	  if (strlen(catch_packet) > get_remote_packet_size())
>> > +	    {
>> > +	      /* catch_packet too big. Fallback to less efficient
>> > +		 non selective mode, with GDB doing the filtering.  */
>> > +	      catch_packet[strlen (q1)] = 0;
>> > +	    }
>> 
>> Hm, can't you do this check before start filling catch_packet?  This
>> would save some time.
> Not easy to do (and saving probably very little cpu time in very
> unfrequent cases):
> The size of the packet depends on which syscalls
> are added, and not only on the nr of syscalls to add in the packet.
> E.g. a sysno can give either 2 or 3 or 4 characters (if we have
> sysno only below 999). 

OK, nevermind then.

>> > -  /* Check if the target supports PTRACE_O_TRACESYSGOOD.  */
>> > +  /* Check if the target supports PTRACE_O_TRACESYSGOOD, keeping
>> > +     PTRACE_O_TRACEFORK option activated.  */
>> >    ret = ptrace (PTRACE_SETOPTIONS, child_pid, (PTRACE_TYPE_ARG3) 0,
>> > -		(PTRACE_TYPE_ARG4) PTRACE_O_TRACESYSGOOD);
>> > +		(PTRACE_TYPE_ARG4) (PTRACE_O_TRACEFORK
>> > +				    | PTRACE_O_TRACESYSGOOD));
>> 
>> Any reason why you are keeping PTRACE_O_TRACEFORK set?
> Changed the comment to:
>   /* Check if the target supports PTRACE_O_TRACESYSGOOD.
>      We keep PTRACE_O_TRACEFORK option activated as a fork
>      event notification is expected after my_waitpid below.  */

Oh, OK, now it makes sense, I guess I stopped reading the code here.
Thanks for the comment.

>> > RCS file: /cvs/src/src/gdb/common/linux-ptrace.c,v
>> >    if (ret == 0)
>> >      current_ptrace_options |= PTRACE_O_TRACESYSGOOD;
>> >  
>> > +#ifdef GDBSERVER
>> > +  /* gdbserver does not support PTRACE_O_TRACEVFORKDONE yet.  */
>> > +#else
>> >    /* Check if the target supports PTRACE_O_TRACEVFORKDONE.  */
>> >    ret = ptrace (PTRACE_SETOPTIONS, child_pid, (PTRACE_TYPE_ARG3) 0,
>> >  		(PTRACE_TYPE_ARG4) (PTRACE_O_TRACEFORK
>> > @@ -474,6 +475,11 @@ linux_enable_event_reporting (pid_t pid)
>> >  static int
>> >  ptrace_supports_feature (int ptrace_options)
>> >  {
>> > +  /* Check if we have initialized the ptrace features for this
>> > +     target.  If not, do it now.  */
>> > +  if (current_ptrace_options == -1)
>> > +    linux_check_ptrace_features ();
>> > +
>> 
>> This seems correct to me, but I think you should send it in a separate
>> patch, because it is not specific to catch syscall IMO.
> It looks better to me to keep it in this patch, as it was not needed
> before, but is now needed due to the "early calls" to
> ptrace_supports_feature needed for catch syscalls in gdbserver.

Well, it just seemed to me that doing this check and calling
linux_check_ptrace_features is the right thing to do, with or without
catch syscall.  I.e., it seems that someone forgot to make this check
here, and you're fixing this.  Anyway, leave it as is, let's wait until
someone more experienced comments about it.

>> > ===================================================================
>> > RCS file: /cvs/src/src/gdb/gdbserver/linux-low.c,v
>
>> > +static void
>> > +get_syscall_trapinfo (struct lwp_info *lwp, int *sysno, int *sysret)
>> > +{
>> > +  struct thread_info *saved_inferior;
>> > +  struct regcache *regcache;
>> > +
>> > +  if (the_low_target.get_syscall_trapinfo == NULL)
>> > +    {
>> > +      *sysno = 0;
>> > +      *sysret = 0;
>> 
>> Aren't you missing a "return" here?
> Yes, added it.
>> 
>> In fact, is there any way for "the_low_target.get_syscall_trapinfo" to
>> be NULL?  It seems to me that this is a necessary condition for this
>> function to be called (see my comment about putting an assert at
>> server.c:handle_general_set).  So maybe you should put an assert here to
>> check that the_low_target.get_syscall_trapinfo != NULL.  WDYT?
> Good question, I am not sure what to do.
> It looks like many  other calls to low level target
> are protected similarly (which is not a good enough reason).
> However, in this case, I suspect that if the GDB users forces
> the use of the QCatchSyscalls packet, that we might end up there:
> GDB will send such a packet, resulting in server.c setting
> catching_syscalls_p to 1, resulting to ptrace "syscall" to be activated
> resulting in a syscall trap event resulting in a call to getsyscall
> trapinfo.
> So, returning sysno 0 looks safe to me, to avoid a GDB user
> causing an assert due to forcing the use of a (normally)
> auto detected packet.

If the user can indeed force GDB to use QCatchSyscalls, then I agree
that the assert is not the right solution.  I.e., things are fine as is.

>> > +    return 0;
>> > +
>> > +  if (catched_syscalls_size == 0)
>> > +    return 1;
>> > +
>> > +  get_syscall_trapinfo (event_child, &sysno, &sysret);
>> > +  for (i = 0; i < catched_syscalls_size; i++)
>> > +    if (catched_syscalls[i] == sysno)
>> > +      return 1;
>> 
>> Also, I think the names of the variables are a bit misleading.
>> "catched_syscalls" does not represent syscalls that were caught, but
>> rather syscalls that are to be caught.  Maybe rename them to
>> "syscalls_to_be_caught" or something?
> Yes.
> Renamed to syscalls_to_catch_size and syscalls_to_catch.
> grep-ped for frenglish 'catched' to ensure I catched all errors :).

Hahaha, I swear I was going to correct you again, but then I realized it
is a joke :-P.

>> > Index: gdbserver/remote-utils.c
>> > ===================================================================
>> > RCS file: /cvs/src/src/gdb/gdbserver/remote-utils.c,v
>> > retrieving revision 1.98
>> > diff -u -p -r1.98 remote-utils.c
>> > --- gdbserver/remote-utils.c	1 Jul 2013 11:19:27 -0000	1.98
>> > +++ gdbserver/remote-utils.c	30 Aug 2013 15:18:16 -0000
>> > @@ -1319,12 +1319,17 @@ prepare_resume_reply (char *buf, ptid_t 
>> >    switch (status->kind)
>> >      {
>> >      case TARGET_WAITKIND_STOPPED:
>> > +    case TARGET_WAITKIND_SYSCALL_ENTRY:
>> > +    case TARGET_WAITKIND_SYSCALL_RETURN:
>> >        {
>> >  	struct thread_info *saved_inferior;
>> >  	const char **regp;
>> >  	struct regcache *regcache;
>> >  
>> > -	sprintf (buf, "T%02x", status->value.sig);
>> > +	if (status->kind == TARGET_WAITKIND_STOPPED)
>> > +	  sprintf (buf, "T%02x", status->value.sig);
>> > +	else
>> > +	  sprintf (buf, "T%02x", SIGTRAP);
>> 
>> Again, maybe it's just me, but would you mind putting a comment here
>> explainig why you're using SIGTRAP directly if status->kind ==
>> TARGET_WAITKIND_SYSCALL_{ENTRY,RETURN} ?
> This is according to the protocol definition:
>         * If N is a recognized "stop reason", it describes a more
>           specific event that stopped the target.  The currently
>           defined stop reasons are listed below.  AA should be `05',
>           the trap signal.  At most one stop reason should be present.

Right, one more thing I learned :-).

>> >  	  {
>> > Index: gdbserver/server.c
>> > ===================================================================
>> > RCS file: /cvs/src/src/gdb/gdbserver/server.c,v
>> > retrieving revision 1.196
>> > diff -u -p -r1.196 server.c
>> > --- gdbserver/server.c	28 Aug 2013 17:40:58 -0000	1.196
>> > +++ gdbserver/server.c	30 Aug 2013 15:18:17 -0000
>> > @@ -71,6 +71,9 @@ int debug_threads;
>> >  int debug_hw_points;
>> >  
>> >  int pass_signals[GDB_SIGNAL_LAST];
>> > +int catch_syscalls_p;
>> > +int catched_syscalls_size;
>> 
>> size_t?  :-)
> I guess an int will be ok unless we reach zillions
> of different syscalls, and then the protocol packet
> decoding code will suffer a lot :).

Fair enough.  I can't avoid suggesting such things, sorry :-).

>> I am not very familiar with the remote protocol, but IMHO you could do
>> an assert here to check if target_supports_catch_syscall returns 1,
>> WDYT?  IOW, we shouldn't get here if it returns 0, right?
>> 
>> Or is there some sort of mechanism intended to overwrite the initial
>> probing of features and force unsupported features to be enabled?  In
>> that case, I guess the assert would trigger every time...  Not sure what
>> to do.
> Yes, I suspect this is true (so has not added asserts).
> It is however not very clear to me why GDB allows to overwrite
> auto-detected packets, as this might effectively give
> unexpected results. But it looks better to not crash gdbserver
> following a "legal" user action.

Agreed.

>> > Index: testsuite/gdb.base/catch-syscall.exp
>> >
>> > +# This shall be updated whenever QCatchSyscalls packet support is implemented
>> > +# on some gdbserver architecture.
>> > +if { [is_remote target]
>> > +     && ![istarget "x86_64-*-linux*"] 
>> > +     && ![istarget "i\[34567\]86-*-linux*"] } {
>> > +    continue
>> > +}
>> 
>> Hm, am I too sleepy or this test does nothing at all?  :-).
> We both suspect we are too sleepy :).
> Looks to me the above is the expected condition, which is to "continue
> somewhere else" (whatever that "continue" exactly means in tcl/dejagnu,
> I have no idea :).
> This condition looks similar to equivalent conditions just before,
> that are skipping this test on non matching target.
> But my tcl/dejagnu knowledge is close to 0, so I might have missed
> something.

Yeah, you're right again :-).  My TCL-fu is not strong, as you might
have noticed.  I just tested here (should have done this before), and
"continue" is indeed just a way to say "stop testing this and go to the
next testcase".  So everything's right.

>> Otherwise, the patch looks pretty good to me!  I have tested it here and
>> as far as I have seen it is working OK.  I hope my comments help you
>> with getting the patch approved.  I am not a maintainer, so you will
>> need to wait for one to review the patch as well :-).
> Thanks for the extensive and helpful comments.
> I will retest and submit another patch version (I guess this WE).
> Any additional comment or feedback welcome in the meantime ...

Thank you.

-- 
Sergio


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]