This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



Re: [remote protocol] support for disabling packet acknowledgement


Sandra asked me to take a stab at explaining the mess we're in.
Some credit also goes to Nathan Sidwell, for hours spent diagramming
this and ramming it through my thick skull.

On Thu, Jul 10, 2008 at 04:13:20PM -0400, Paul Koning wrote:
> Let me see if I understand this right.
> 
> 1. +/- ACKs are fine for the classic (without non-stop) remote
>    protocol.
> 
> 2. ACKs are needed if the underlying transport isn't a reliable 
>    transport (for example a raw UART).  They aren't needed if the
>    underlying transport is TCP or equivalent.
> 
> 3. +/- ACKs are not good enough for non-stop mode.  (It's not clear to
>    me why -- is it because there may need to be more than one packet
>    in flight?  An explanation of what exactly is wrong would be
>    helpful to understand how to fix the issue.)

The current GDB protocol has very simple state.  At any moment, it is
either GDB's turn or the remote's turn to send.  Both sides never
simultaneously think they have the token.  Sometimes neither side
thinks it has the token - either when a message is on the wire, or
else when a message has been lost.  In the lost case, a timeout
normally comes to the rescue.
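
To make the token idea concrete, here is a toy model of the
turn-taking described above.  It is only a sketch: the Side and Link
names, and the rule that the sender reclaims the token on timeout, are
my own illustrative assumptions and are not taken from the GDB
sources.

```python
from enum import Enum


class Side(Enum):
    GDB = "GDB"
    STUB = "stub"


class Link:
    """Toy model of classic turn-taking. All names here are illustrative."""

    def __init__(self):
        self.holder = Side.GDB   # who may send; None means nobody holds the token
        self._receiver = None

    def send(self, sender):
        if self.holder is not sender:
            raise RuntimeError(f"{sender.value} tried to send out of turn")
        self._receiver = Side.STUB if sender is Side.GDB else Side.GDB
        self.holder = None       # message is on the wire (or lost)

    def deliver(self):
        # The message arrived, so the token passes to the receiver.
        self.holder = self._receiver

    def timeout(self, sender):
        # Nothing arrived in time: the sender takes the token back to resend.
        if self.holder is None:
            self.holder = sender
```

At no point can both sides hold the token, and while a message is in
flight holder is None, which is the "neither side" case.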

Non-stop is incompatible with this.  GDB can have the normal protocol
token, for instance if it is about to send a memory read.  At the same
time the debug agent can send packets.  This has to be the case;
otherwise GDB would have to frequently poll for state changes, which
would introduce too much overhead and traffic.

The result of this is that the acks become ambiguous in the presence
of an unreliable or antagonistically delayed transport.  For instance,
suppose GDB sends a memory write, the stub acks it, the stub replies
with OK, and then GDB's ack is delayed.  Existing implementations of
the protocol will resend the OK in this case, assuming the message was
lost - from the stub's side that is indistinguishable from a lost ack.
GDB's long-delayed ack arrives at the stub at the same time the resent
OK arrives at GDB.  GDB must ack again - it doesn't know whether the
first ack ever made it through, and if it doesn't ack now then the
stub might keep resending that OK until it gets through.  So now GDB
sends a second ack.  Simultaneously the stub sends a stop reply
indicating that some other thread has stopped.  When the stub receives
the second ack, it thinks GDB saw the stop reply and does not resend
it.  But GDB hasn't seen it yet, and if it is dropped the conversation
is now out of sync.  GDB will hang around waiting for an event that
the stub believes it has already reported.
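
The sequence of events above can be replayed in a few lines.  This is
a rough sketch, not real stub code: the Stub class, the replay
function and the packet strings are invented for illustration, and I
am assuming the stub credits each bare ack to its oldest outstanding
packet.

```python
class Stub:
    """Hypothetical stub that tracks packets awaiting a bare '+' ack."""

    def __init__(self):
        self.unacked = []

    def send(self, packet):
        self.unacked.append(packet)
        return packet

    def on_ack(self):
        # A bare '+' carries no identity, so the stub can only assume it
        # refers to the oldest packet still outstanding.
        return self.unacked.pop(0) if self.unacked else None


def replay():
    stub = Stub()
    gdb_seen = []

    ok = stub.send("OK")
    gdb_seen.append(ok)          # OK reaches GDB; GDB sends ack #1, which is delayed
    # The stub times out and retransmits OK (still one outstanding entry).
    credited_1 = stub.on_ack()   # delayed ack #1 arrives and is credited to OK
    gdb_seen.append(ok)          # the resent OK reaches GDB; GDB sends ack #2
    stop = stub.send("stop reply (thread 2)")
    credited_2 = stub.on_ack()   # ack #2 arrives and is credited to the stop reply
    # The transport drops the stop reply, so gdb_seen never gets it.
    return credited_1, credited_2, stub.unacked, gdb_seen
```

After replay() the stub has nothing left to retransmit, yet the stop
reply is absent from what GDB has seen.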

There's a clear solution to this: sequence numbers.  There's a
convenient protocol which has them, too...
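
As a sketch of why sequence numbers help: if every ack named the
packet it acknowledges, a stale or duplicate ack could no longer be
credited to the wrong packet.  The SeqStub class below is purely
hypothetical; nothing like it exists in the remote protocol today.

```python
class SeqStub:
    """Hypothetical stub whose packets and acks carry sequence numbers."""

    def __init__(self):
        self.next_seq = 0
        self.unacked = {}        # sequence number -> packet awaiting its ack

    def send(self, packet):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = packet
        return seq, packet

    def on_ack(self, seq):
        # Only the named packet is retired; an ack for a packet that is
        # already retired is simply ignored.
        return self.unacked.pop(seq, None)

    def to_retransmit(self):
        return sorted(self.unacked.items())
```

In the scenario above, the duplicate ack would name the OK packet, so
the stop reply would stay in unacked and be resent.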

> The implication is that the non-stop mode design abandons support for
> non-TCP transports.

No, the design abandons support for non-stop operation on lossy
transports.  I've used plenty of serial and UDP links that were
reliable enough in practice.  If the link level is not reliable
enough, then the implementor still has the option of inserting a more
reliable layer between the transport and the GDB protocol
communication.

> I would argue you need to identify why +/- ACKs aren't good enough,
> and propose a replacement that is good enough.  With that
> replacement you have a way to add the non-stop mode.  If the
> overhead of that replacement is significant in some plausible use
> case, you could then add a way to turn it off for the case where TCP
> is used end to end.

I think that if someone wants to design a more reliable protocol than
the existing one, they are free to do so, and either layer it under
the existing protocol as described above or contribute it to GDB -
we're not leaving anyone out in the cold and a new feature doesn't
have to meet every possible use case in its first incarnation.

This isn't the only problem with the existing protocol in my opinion.
It's pretty crufty, but it gets by.

-- 
Daniel Jacobowitz
CodeSourcery

