This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[patch] [gdbserver] Fix multi-GB error log files


Hi,

currently when one runs the (parallel) testsuite there are many gigabytes of
error messages:
	readchar: Got EOF
	Remote side has terminated connection.  GDBserver will reopen the connection.
	Can't bind address: Address already in use.
	Remote side has terminated connection.  GDBserver will reopen the connection.
	Can't bind address: Address already in use.
	[... 1020x]
	Remote side has terminated connection.  GDBserver will reopen the connection.
	Can't open socket: Too many open files.
	Remote side has terminated connection.  GDBserver will reopen the connection.
	Can't open socket: Too many open files.
	[... infinitely till some timeout occurs]

reproducible by:
	window A:
		./gdbserver :1234 ./gdbserver
	window B:
		while :;do ./gdbserver :1234 ./gdbserver; done
	window C:
		telnet localhost 1234
		^]q
	window B:
		-> Listening on port 1234
	window A:
		-> infinite loop

while regression testing it I found these FAILs but they probably happen
randomly and one should fix gdb_assert first anyway before tuning the
testsuite:
	linux-low.c:3584: A problem internal to GDBserver has been detected.
	Assertion `lwp->suspended >= 0' failed.
	FAIL: gdb.mi/mi-nonstop.exp: w0,i0 stop (timeout)
and
	linux-low.c:3584: A problem internal to GDBserver has been detected.
	Assertion `lwp->suspended >= 0' failed.
	=thread-group-exited,id="i1"
	&"Remote connection closed\n"
	~"thread.c:81: internal-error: inferior_thread: Assertion `tp' failed.\nA problem internal to GDB has been detected,\nfurther debugging may prove unreliable.\nQuit this debugging session? "
	~"(y or n) "
	FAIL: gdb.mi/mi-nsintrall.exp: stop 0 (timeout)
with a similar one for:
	FAIL: gdb.mi/mi-nsmoribund.exp: stop 0 (timeout)

There are multiple ways how to solve this problem.

One would be to change the code from
	[patch] testsuite: Fix multiple runs in parallel on a single host [Re: RFC: parallelize "make check"]
	http://sourceware.org/ml/gdb-patches/2009-07/msg00008.html
	commit c470d5dd46806b8e19925f58053923373b4c75d7
	Author: Jan Kratochvil <jan.kratochvil@redhat.com>
	* lib/gdbserver-support.exp (gdbserver_start): Loop spawning
	gdbserver increasing $portnum if "Can't bind address" has been seen.
to just choose random port number instead of trying to consecutively find
a first free port number which leads to races.  Still the problem would be
less racy but still racy (*).

(*) such as gdb.base/break-interp.exp relies now on that the address space
    randomization maps the executable on two consecutive runs on a different
    place.  Sometimes it does not.

This change introduces the listening port to remain open during the default
gdbserver run for users (when "--once" is not used).  This is a change where
the user can mistakenly connect to the "dead" but still listening port.
But I find the assumption that the port will be available when the user will
want to reconnect as a worse one than the dead-but-listening port.

Then gdbserver could also keep the listening port open for each run.
There would not be needed the gdbserver option --once and also no
$gdbserver_reconnect_p would be needed.  I do not mind much but I find the
whole testsuite gdbserver operation as casuing more headache with all the
listening-but-dead ports hanging around.  One should use --once when (s)he is
sure no reconnect will be needed even during regular use IMO.

Also when --once is already introduced it could be made default (that is
introduce an option "--permanent" to keep the current behavior).  But this
changes the behavior of gdbserver which the users may be used to.

Unfortunately it does not seem to be possible to have a port reserved (so that
no other program may bind+listen on it) while giving ECONNREFUSED.  Any
BACKLOG of listen() gives on Linux kernel an effective backlog of at least 3.

No regressions on {x86_64,x86_64-m32,i686}-fedora15-linux-gnu against default
gdbserver mode (**), except the cases described above.  Any possible
USE_WIN32API issues are untested.

(**) In fact against a patched gdbserver with sleep (1) there to avoid the
     multi-gigabyte log files difficult to handle for comparisons, that is
     with the first patch from:
	http://sourceware.org/ml/gdb-patches/2011-03/msg00330.html

I am not going to check it in without an approval.

The hook in gdb_init could be probably moved some way into
gdbserver-support.exp, if there is such a concern.


Thanks,
Jan


gdb/gdbserver/
2011-03-06  Jan Kratochvil  <jan.kratochvil@redhat.com>

	* remote-utils.c (handle_accept_event): Close LISTEN_DESC only if
	RUN_ONCE.  Comment for the LISTEN_DESC delete_file_handler call.
	(remote_prepare): New function with most of the TCP code from ...
	(remote_open): ... here.  Detect PORT here unconditionally.  Move also
	setting transport_is_reliable.
	* server.c (run_once): New variable.
	(gdbserver_usage): Document it.
	(main): Set run_once for `--once'.  Call remote_prepare.  Exit after
	the first run if RUN_ONCE.
	* server.h (run_once, remote_prepare): New declarations.

gdb/testsuite/
2011-03-06  Jan Kratochvil  <jan.kratochvil@redhat.com>

	* gdb.base/solib-disc.exp: Set gdbserver_reconnect_p.
	* lib/gdb.exp (gdb_init): Clear gdbserver_reconnect_p.
	* lib/gdbserver-support.exp (gdbserver_start): Add `--once' if
	!gdbserver_reconnect_p..
	(gdbserver_reconnect): Call error if !gdbserver_reconnect_p..

--- a/gdb/gdbserver/remote-utils.c
+++ b/gdb/gdbserver/remote-utils.c
@@ -170,14 +170,21 @@ handle_accept_event (int err, gdb_client_data client_data)
 	      (char *) &tmp, sizeof (tmp));
 
 #ifndef USE_WIN32API
-  close (listen_desc);		/* No longer need this */
-
   signal (SIGPIPE, SIG_IGN);	/* If we don't do this, then gdbserver simply
 				   exits when the remote side dies.  */
+#endif
+
+  if (run_once)
+    {
+#ifndef USE_WIN32API
+      close (listen_desc);		/* No longer need this */
 #else
-  closesocket (listen_desc);	/* No longer need this */
+      closesocket (listen_desc);	/* No longer need this */
 #endif
+    }
 
+  /* Even if !RUN_ONCE no longer notice new connections.  Still keep the
+     descriptor open for add_file_handler to wait for a new connection.  */
   delete_file_handler (listen_desc);
 
   /* Convert IP address to string.  */
@@ -200,6 +207,62 @@ handle_accept_event (int err, gdb_client_data client_data)
   return 0;
 }
 
+/* Prepare for a later connection to a remote debugger.
+   NAME is the filename used for communication.  */
+
+void
+remote_prepare (char *name)
+{
+  char *port_str;
+#ifdef USE_WIN32API
+  static int winsock_initialized;
+#endif
+  int port;
+  struct sockaddr_in sockaddr;
+  socklen_t tmp;
+  char *port_end;
+
+  port_str = strchr (name, ':');
+  if (port_str == NULL)
+    {
+      transport_is_reliable = 0;
+      return;
+    }
+
+  port = strtoul (port_str + 1, &port_end, 10);
+  if (port_str[1] == '\0' || *port_end != '\0')
+    fatal ("Bad port argument: %s", name);
+
+#ifdef USE_WIN32API
+  if (!winsock_initialized)
+    {
+      WSADATA wsad;
+
+      WSAStartup (MAKEWORD (1, 0), &wsad);
+      winsock_initialized = 1;
+    }
+#endif
+
+  listen_desc = socket (PF_INET, SOCK_STREAM, IPPROTO_TCP);
+  if (listen_desc == -1)
+    perror_with_name ("Can't open socket");
+
+  /* Allow rapid reuse of this port. */
+  tmp = 1;
+  setsockopt (listen_desc, SOL_SOCKET, SO_REUSEADDR, (char *) &tmp,
+	      sizeof (tmp));
+
+  sockaddr.sin_family = PF_INET;
+  sockaddr.sin_port = htons (port);
+  sockaddr.sin_addr.s_addr = INADDR_ANY;
+
+  if (bind (listen_desc, (struct sockaddr *) &sockaddr, sizeof (sockaddr))
+      || listen (listen_desc, 1))
+    perror_with_name ("Can't bind address");
+
+  transport_is_reliable = 1;
+}
+
 /* Open a connection to a remote debugger.
    NAME is the filename used for communication.  */
 
@@ -274,8 +337,6 @@ remote_open (char *name)
 
       fprintf (stderr, "Remote debugging using %s\n", name);
 
-      transport_is_reliable = 0;
-
       enable_async_notification (remote_desc);
 
       /* Register the event loop handler.  */
@@ -284,64 +345,22 @@ remote_open (char *name)
     }
   else
     {
-#ifdef USE_WIN32API
-      static int winsock_initialized;
-#endif
       int port;
+      socklen_t len;
       struct sockaddr_in sockaddr;
-      socklen_t tmp;
-      char *port_end;
 
-      port = strtoul (port_str + 1, &port_end, 10);
-      if (port_str[1] == '\0' || *port_end != '\0')
-	fatal ("Bad port argument: %s", name);
-
-#ifdef USE_WIN32API
-      if (!winsock_initialized)
-	{
-	  WSADATA wsad;
-
-	  WSAStartup (MAKEWORD (1, 0), &wsad);
-	  winsock_initialized = 1;
-	}
-#endif
-
-      listen_desc = socket (PF_INET, SOCK_STREAM, IPPROTO_TCP);
-      if (listen_desc == -1)
-	perror_with_name ("Can't open socket");
-
-      /* Allow rapid reuse of this port. */
-      tmp = 1;
-      setsockopt (listen_desc, SOL_SOCKET, SO_REUSEADDR, (char *) &tmp,
-		  sizeof (tmp));
-
-      sockaddr.sin_family = PF_INET;
-      sockaddr.sin_port = htons (port);
-      sockaddr.sin_addr.s_addr = INADDR_ANY;
-
-      if (bind (listen_desc, (struct sockaddr *) &sockaddr, sizeof (sockaddr))
-	  || listen (listen_desc, 1))
-	perror_with_name ("Can't bind address");
-
-      /* If port is zero, a random port will be selected, and the
-	 fprintf below needs to know what port was selected.  */
-      if (port == 0)
-	{
-	  socklen_t len = sizeof (sockaddr);
-	  if (getsockname (listen_desc,
-			   (struct sockaddr *) &sockaddr, &len) < 0
-	      || len < sizeof (sockaddr))
-	    perror_with_name ("Can't determine port");
-	  port = ntohs (sockaddr.sin_port);
-	}
+      len = sizeof (sockaddr);
+      if (getsockname (listen_desc,
+		       (struct sockaddr *) &sockaddr, &len) < 0
+	  || len < sizeof (sockaddr))
+	perror_with_name ("Can't determine port");
+      port = ntohs (sockaddr.sin_port);
 
       fprintf (stderr, "Listening on port %d\n", port);
       fflush (stderr);
 
       /* Register the event loop handler.  */
       add_file_handler (listen_desc, handle_accept_event, NULL);
-
-      transport_is_reliable = 1;
     }
 }
 
--- a/gdb/gdbserver/server.c
+++ b/gdb/gdbserver/server.c
@@ -40,6 +40,9 @@ static int extended_protocol;
 static int response_needed;
 static int exit_requested;
 
+/* --once: Exit after the first connection closed.  */
+int run_once;
+
 int multi_process;
 int non_stop;
 
@@ -2312,7 +2315,8 @@ gdbserver_usage (FILE *stream)
 	   "  --debug               Enable general debugging output.\n"
 	   "  --remote-debug        Enable remote protocol debugging output.\n"
 	   "  --version             Display version information and exit.\n"
-	   "  --wrapper WRAPPER --  Run WRAPPER to start new programs.\n");
+	   "  --wrapper WRAPPER --  Run WRAPPER to start new programs.\n"
+	   "  --once                Exit after the first connection closed.\n");
   if (REPORT_BUGS_TO[0] && stream == stdout)
     fprintf (stream, "Report bugs to \"%s\".\n", REPORT_BUGS_TO);
 }
@@ -2536,6 +2540,8 @@ main (int argc, char *argv[])
 		}
 	    }
 	}
+      else if (strcmp (*next_arg, "--once") == 0)
+	run_once = 1;
       else
 	{
 	  fprintf (stderr, "Unknown argument: %s\n", *next_arg);
@@ -2648,6 +2654,8 @@ main (int argc, char *argv[])
       exit (1);
     }
 
+  remote_prepare (port);
+
   while (1)
     {
       noack_mode = 0;
@@ -2676,7 +2684,7 @@ main (int argc, char *argv[])
 	 getpkt to fail; close the connection and reopen it at the
 	 top of the loop.  */
 
-      if (exit_requested)
+      if (exit_requested || run_once)
 	{
 	  detach_or_kill_for_exit ();
 	  exit (0);
--- a/gdb/gdbserver/server.h
+++ b/gdb/gdbserver/server.h
@@ -349,6 +349,7 @@ extern int disable_packet_Tthread;
 extern int disable_packet_qC;
 extern int disable_packet_qfThreadInfo;
 
+extern int run_once;
 extern int multi_process;
 extern int non_stop;
 
@@ -400,6 +401,7 @@ int putpkt (char *buf);
 int putpkt_binary (char *buf, int len);
 int putpkt_notif (char *buf);
 int getpkt (char *buf);
+void remote_prepare (char *name);
 void remote_open (char *name);
 void remote_close (void);
 void write_ok (char *buf);
--- a/gdb/testsuite/gdb.base/solib-disc.exp
+++ b/gdb/testsuite/gdb.base/solib-disc.exp
@@ -19,6 +19,7 @@ if {[skip_shlib_tests]} {
     return 0
 }
 
+set gdbserver_reconnect_p 1
 if { [info proc gdb_reconnect] == "" } {
     return 0
 }
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -2799,6 +2799,11 @@ proc gdb_init { args } {
     setenv LC_ALL C
     setenv LANG C
 
+    # Clear $gdbserver_reconnect_p.
+    global gdbserver_reconnect_p
+    set gdbserver_reconnect_p 1
+    unset gdbserver_reconnect_p
+
     return [eval default_gdb_init $args];
 }
 
--- a/gdb/testsuite/lib/gdbserver-support.exp
+++ b/gdb/testsuite/lib/gdbserver-support.exp
@@ -218,10 +218,21 @@ proc gdbserver_start { options arguments } {
 
 	# Fire off the debug agent.
 	set gdbserver_command "$gdbserver"
+
+	# If gdbserver_reconnect will be called $gdbserver_reconnect_p must be
+	# set to true already during gdbserver_start.
+	global gdbserver_reconnect_p
+	if {![info exists gdbserver_reconnect_p] || !$gdbserver_reconnect_p} {
+	    # GDB client could accidentally connect to a stale server.
+	    append gdbserver_command " --once"
+	}
+
 	if { $options != "" } {
 	    append gdbserver_command " $options"
 	}
+
 	append gdbserver_command " :$portnum"
+
 	if { $arguments != "" } {
 	    append gdbserver_command " $arguments"
 	}
@@ -315,6 +326,12 @@ proc gdbserver_reconnect { } {
     global gdbserver_protocol
     global gdbserver_gdbport
 
+    global gdbserver_reconnect_p;
+    if {![info exists gdbserver_reconnect_p] || !$gdbserver_reconnect_p} {
+	error "gdbserver_reconnect_p is not set before gdbserver_reconnect"
+	return 0
+    }
+
     return [gdb_target_cmd $gdbserver_protocol $gdbserver_gdbport]
 }
 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]