This is the mail archive of the
gdb@sources.redhat.com
mailing list for the GDB project.
Re: [RFC] New gdb command 'gcore'
- From: James Cownie <jcownie at etnus dot com>
- To: Michael Snyder <msnyder at cygnus dot com>
- Cc: gdb at sources dot redhat dot com
- Date: Thu, 13 Dec 2001 13:55:53 +0000
- Subject: Re: [RFC] New gdb command 'gcore'
- Reply-to: James Cownie <jcownie at etnus dot com>
> The holy grail, of course, would be to then give gdb the ability to
> restart the process from the core file state. That would give us a
> checkpoint-and-restart capability that very few debuggers have ever
> had. But that's down the line...
Unfortunately in general a normal core file does not contain enough
information to allow a process to be restarted, since it doesn't
contain a lot of the information in the kernel which forms part of the
process' state.
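As a small illustration of state that lives in the kernel rather than
in the process image, here is a minimal sketch (Python, assuming a
POSIX system; the function name is made up for this example). The
file offset belongs to the kernel's open file description, so a
dup()ed descriptor sees the same position:

```python
import os
import tempfile

def offset_is_kernel_state():
    """Return the offsets seen through a descriptor and its duplicate."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, 0, os.SEEK_SET)   # rewind to the start of the file
        os.read(fd, 4)                 # advances the kernel-held offset to 4
        dup_fd = os.dup(fd)            # new descriptor, same open file description
        try:
            # Both descriptors report the same offset because the offset
            # is stored in the kernel, not in the process's own memory.
            return (os.lseek(fd, 0, os.SEEK_CUR),
                    os.lseek(dup_fd, 0, os.SEEK_CUR))
        finally:
            os.close(dup_fd)
    finally:
        os.close(fd)
        os.unlink(path)
```

A memory dump alone would not record that position; it has to be
asked for explicitly (here via lseek()) at checkpoint time.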
There are many "fun" issues which arise when trying to implement
checkpointing, such as
1) Open files. What fds are open ? What's the seek position of each ?
What about pipes ?
2) process id (does it change between the original process and its
reincarnation ?)
3) parent process id (same question).
4) relationship with child processes (if any). Do you checkpoint the
whole process group ?
5) network connections. Can you reconstruct them ? What about the
state of the other end ?
6) time. When the process is reincarnated does it see the time that
passed while it existed only as a checkpoint ?
7) signal handling state. What signal handlers are set up ? What
signals are blocked ?
8) State of any timers. Suppose a thread was in a sleep(); when should
the sleep complete ?
9) State of other potentially long system calls. A listen(), for
instance, or a read from something which isn't ready.
10) All the other things which didn't come to mind in the three minutes
it's taken to type this.
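To make a few of the items above concrete, here is a rough sketch
(Python, assuming a Unix system) of the kind of extra state a
checkpointer would have to record alongside the memory image. The
function snapshot_kernel_state() and its dictionary layout are
invented for this example, and it covers only items 1, 2, 3 and 7;
it says nothing about how to restore any of it.

```python
import os
import signal

def snapshot_kernel_state(fds):
    """Collect a few pieces of per-process state that a core file lacks.

    `fds` is the list of descriptors to inspect; a real checkpointer
    would have to discover them itself.
    """
    state = {
        "pid": os.getpid(),
        "ppid": os.getppid(),
        # Passing an empty set with SIG_BLOCK changes nothing and
        # returns the current signal mask.
        "blocked_signals": sorted(
            int(s) for s in signal.pthread_sigmask(signal.SIG_BLOCK, [])
        ),
        "handlers": {},
        "fd_offsets": {},
    }
    # Only a small sample of signals; this reports the handlers as the
    # Python runtime sees them.
    for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGUSR1):
        state["handlers"][sig.name] = signal.getsignal(sig)
    for fd in fds:
        try:
            state["fd_offsets"][fd] = os.lseek(fd, 0, os.SEEK_CUR)
        except OSError:
            # Pipes and sockets have no seek position to save.
            state["fd_offsets"][fd] = None
    return state
```

Even this short list shows the problem: the pid and ppid can be
recorded but not necessarily reclaimed, and a pipe has no offset
to restore at all.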
Of course it's possible to add restrictions to the state a process
must be in before it can be checkpointed. Unfortunately, if you want
to do the checkpoint from gdb it's going to be hard to know whether
the restrictions are satisfied, since you can arbitrarily invoke gcore
between any two machine instructions.
It's a nice idea, but I think it's hard :-( (and to do it portably is
_very_ hard).
-- Jim
James Cownie <jcownie@etnus.com>
Etnus, LLC. +44 117 9071438
http://www.etnus.com