[lia64-kernel] Re: clone and clone2

Tue Jun 20 13:56:00 GMT 2000

On Tue, Jun 20, 2000 at 01:43:14PM -0700, David Mosberger wrote:
> >>>>> On Tue, 20 Jun 2000 11:08:14 -0700, "H . J . Lu" <hjl@valinux.com> said:
> 
>   HJ> The current ia64 linuxthreads implementation in my ia64 glibc
>   HJ> uses clone2 instead of clone. What do we want to do with it? I'd
>   HJ> like to see a uniform clone interface for linuxthreads across
>   HJ> all cpus.
> 
> Me too.
> 
>   HJ> Any suggestions?
> 
> Here are two possibilities:
> 
> ---------------------
> (1) Obsolete the existing clone and replace it with a version that has
> the following interface:
> 
> 	int clone (unsigned long flags, unsigned long sp1, unsigned long sp2);
> 
> This is like the old clone, except for the addition of a second stack
> pointer.  On IA-64, this second stack pointer would be used to
> indicate the initial value for the register backing store.
> 
> ---------------------
> (2) Obsolete the existing clone and replace it with a version that has
> the following interface:
> 
> 	int clone (unsigned long flags,
> 		   unsigned long stack_base, unsigned long stack_size);
> 
> This is like the old clone, except for the addition of the
> "stack_size" argument.  Unlike in the old clone, "stack_base" always
> points to the lowest virtual address that has been reserved for
> stack use.  The stack_size argument specifies the size of the virtual
> memory region reserved for the stack.
> 
> Both stack_base and stack_size can be 0:
> 
>  - If stack_base is zero, clone() will arrange for the child to use
>    the same stack pointer(s) as the parent.  This is meaningful only
>    if CLONE_VM is not specified.
> 
>  - If stack_size is zero, no memory has been reserved a priori for
>    stack use, i.e., the stack region is grown dynamically as needed.
>    Depending on architecture, the stack may grow towards lower virtual
>    addresses (e.g., x86), higher virtual addresses (e.g., PA-RISC), or
>    in both directions (e.g., IA-64).
> ---------------------
> 
> Unfortunately, both (1) and (2) have their problems.  (1) looks
> obviously IA-64 specific.  sp2 makes no sense on most existing
> architectures and why should there be two stack pointers?  Perhaps
> someone will come up with an architecture that requires three or no
> stack pointers at all and then we'd need to go back to the drawing
> board again.
> 
> (2) is much more architecture neutral.  Unfortunately, the
> stack_size==0 case still introduces some architecture dependency as
> the growth direction(s) will depend on the platform you're running on.
> However, disallowing this case would limit flexibility for IA-64.  The
> reason is that for IA-64 it probably would be ideal to initialize sp
> and bsp to the same value, letting the two stacks grow away from each
> other.  For this to work best, you'd want stack_base to point to the
> middle of a virtual page as that would reduce the number of
> stack-related page faults to one (for simple threads).
> 
> Overall, I find (1) unacceptable and (2) the best we've been able to
> come up with so far.  Perhaps others on this list should think about
> this issue and see if they can punch any holes in solution (2).  If
> not (and nobody has a better suggestion), we should broaden the
> discussion to include non-IA-64 kernel developers and try
> implementing it.
> 
> 
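For reference, here is a rough sketch of how option (2) might map
stack_base/stack_size onto the two IA-64 stack pointers. It is only an
illustration, not kernel code: the struct and function names are
hypothetical, and the non-zero stack_size case (backing store at the
base, memory stack at the top of the region) is an assumption that the
proposal does not spell out. The stack_size == 0 case follows the
description above, with sp and bsp starting at the same address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only; names are not from any kernel source. */
struct ia64_stacks {
    uintptr_t sp;   /* memory stack, grows toward lower addresses */
    uintptr_t bsp;  /* register backing store, grows toward higher addresses */
};

/*
 * Returns 1 when stack_base == 0, meaning the child keeps the parent's
 * stack pointers and *out is left untouched.  Otherwise fills *out and
 * returns 0.
 */
static int pick_initial_stacks(uintptr_t stack_base, uintptr_t stack_size,
                               struct ia64_stacks *out)
{
    assert(out != NULL);
    if (stack_base == 0)
        return 1;
    /* Assumption: the backing store starts at the lowest reserved address. */
    out->bsp = stack_base;
    /* With stack_size == 0 this yields sp == bsp, so the two stacks
       grow away from each other from the same starting address. */
    out->sp = stack_base + stack_size;
    return 0;
}
```

Alignment and any scratch area below sp are deliberately left out.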

There is another level of clone interface, the one used in linuxthreads.
I am more interested in that one. Right now, ia64 uses

int clone2(int (*fn)(void *arg), void *thread_bottom, size_t stack_size, int flags, void *arg);

and every other CPU uses

int clone(int (*fn)(void *arg), void *thread_top, int flags, void *arg);

If we all use

int clone2(int (*fn)(void *arg), void *thread_bottom, size_t stack_size, int flags, void *arg);

that is one solution. Every other CPU that wants thread_top would then
compute it as thread_bottom + stack_size in its clone.S.
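
In C terms the conversion amounts to something like the sketch below.
The helper name is made up for illustration, and the real code would
be assembly in each port's clone.S; alignment of the result is ignored
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of glibc: derive the clone()-style
 * thread_top from clone2()-style arguments on a CPU whose stack grows
 * toward lower addresses.
 */
static void *thread_top_from_bottom(void *thread_bottom, size_t stack_size)
{
    assert(thread_bottom != NULL);
    return (void *)((uintptr_t)thread_bottom + stack_size);
}
```

Callers would pass the same thread_bottom and stack_size on every
CPU, and only the port-specific code would decide which end of the
region becomes the initial stack pointer.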

-- 
H.J. Lu (hjl@gnu.org)