This is the mail archive of the
binutils@sources.redhat.com
mailing list for the binutils project.
Re: [PATCH, arm] Thumb shared library support: Thumb PLT, etc.
- From: Richard Earnshaw <rearnsha at arm dot com>
- To: Adam Nemet <anemet at Lnxw dot COM>
- Cc: binutils at sources dot redhat dot com, Richard dot Earnshaw at arm dot com
- Date: Thu, 18 Jul 2002 11:10:38 +0100
- Subject: Re: [PATCH, arm] Thumb shared library support: Thumb PLT, etc.
- Organization: ARM Ltd.
- Reply-to: Richard dot Earnshaw at arm dot com
> Hi,
>
> This is the second piece of a series of patches to make shared
> libraries work on Thumb. The first (yet unreviewed) patch was a GCC
> patch: http://gcc.gnu.org/ml/gcc-patches/2002-07/msg00398.html .
>
> This patch among other things adds a new switch --thumb-plt to
> generate Thumb PLT on ARM ELF. My third patch will handle the
> interworking aspects of this new flag.
>
> I ran the GCC testsuite with all combinations of -marm/-mthumb and
> /-fPIC. I also did lots of manual testing especially on the lazy
> relocation part as our (LynxOS) ld.so does not support that. A
> slightly different form of this patch has been part of our ld (2.10.1)
> for some time now and underwent extensive testing.
>
> Please apply if OK.
> Adam
Hmm, there's some useful stuff in here, but I don't think it's quite right
yet.
First of all, do you have a copyright assignment in place for binutils
(and gcc, for your other patch)? Until that's sorted out we can't use
your code.
Now for some comments.
1) I don't like the idea of having some special flag (--thumb-plt) that
indicates that we should build a different type of PLT. The linker must
be able to figure this out automatically, or we will end up with major
problems when it comes to interworking.
>
> + static const insn16 elf32_thumb_plt0_entry [THUMB_PLT_ENTRY_SIZE / 2] =
> + {
> + 0xb500, /* push {lr} */
> + 0xb082, /* sub sp, #8 */
> + 0x9000, /* str r0, [sp] */
> + 0x4807, /* ldr r0, [pc, #28] */
> + 0x300c, /* add r0, #12 */
> + 0x4478, /* add r0, pc */
> + 0x4686, /* mov lr, r0 */
> + 0x6800, /* ldr r0, [r0] */
> + 0x9001, /* str r0, [sp, #4] */
> + 0xbd01 /* pop {r0, pc} */
> + };
> +
Hmm, would you really want the entry point in your dynamic linker to be
Thumb code? The first thing it has to do is stack every register, so I'd
have thought a switch to ARM state in the stub would be in order. See
also the comments below re sharing with ARM code.
> /* Subsequent entries in a procedure linkage table look like
> this. */
> ! static const bfd_vma elf32_arm_plt_entry [ARM_PLT_ENTRY_SIZE / 4] =
> {
> 0xe59fc004, /* ldr ip, [pc, #4] */
> 0xe08fc00c, /* add ip, pc, ip */
> 0xe59cf000, /* ldr pc, [ip] */
> 0x00000000 /* offset to symbol in got */
> };
> +
> + /* Note that on ARMv5 and above unlike the ARM PLT entries, the Thumb
> + entry can switch mode depending on the corresponding address in the
> + GOT. The dynamic linker should set or clear the last bit of the
> + address in the GOT accordingly. */
> +
> + static const insn16 elf32_thumb_plt_entry [THUMB_PLT_ENTRY_SIZE / 2] =
> + {
> + 0xb082, /* sub sp, #8 */
> + 0x9000, /* str r0, [sp] */
> + 0x4802, /* ldr r0, [pc, #8] */
> + 0x4478, /* add r0, pc */
> + 0x4684, /* mov ip, r0 */
> + 0x6800, /* ldr r0, [r0] */
> + 0x9001, /* str r0, [sp, #4] */
> + 0xbd01, /* pop {r0, pc} */
> + 0x0000, /* offset to symbol in got */
> + 0x0000
> + };
>
We need more space for the Thumb sequence than we do for an ARM one. That
suggests that we should probably be looking to switch to ARM code for the
stub. For example, we could use:
        .code 16
        .align 2
_plt_stub_thumb:
        bx      pc
        nop
        .code 32
_plt_stub_arm:
        ldr     ip, [pc, #8]
        add     ip, pc, ip
        ldr     ip, [ip]
        bx      ip
        .word   offset_to_target
which means we can share the stub with both ARM and Thumb code. So while
this is now 6 words long we save on duplication, and we have interworking
from the start.
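
To make the size and offset arithmetic of the shared stub concrete, here is a
rough sketch of how it could be written down as C tables in the style of the
patch. The names (stub_thumb_part, stub_arm_part, the STUB_* constants and the
two helpers) are hypothetical and do not come from the patch or from binutils;
the encodings are the standard ones for the instructions listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout (not from the patch): byte offsets within the
   shared stub, measured from its Thumb entry point.  */
enum
{
  STUB_THUMB_ENTRY = 0,   /* bx pc; nop (two 16-bit Thumb insns) */
  STUB_ARM_ENTRY   = 4,   /* first ARM instruction */
  STUB_OFFSET_WORD = 20,  /* .word offset_to_target */
  STUB_SIZE        = 24   /* 6 words in total */
};

/* Thumb prologue: switch to ARM state and fall into the ARM part.  */
static const uint16_t stub_thumb_part[2] =
{
  0x4778,      /* bx pc */
  0x46c0       /* nop (mov r8, r8) */
};

/* ARM body shared by ARM and Thumb callers.  */
static const uint32_t stub_arm_part[5] =
{
  0xe59fc008,  /* ldr ip, [pc, #8] */
  0xe08fc00c,  /* add ip, pc, ip */
  0xe59cc000,  /* ldr ip, [ip] */
  0xe12fff1c,  /* bx ip */
  0x00000000   /* offset_to_target, to be filled in by the linker */
};

/* Total stub size in bytes.  */
static size_t
stub_size (void)
{
  return sizeof stub_thumb_part + sizeof stub_arm_part;
}

/* Byte offset read by an ARM "ldr rX, [pc, #imm]" located at INSN_OFF.
   In ARM state the pc reads as the instruction's address plus 8.  */
static size_t
arm_pc_rel_target (size_t insn_off, size_t imm)
{
  return insn_off + 8 + imm;
}
```

With this layout, stub_size () gives 24 bytes (6 words), and
arm_pc_rel_target (STUB_ARM_ENTRY, 8) gives 20, i.e. the first ldr does pick
up the offset_to_target word.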
Further, if we know we are targeting v5, we can adjust the caller to use
blx so that it jumps straight to the ARM entry point. Since on v5 loading
the pc from memory causes an ARM<->Thumb transition if required, we can
then revert to the more efficient sequence that is the default ARM case.
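
The state transition on a load to the pc (or on bx/blx) is driven by bit 0 of
the target address. As a minimal sketch of what the dynamic linker would have
to store in the GOT for this to work, assuming hypothetical helper names that
are not part of the patch or of any existing ld.so:

```c
#include <stdint.h>

/* Hypothetical helper: bit 0 of an interworking address selects the
   instruction set (1 = Thumb, 0 = ARM).  */
static int
target_is_thumb (uint32_t addr)
{
  return (addr & 1u) != 0;
}

/* Hypothetical helper: the value a dynamic linker would place in the
   GOT slot for a function at CODE_ADDR, with bit 0 set for a Thumb
   function and cleared for an ARM one.  */
static uint32_t
got_entry_for (uint32_t code_addr, int is_thumb)
{
  return is_thumb ? (code_addr | 1u) : (code_addr & ~1u);
}
```

For example, a Thumb function at 0x8000 would get the GOT value 0x8001, while
an ARM function at the same address would keep 0x8000.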
I think most of the rest of the changes are just a natural consequence of
how we design the PLT stubs, so I'm not going to concentrate on that until
we have the above sorted out.
R.