This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch] Gas support for MIPS Compact EH


"Moore, Catherine" <Catherine_Moore@mentor.com> writes:
>> -----Original Message-----
>> From: Richard Sandiford [mailto:rdsandiford@googlemail.com]
>> Sent: Thursday, February 20, 2014 6:37 AM
>> To: Moore, Catherine
>> Cc: Schmidt, Bernd; binutils@sourceware.org
>> Subject: Re: [Patch] Gas support for MIPS Compact EH
>> 
>> "Moore, Catherine" <Catherine_Moore@mentor.com> writes:
>> >> -----Original Message-----
>> >> From: Richard Sandiford [mailto:rdsandiford@googlemail.com]
>> >> Sent: Sunday, February 09, 2014 6:11 AM
>> >> To: Schmidt, Bernd
>> >> Cc: Moore, Catherine; binutils@sourceware.org
>> >> Subject: Re: [Patch] Gas support for MIPS Compact EH
>> >>
>> >> Bernd Schmidt <bernds@codesourcery.com> writes:
>> >> > On 02/08/2014 05:34 PM, Richard Sandiford wrote:
>> >> >>>> The tc_cfi_fix_eh_ref and tc_cfi_emit_expr hooks don't seem very
>> >> >>>> consistent; the former relies on the caller to clear the bytes,
>> >> >>>> whereas the latter is supposed to do it itself.
>> >> >>>
>> >> >>> All fixed, now using a hook to return a reloc and eliminated the
>> >> >>> use of R_MIPS_EH from the assembler.
>> >> >>
>> >> >> Hmm, but how does it work under the new scheme?  It looks like gas
>> >> >> now always emits the .eh_frame_entry sections using R_MIPS_PC32,
>> >> >> is
>> >> that right?
>> >> >> But the linker chooses the .eh_frame_hdr encoding based on
>> >> >> --pcrel-eh-reloc, which also controls how R_MIPS_EH is handled.
>> >> >> So if
>> >> the:
>> >> >>
>> >> >>    DW_EH_PE_sdata4 | DW_EH_PE_datarel | DW_EH_PE_indirect
>> >> >>
>> >> >> encoding is chosen for the .eh_frame_entry sections at link time,
>> >> >> what converts the input sections to use that encoding instead of
>> >> >> the original R_MIPS_PC32?  I'd assumed R_MIPS_EH was defined the
>> >> >> way it was to avoid that kind of thing.
>> >> >
>> >> > What's changed is that the linker is no longer really involved in
>> >> > these decisions - the code you see in the linker-specific parts of
>> >> > the patch are there merely to deal with R_MIPS_EH relocs in object
>> >> > files generated by previous toolchains. We've always kind of
>> >> > already decided at compile time which encoding to use and passed
>> >> > the -pcrel-eh-reloc option to the linker to ensure it made the
>> >> > choice we wanted. What's new in this version of the patches is
>> >> > based on the realization that gcc can produce
>> >> > datarel|indirect encoding without linker help (using the new
>> >> > datarel|forcegpword
>> >> > op).
>> >>
>> >> Ah, OK, so it's now up to the assembler writer to do the indirection by
>> hand?
>> >> I hadn't realised that.  (To be fair, the patch added the
>> >> .forcegpword directive but didn't have any examples or tests to show
>> >> how it was used, or any documentation explaining it.  I deliberately
>> >> didn't complain about the latter though because most ops are
>> >> undocumented.)
>> >>
>> >> Using GP-relative is probably a bad idea for MIPS because the GP base
>> >> can vary in the case of multigot.  When resolving the relocations in
>> >> the individual input sections, the GP used will be for that input
>> >> bfd's GOT, which isn't necessarily going to be the primary GOT.  So
>> >> if we're doing the indirection by hand, why not use pcrel|indirect
>> >> instead?  I suppose that amounts to using .ehword (under the new
>> >> PC-relative definition) rather than .forcegpword.  I think it'd be
>> >> better to drop .forcegpword if possible.
>> >>
>> >> FWIW, pcrel|indirect is what the linker tries to use for .eh_frame on
>> >> MIPS, if the original input object used absolute indirect.
>> >>
>> >> > On Linux targets you'll see this generated by the compiler, on
>> >> > bare-metal you'll get R_MIPS_PC32. The R_MIPS_EH reloc is longer
>> >> > produced. So it's all a lot more straightforward, directly
>> >> > producing an encoding appropriate for the target at compile time.
>> >> >
>> >> >> I thought R_MIPS_EH would be used for the .eh_frame_entry entries
>> >> >> only, since in that case the actual encoding of the address isn't
>> >> >> known until link time.
>> >> >
>> >> > I think previous versions of the code were just slightly confused -
>> >> > the idea was that datarel|indirect required things to be put into
>> >> > the got, which has to be done by the linker. It turns out that this
>> >> > isn't necessary, so there is no longer a need to use R_MIPS_EH.
>> >> >
>> >> > Does this clarify things?
>> >>
>> >> I don't think it really answers my first question.  It's still the
>> >> linker that decides what encoding goes in the .eh_frame_hdr.
>> >> Then all the following .eh_frame_entry sections must use that
>> >> encoding for the text addresses.
>> >>
>> >> The input objects use relocations to mark those .eh_frame_entry text
>> >> addresses.  In the original scheme those relocations were R_MIPS_EH,
>> >> giving the linker control of both the .eh_frame_hdr encoding and the
>> >> .eh_frame_entry fields that that encoding controls.  That part seemed
>> >> consistent and safe.  (The unsafe part was that other parts of the
>> >> assembler seemed to assume that R_MIPS_EH always meant
>> >> datarel-indirect.)
>> >>
>> >> In the new scheme the assembler picks an encoding-specific relocation
>> >> for those .eh_frame_entry text addresses (always PC-relative in the
>> >> posted patch), but the linker is still the one that chooses the
>> >> .eh_frame_hdr encoding.  And in the posted patch the encoding used in
>> >> .eh_frame_hdr is decided by the --pcrel-eh-reloc option.  The default
>> >> linker behaviour is to use datarel-indirect, which is the opposite of
>> >> what the assembler now uses.  So if I do:
>> >>
>> >>     as foo.s -o foo.o
>> >>     ld foo.o -o foo
>> >>
>> >> it looked like foo.o would use a PC-relative encoding for any
>> >> .eh_frame_entry sections, but foo's .eh_frame_hdr would say that they
>> >> are datarel-indirect rather than PC-relative.
>> >>
>> >> In other words, I think using datarel-indirect for the .eh_frame_hdr
>> >> is only safe if all input objects are using the old scheme.  So that
>> >> seems like a bad default.  And I'm not sure --pcrel-eh-reloc is safe
>> >> for old objects because of the assumption in some cases that
>> >> R_MIPS_EH would generate datarel- indirect.
>> >>
>> >> Maybe for the FSF version we should just drop the R_MIPS_EH stuff
>> >> altogether, since at this point it sounds like it would be safer to
>> >> reject the old objects than to try to guess what's happening.
>> >> The .eh_frame_hdr could then be hard-coded to use a PC-relative
>> >> encoding, like the assembler is now.  It seems a bit of a shame in a
>> >> way though -- and like I say, using PC-relative wouldn't apply well
>> >> to targets like VxWorks where the gap between the text and data
>> >> segments isn't fixed at link time.
>> >>
>> >
>> > Dropping the R_MIPS_EH stuff is not a desirable option for us.
>> > If we were to go back to the original implementation, using R_MIPS_EH
>> > for entries in the .eh_frame_entry sections, then your objection to
>> > the implementation was the inconsistent treatment in the assembler?
>> 
>> Yes, or more specifically: the assembler was sometimes associating
>> R_MIPS_EH with a specific encoding type, even though we don't know at
>> assembly time what encoding is associated with R_MIPS_EH.
>> 
>> > You said:
>> > RS>  Or, to put it another way, why are we making the choice between
>> > RS> R_MIPS_PC32 and R_MIPS_EH at assembly time in
>> mips_cfi_emit_expr,
>> > RS> but not in mips_cfi_fix_eh_ref, even though the patch appears to
>> > RS> allow the same two encodings in both casees?
>> >
>> > I'd like to take this approach in the next patch:
>> > 1. Keep the R_MIPS_EH relocation
>> > 2. Let the linker choose the appropriate encoding 3. Clean up the
>> > assembler inconsistencies
>> 
>> That's OK with me, but just to clarify: (3) IMO means that R_MIPS_EH is
>> never associated with a specific encoding in the assembler.  I.e.
>> R_MIPS_EH is purely for an as-yet unknown encoding that is chosen by the
>> linker rather than the assembler or the assembly author.  And AIUI the only
>> place that happens is in .eh_frame_entry.
>> 
>> Perhaps one way of doing that would to have a generic
>> BFD_RELOC_EH_FRAME_HDR_32 that R_MIPS_EH maps to.  Then when
>> emitting the .eh_frame_entry addresses, the assembler unconditionally uses
>> that BFD_RELOC_ rather than a target hook.
>> 
>> What do you plan to do for .ehword?  Since the assembler generates the
>> .eh_frame_entry itself, and since R_MIPS_EH should only be used there
>> (since that's the only place where the linker controls the encoding), I don't
>> think there are any valid uses of an R_MIPS_EH-producing .ehword.
>> 
>> > This would also mean dropping the .forcegpword directive as you
>> suggested.
>> 
>> My problem with .forcegpword was more that the GP base isn't a DSO-wide
>> value, it's only local to the input object.  So IMO dropping it (which
>> is fine) is
>> separate from whether to keep R_MIPS_EH.  I assume if we drop it then the
>> only supported LSDA encoding will be pcrel and pcrel|indirect, at least for
>> now?
>> 
> That's not right.  We need datarel|indirect for the MIPS Linux toolchain.

Why though?  For GNU/Linux the text and data segments have a fixed
offset relative to one another, so I don't see any advantage of
datarel|indirect over pcrel|indirect.  (The current EH scheme gets by
with just pcrel and pcrel|indirect FWIW.)

I.e. both the text segment and the data segment can use pcrel to refer
to any other part of the same executable or DSO.  Then you use |indirect
in cases where the address doesn't bind locally and so might end up
pointing outside the executable or DSO.

datarel is more useful for targets where the text and data segments
can be moved relative to one another and you want to make a relative
reference to data from the text segment.

Like I say, datarel (i.e. gprel) has the significant downside that
on multigot targets like GNU/Linux a single executable or DSO can
have several GP bases, one for each GOT.  I.e. the GP value used
for GP-relative relocations isn't always the same as the value of
the _gp or _GLOBAL_OFFSET_TABLE_ symbols.  That applies both to
the GP base used for R_MIPS_EH in elfxx-mips.c in the original
patch and the GP base used for R_MIPS_GPREL32 (as harnessed by
the revised patch).

> I don't follow why dropping the .forcegpword directive implies that
> datarel|indirect can't be generated.

How else would you generate datarel|indirect LSDA though?  I.e. what
directive and relocation would you use?  The point is that you can't use
R_MIPS_EH because the LSDA encoding is decided by assembly time while
the R_MIPS_EH behaviour is only decided by the linker.

Thanks,
Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]