This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.



Re: x86_64 RIP-relative addressing bug


On Mon, 2005-02-28 at 19:31, Roland McGrath wrote:
> Thanks for that fine elucidation of the issues.  I think you have covered
> the range of attacks available to us very well.  However, I am quite
> skeptical of your conclusions.  
> 
> At first blush, I am pretty turned off by all the "outside" approaches.
> That is, ones that require user-level probe preparation to "get it right"
> in some detailed way...

Agreed.  All these approaches are pretty ugly.  Some might be slightly
more palatable when hidden behind the SystemTAP command interface, but
bare-knuckles kprobes users would see the ugliness.  

> Moreover, I think the layering of functionality we have now is a good
> thing.  That is, kprobes is (on x86) a robust generic facility for
> inserting a probe "at any reasonable instruction" and having it work.
> I think this should be a goal of its own for x86-64 kprobes as well.
> To put it bluntly, I have a pretty strong "just do kprobes right" position.
> The motivations for this view are not strictly within the scope of the current
> systemtap project, but I'll put it out there as a personal priority.

Agreed.

> 
> As to the low-level issues, there are two components: detection, and fixup.
> In brief, I think your assessment of the difficulty is overly pessimistic,
> and I offer the supposition that we can in fact do the "best" solution.
> The problem is intricate but not vast, and I think the combination of the
> robustness goal I just mentioned and the sheer inelegance of the interfaces
> required for the avoid-the-problem approaches, obligates us to give it the
> old college try before falling over ourselves to avoid thinking about it.

I can go along with the idea of implementing the analysis piece in the
kernel, and trying something more devious only if that proves
unacceptable.  Borrowing code and/or techniques from objdump (it's GPL)
should make this effort a lot easier.  I still think that the bug fix we
end up with will be more code than the rest of the x86_64 kprobes port,
but if we can sell it, great.

> 
> Detection is the key element, really.  My asserted goal of a robust, simple
> facility, not intrinsically requiring arcane knowledge of the particular
> instruction being instrumented, makes detection mandatory: if you ask
> kprobes to insert a probe at an instruction boundary and it tells you it
> did so, that instruction ought to get executed with the proper effects.
> Detection alone, with kprobes simply refusing to insert a probe on a
> RIP-relative instruction, would be a marked improvement on the status quo.
> 
> I really think the notion that it would take "hundreds of lines" of code to
> decode x86-64 instructions adequately to identify RIP-relative ones,
> overstates the complexity of the problem.  The encoding is hairy, but it's
> not that hairy.  There is plenty of experience with decoding it.  The
> intimate knowledge required for doing so is in the book in front of me.  In
> considering this complexity, it's important to recognize that it needn't be
> bulletproof (though I am claiming that it's not desperately hard to make it
> so).  We're only concerned with the instructions the compiler produces and
> that really appear in kernel code.  We can do the objdump|grep on kernel
> text to identify every RIP-relative instruction, and point the detector
> code at each one to verify that it catches them all.  We can even use
> objdump to tell us the insn boundaries, and then point it at every other
> instruction to verify it has no false positives.

I like this idea.
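
A minimal sketch of the core detection test, for illustration only. In
64-bit mode a memory operand is RIP-relative when the ModRM byte has
mod == 00 and r/m == 101. The function names below are hypothetical, not
existing kprobes code. The sketch assumes the caller already knows where the
ModRM byte sits, so it leaves out the opcode tables, which are the hard part
of a real decoder.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy prefixes: operand/address size, lock, rep/repne, segment overrides. */
static bool is_legacy_prefix(uint8_t b)
{
    switch (b) {
    case 0x66: case 0x67: case 0xF0: case 0xF2: case 0xF3:
    case 0x2E: case 0x36: case 0x3E: case 0x26: case 0x64: case 0x65:
        return true;
    default:
        return false;
    }
}

/*
 * Return the index of the first opcode byte, skipping any legacy
 * prefixes and at most one REX prefix (0x40..0x4F in 64-bit mode).
 */
size_t skip_prefixes(const uint8_t *insn, size_t len)
{
    size_t i = 0;

    while (i < len && is_legacy_prefix(insn[i]))
        i++;
    if (i < len && (insn[i] & 0xF0) == 0x40)
        i++;
    return i;
}

/* mod (bits 7:6) == 00 and r/m (bits 2:0) == 101 selects RIP + disp32. */
bool modrm_is_rip_relative(uint8_t modrm)
{
    return (modrm & 0xC7) == 0x05;
}
```

For example, 48 8b 05 <disp32> (a mov from a RIP-relative address into
%rax) has ModRM 0x05 and is flagged, while 48 8b 45 08 (a mov from
8(%rbp)) has ModRM 0x45 and is not. Each objdump-reported instruction
could be run through a check like this to look for misses and false
positives.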

> Whether it's as hard as
> you suspect it is, or as doable as I suppose it is, if we have some code
> that we think does it, we can certainly achieve confidence that it does or
> doesn't do it adequately for the needs we can envisage.
> 
> Given detection, we come to fixup (instruction adjustment).  There is only
> one form of the RIP-relative addressing mode, which uses a signed 32-bit
> displacement.  The only issue that arises is if the distance from the
> instruction copy's location to the target address exceeds 2GB.  Rewriting
> the instruction to use a precomputed 64-bit address instead is between
> difficult and impossible (literally, depending on the instruction); in some
> cases it would have to be rewritten to use a scratch register, with the
> attendant hassles of that.

Yeah, especially for instructions that affect eflags.

> It's better if you can just locate the scratch
> area for instruction copies within +/-2GB of the code into which
> probes are being inserted.

Good idea in general, but I don't know enough about VM to know how
tricky this could get.  Consider just giving up if our vmalloc-ed
instructions page is more than 2GB away from our instruction.
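
A rough sketch of that give-up check, under stated assumptions. The copied
instruction has the same length as the original, so the target address is
preserved when the displacement is adjusted by (original address - copy
address). The fixup is possible only if the adjusted value still fits in a
signed 32-bit field. The function name and parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the displacement the copied instruction needs in order to
 * reference the same target as the original.  Returns false when the
 * result does not fit in a signed 32-bit displacement, i.e. when the
 * copy is too far from the target.
 */
bool rip_disp_fixup(uint64_t orig_addr, uint64_t copy_addr,
                    int32_t old_disp, int32_t *new_disp)
{
    int64_t delta = (int64_t)(orig_addr - copy_addr);
    int64_t adjusted;

    /* Coarse pre-check so the addition below cannot overflow. */
    if (delta > ((int64_t)1 << 32) || delta < -((int64_t)1 << 32))
        return false;

    adjusted = (int64_t)old_disp + delta;
    if (adjusted < INT32_MIN || adjusted > INT32_MAX)
        return false;

    *new_disp = (int32_t)adjusted;
    return true;
}
```

With illustrative addresses, an original at 0xffffffff80100000 and a copy at
0xffffffff88000000 are close enough for the fixup to succeed, while a copy
far off at 0xffffc20000000000 makes the function return false, which is the
case where we would refuse the probe.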

> Currently x86-64 kprobes uses vmalloc space for
> the instruction copies, which is far away from the region containing the
> kernel's code.  However, the kernel code and all loaded modules' code is
> all put within a region smaller than the 2GB cutoff.  So, kprobes could try
> to find free pages within that range to allocate and make executable for
> this purpose.  Another idea is to take advantage of the fact that modules
> are always loaded into this close region, and require a module registering
> a probe to provide some scratch space in its own executable code segment.
> The scratch space would need to be written at probe insertion, exactly the
> same time you are modifying the text anyway.

The scratch space would have to be mapped executable, of course. 
Perhaps provide a separate module to allocate the scratch space?  A
kprobes user would need to know that his own module(s) depend on that
module, I guess, but he wouldn't have to alter how he codes his
module(s).

I agree that once we've done the analysis code, the fixup code shouldn't
be all that hard to add; the aforementioned VM fussing would probably be
the trickiest part.

> 
> I hope these thoughts give you some encouragement that we can get a very
> satisfying result without any really monumental effort.

Thanks for all the ideas.  Plainly, my suggested approaches assume that
we want to avoid putting the instruction-analysis code in the kernel;
but that assumption is not necessarily valid.

> 
> 
> Thanks,
> Roland
> 

Jim

