Proposal: format GDB Python files with black

Tue Apr 27 07:54:44 GMT 2021

* Simon Marchi <simon.marchi@polymtl.ca> [2021-04-26 14:08:24 -0400]:

> On 2021-04-26 1:50 p.m., Andrew Burgess wrote:
> > So without going more sophisticated as you describe below we'd still
> > be relying on reviewers to either (with enough experience) "just know"
> > that the code is black formatted, or apply the patch, run black, and
> > see if the formatting changes.
> 
> The use of black would be documented somewhere (in the repo, in the
> wiki, or both).  Nevertheless, I don't expect all contributors to know
> about it and do it.
> 
> I also don't expect reviewers to be able to spot code that isn't
> black-compliant.  If you think of checking it or asking about it in a
> review, great.  Otherwise, I don't think we should be too afraid of
> errors slipping in, since they're trivial to fix.
> 
> But to be frank, as a reviewer today, I don't really know what to check
> in the Python code formatting.  Our wiki:
> 
>   https://sourceware.org/gdb/wiki/Internals%20GDB-Python-Coding-Standards
> 
> says that we follow PEP8.  I have no idea how to format PEP8 code, so
> as a result I'm not really checking the formatting of Python code today.
> 
> > Then of course, there's the questions about what happens when
> > different users are running different versions of black - assuming
> > different versions might format things slightly differently.
> 
> Indeed, I should have mentioned this: we would define a version to use,
> just like we do for autoconf/automake.
> 
> The goal of black is that the output won't change just for fun, but it
> could change due to fixed bugs.
> 
> > Another concern I'd have is that unrelated "formatting" changes might
> > creep in.
> > 
> > So, I edit a .py file and either don't run black (naughty), or use
> > version X of black.
> > 
> > Much later, I edit a different part of the same file, now either I run
> > black, or run version Y of black.  My original edit is reformatted
> > slightly differently.
> > 
> > My assumption is that this change (the second patch) should be
> > rejected at review, as the reformatting change (of the earlier code)
> > is unrelated to the work of the second patch.  But I suspect some folk
> > might find it frustrating, if on one had we say run black, but on the
> > other hand we say you need to exclude unrelated chunks.
> 
> This is why I think it's easier to start from a clean state, by
> formatting everything in one shot in the beginning.
> 
> If the issue you describe happens while I am writing a patch, I would
> notice the spurious diff and push an obvious "Format foo.py with
> black" commit and rebase my WIP patch on top.
> 
> If the issue you describe happens while I am revieweing a patch, I would
> probably also push an obvious re-formatting commit (or ask the person to
> do so, if they have push access).  The offending hunk will go away when
> they rebase, which is necessary prior to pushing.
> 
> > I think, in the heterogeneous environments we all develop in,
> > situations like the above will crop up, so it would be important to
> > know what the expectations would be in such a case.
> 
> Indeed.  I hope I was able to show that these situations can be handled
> easily and are not a showstopper.

Thanks for taking the time to reply.

I have no objections to adopting the use of black, my only request
would be that we formally make it a policy that "rogue" re-formatting
hunks, as we discussed above, should be fixed in a separate commit,
and not included in random patches.

For me, one of the great things about working on GDB is the generally
good quality of the commits, and I feel that if commits start
including extra reformatting this would be a step backwards.

> 
> >>
> >> If we want to be a bit more sophisticated, we could check-in a git
> >> commit hook that runs black (if available) to check your formatting as
> >> you are committing, if your commit includes changes to Python files.
> >> That doesn't prevents formatting errors from slipping in the repo, but
> >> it's already a good start (and formatting errors slipping in the repo
> >> are not the end of the world).
> > 
> > Again, this sort of thing works great in a corporate environment where
> > you can guarantee that everyone is (or has no excuse not to be)
> > running the exact same version of every tool.
> 
> Even in corporate environment, you can't assume that!

I disagree, but I suspect the problem is I didn't explain myself well
enough.

Never mind, given the above I think you've answered my questions.

Thanks for your time,
Andrew

> 
> > My concern would be that such a strategy would invite unintended
> > changes to creep in.
> > 
> >>
> >> We don't really have CI right now (especially pre-merge testing), but if
> >> we had, checking the formatting of Python files could be a validation
> >> step.
> > 
> > Agreed.
> > 
> >> We do however, have scripts that run server side on push, so it would be
> >> possible to run black server-side to reject commits that would introduce
> >> formatting errors.
> > 
> > I guess this would solve a lot of the problems I'm concerned about.
> > Once we've "corrected" the formatting of all .py code then nobody
> > could merge bad code.
> 
> Thanks,
> 
> Simon