How to make better code reviews when pull requests are large?

https://softwareengineering.stackexchange.com/questions/381343

code-reviews

15-02-2021
|

Frage

Disclaimer: There are some similar questions, but I didn't find any which touch specifically the problems you face while reviewing a large pull request.

Problem

I feel my code reviews could be done in a better way. I'm particularly talking about big code reviews with many changes across 20+ files.

It's pretty simple to catch obvious local code problems. Understanding whether the code meets business criteria is a different story though.

I have troubles following the thought process of the code author. It's pretty hard when the changes are numerous and spread across multiple files. I try to focus on the groups of files related to a particular piece of changes. Then review the groups one by one. Unfortunately the tool I use (Atlassian Bitbucket) is not very helpful. Whenever I visit a file, it gets marked as seen, even though it often turns out not to be related to the currently examined piece of changes. Not to mention that some files should be visited multiple times and their changes reviewed piece-by-piece. Also coming back to relevant files when you follow a bad path isn't easy.

Possible solutions, and why they don't work for me

Reviewing a pull request by commits often solves the size problems, but I don't like it since I'll frequently be looking at outdated changes.

Of course, creating smaller pull requests seems like a remedy, but it is what it is, sometimes you get a large pull request and it has to be reviewed.

You can also ignore logical aspect of the code as a whole, but it seems pretty risky, particularly when the code comes from an inexperienced programmer.

Using a better tool could be helpful, but I didn't find one.

Questions

Do you have similar problems with your code reviews? How do you face them?
Maybe you have better tools?

Lösung

We had these problems and asking the question below has been working well for us:

Does the PR do one thing that can be merged and can be independently tested?

We try to break PRs by single responsibility (SR). After initial push back folks were surprised to find that even something small albeit single can be large.

The SR makes it really easy to review and also disseminates knowledge of the expected implementation.

This also allows for incremental refactors as more is added and PR turnaround time is drastically reduced!

I’d suggest breaking them up by SR if possible and see if it works for you. Takes some practice to do it that way.

Andere Tipps

Sometimes you can't avoid large pull requests - but you can be discerning as to who has what responsibility.

I treat pull requests as persuasive arguments. The Author is trying to convince me that the code should look and work this way.

As with any argument it should have a single clear idea. Its either:

a refactor,
an optimisation,
or new functionality.

If they are not being clear, then there is a pretty good chance that they do not understand it themselves. Open up the dialogue and help them break their argument down into its sub-arguments. If need be, its perfectly alright - even beneficial for them to recreate those commits, and offer more comprehensible and direct pull requests.

There will still be large pull requests, but with a clear argument its much easier to see what does not fit.

As for tooling, it is dependent on your organisation and process. BitBucket is a decent tool, whether its better or not depends on everything from your budget, hardware, requirements, through to your pre-existing processes, business rules, and various software adaptions you've already made to accommodate BitBucket. I'd start by looking through the documentation to see if the behaviour can be configured, maybe throwing out to the plugin community (or joining it by making a plugin to do that).

Don't review the complete pull request, but every commit. You'll acquire better understanding of the pull request anyway by doing it this way.¹

If commits are small, doing such review shouldn't be a problem. You follow the changes one by one through the commits, and you end up getting the full picture. There are drawbacks, such as that you'll sometimes review changes which will be undone a few commits later, but this shouldn't be a great deal.

If commits are large, then you should discuss it with the person who did those commits. Explain him why large commits are problematic. Listen to the other person's arguments about why he commits changes rarely: you may learn, for example, that he avoids doing commits because pre-commit hooks take too long, in which case this problem should be solved first.

Reviewing a pull request by commits often solves the size problems, but I don't like it since I'll frequently be looking at outdated changes.

You do, but this is a minor issue, and you should waste much less time reviewing code which will be undone a few commits later than the time you waste trying to figure out why the code was changed when reviewing a single file.

“Frequently” is vague, but if you find yourself spending too much time reviewing code which didn't found its way to the final revision of the pull request, you may use two techniques:

Go quickly through all the commits once, just reading the commit messages. This way, when studying a particular commit, you may remember that a commit message somewhere later said that the change you're looking at was reverted.
Have a side-by-side view of the latest version of the file (although in many cases, for large commits, (1) the files could be radically different and (2) the files may be renamed or the large blocks of code may be moved elsewhere, making it difficult to find the matching file).
Either ask committers to squash commits when it makes sense, or have a specific commit message conventions where one commit undoes a part of another one, and do your review taking in account several following commits.

^{¹ For instance, imagine you're looking at a file where some variable has been renamed. What does it tell you? How should you review this change? If you were reviewing it commit by commit, however, you would see that a single commit renamed the same variable in twenty files, and the comment indicates that the name was changed in order to use the proper business term. This change makes perfect sense and is possible to review.}

Work out what you are trying to achieve with reviews of pull requests and see if there is an alternative.

For instance you might want to

Ensure standards are followed
Check functionality is correct
Make sure more than one person understands the code
Train juniors

etc etc.

Some of these might be better served by other things, and even just understanding the reasons allows you to limit the scope of your checks.

For example if I'm checking every line so I can suggest and discus changes for training, then I can skip that on a large PRs done by seniors

If I need to understand the code, maybe do pair programming on large features and limit the code review to a standards check.

You don't need to check all the things on every PR as long as you have a layered approach to QA

Lizenziert unter: CC-BY-SA mit Zuschreibung

Nicht verbunden mit softwareengineering.stackexchange