Question

A team carried out several refactorings of the same project at the same time (to make the system more generic), and the changes overlap in places (yes, unfortunately, it was closer to a "big bang"). The code spans the representation, application, and domain layers and has fairly high test coverage. There are N branches waiting to be merged into master. Some refactorings were mostly in the representation layer, but most were in the domain, with some overlaps (the changes touch the model's central and most-used entities: two entities were merged, one was split into two, and a completely new one was added).

What would be a good order in which to merge the branches? Should the team start from the lower tiers and work up, or the other way around? Or should the branches with the smallest "footprint" be merged first?

My gut feeling is to merge the lower-tier-focused branches first, then go up. What else should be considered to keep redundant work to a minimum?

If it matters, most feature branches were created from master at almost the same time and have been kept readily mergeable into master.

Surprisingly, I have not found any theory or good discussion of the subject. I've tried to ask the question in a more general way in the hope that the answer will also contain more generic heuristics and a list of relevant considerations, so it will be useful beyond this specific situation.

In practice, the granularity of commits differs between developers. Sometimes broken code is committed so that others can take a closer look.

Solution

You merge in the order that makes the merges the easiest. There is no general heuristic for this because it's highly dependent upon the code changes in question. Most of the time, your "gut feel" is pretty close to optimal. Consider that you can test different merge orders locally before doing it officially for your central repo.
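One way to try merge orders locally, offered as a rough sketch rather than a finished tool: the Python script below shells out to git, merges the branches one by one onto a detached checkout of master, and reports the first branch that does not merge cleanly. The function names and the injectable `run` parameter are my own. It assumes a clean working tree and that you start on a named branch, and a clean merge says nothing about whether the tests still pass.

```python
import itertools
import subprocess


def _git(args, run):
    """Run one git command in the current repository and return its result."""
    return run(["git", *args], capture_output=True, text=True)


def first_conflict(order, base="master", run=subprocess.run):
    """Merge `order` one by one onto a detached checkout of `base`.

    Returns the first branch that fails to merge cleanly, or None if all
    of them merge. No branch is moved: the trial commits live only on the
    detached HEAD and are abandoned afterwards.
    """
    # Remember where we started (assumption: a named branch, not a detached HEAD).
    start = _git(["rev-parse", "--abbrev-ref", "HEAD"], run).stdout.strip()
    if _git(["checkout", "--detach", base], run).returncode != 0:
        raise RuntimeError(f"cannot check out {base}")
    try:
        for branch in order:
            merged = _git(["merge", "--no-ff", "--no-edit", branch], run)
            if merged.returncode != 0:
                # Most likely a conflict: undo the half-finished merge and report it.
                _git(["merge", "--abort"], run)
                return branch
        return None
    finally:
        _git(["checkout", start], run)


def clean_orders(branches, base="master", run=subprocess.run):
    """Return every ordering of `branches` that merges without conflicts."""
    return [
        order
        for order in itertools.permutations(branches)
        if first_conflict(order, base, run=run) is None
    ]
```

Calling `clean_orders(["refactor-domain", "refactor-ui"])` (hypothetical branch names) inside the repository lists the conflict-free orders, if there are any. The number of permutations grows quickly, so this is only practical for a handful of branches.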

Also, merging becomes much more difficult the longer you go without doing it. If your changes are too broken to merge into master, you should at least be frequently merging master into your refactoring branches, and frequently merging your refactoring branches into each other. That way, you're resolving conflicts a little bit at a time instead of all at once at the end.

Merging branches into each other can take a couple of different forms. What I do most often is occasionally merge a colleague's branch into mine when I know it has overlapping changes. This has a few potential outcomes:

  • My colleague's code is in a broken state because the feature or refactor is half-finished, so I back out the merge and try again later.
  • There are merge conflicts that I can resolve by changing my code, so I do it.
  • There are merge conflicts that require a conversation with my colleague, so I have one.

If my colleague merges to master first, then my PR will automatically exclude the changes that came in from that branch. If I need to open my pull request first, I can rebase their changes out.

The other way to merge branches into each other, which would probably work better in your situation, is to create a temporary refactor branch that is used for continuous integration of the refactoring changes, then do one big pull request to merge refactor into master. That way, developers review small pull requests against the refactor branch.

The downside to merging branches into each other is that the branches tend to become dependent on each other, so you either have to take care to avoid that, or accept it and merge them close together. However, not having to take all changes via the master branch in the central repo is one of the main strengths of Git. Make sure not to overlook it.

Other tips

Merge in order of completion. Otherwise people are hanging around waiting for stuff.

Let's say you have two branches that are complete at the same time and ready to merge.

Branch 1 is a small change: the Add function is now called Addition and takes an extra parameter (so the IDE can't make the change automatically).

Branch 2 is a large change to several business logic functions that use the Add function in several places.

So you have three tasks.

  • Merge B1
  • Merge B2
  • Change all the Add calls affected by B2 to Addition and add the extra parameter

Whichever order you choose, B1 first or B2 first, you have the same third task. The only difference is how the task is presented.

But even this is a false choice. Say you merge B2 first and are then presented with moving all the B2 code into B1. You could ignore that presentation if you wanted, and just copy B2 and apply the name change to it. The result is the same code.
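To make the third task concrete, here is a small Python sketch. The names are stand-ins: `add` and `addition` play the roles of Add and Addition, while the `scale` parameter and the `order_total` functions are invented purely for illustration. The two branches can merge without a textual conflict and still leave broken code, and the fix is identical whichever branch went in first.

```python
def addition(a, b, scale):
    """Branch 1: the former add(a, b), renamed, with a hypothetical extra parameter."""
    return (a + b) * scale


def order_total_b2(prices):
    """Branch 2 as written against the old master: it still calls add()."""
    total = 0
    for price in prices:
        total = add(total, price)  # add() no longer exists once B1 is merged
    return total


def order_total_merged(prices):
    """The third task done: call site renamed and given the new argument."""
    total = 0
    for price in prices:
        total = addition(total, price, scale=1)
    return total
```

With both branches merged, calling `order_total_b2([1])` raises a `NameError`, while `order_total_merged([1, 2, 3])` returns 6. Good test coverage is what catches this kind of breakage, because the merge tool will not.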

So you have a master branch and created, say, four branches, and each branch performed its own refactoring. Each branch on its own is fine.

One of the four branch owners is lucky, and merges his or her changes with master, without conflict. Code reviews are obviously done, tests are run, so the master branch is in a good state again, with a refactoring done.

The other three branches now have a problem. They are no longer based on the current master branch, and they cannot be merged without conflict. If you notice that your branch cannot be merged without conflict, then you cannot merge into master and fix problems as you go; that is just suicidal and impossible to review. Instead, you merge master into your branch and take responsibility for getting it to work. The other two branches do the same. One of the three is lucky and merges his or her changes into master without conflict (because they merged master into their own branch first).

And that gets repeated twice more.
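As a back-of-the-envelope count of that work, here is a short Python sketch. It assumes the worst case described above, where every branch still waiting has to merge master into itself again after each merge into master. The function name is my own.

```python
def master_into_branch_merges(n_branches):
    """Count how often master is merged back into a waiting branch.

    Worst-case assumption: every merge into master forces every branch
    that is still waiting to merge master into itself once more.
    """
    waiting = n_branches
    updates = 0
    while waiting > 0:
        waiting -= 1        # one branch lands on master
        updates += waiting  # each branch still waiting catches up with master
    return updates
```

For the four branches in this example the count is 3 + 2 + 1 = 6, and in general it is n(n-1)/2 for n branches, which is why the effort grows so quickly with the number of parallel refactorings.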

You will have noticed that this is an awful lot of work. That's because having multiple developers modify the same files independently is a bad idea that creates a merging nightmare. The better way is to do one refactoring, merge it into master, do a second refactoring, and so on. Alternatively, make it absolutely clear which files each developer is allowed to touch before anyone starts refactoring, and do not allow two developers to refactor the same files.

Licensed under: CC-BY-SA with attribution