Integrator workflow: is fetch-rebase-push safe for remote repos?

https://stackoverflow.com/questions/1274755

16-09-2019

Question

I'm managing a git repo using the integrator workflow. In other words, I pull commits from my co-workers, and push them out to the blessed repo.

I'd like to keep the commit history linear for most cases, so is it OK to do a rebase instead of a merge when I integrate changes? Here is an example:

git fetch coworker
git checkout coworker/master
git rebase master
git checkout master
git merge HEAD@{1}
git push

I'm concerned about what will happen to the remote repos when they do their next git pull. Will git be able to handle this, or will the coworker repo fail during the pull, now that the commits are in a different order on the origin?

Update: I originally had the example rebase the 'coworker' branch from 'master'. What I intended was the opposite, to put the 'coworker' commits on top of the master. So I updated the example.

Solution

You definitely don't want to do what you originally suggested: it would rebase the master branch onto your coworker's master. Depending on what your coworker's master was based on, you may often end up rewinding the central master.

What you might want to do is the opposite, rebase your coworker's master before merging it into master.

git fetch coworker
git checkout coworker/master
git rebase master
git checkout master
git merge HEAD@{1}
git push

I still wouldn't recommend this, though. Your coworkers will have to reconcile their own commits with your rebased copies of them. Most of the time it's probably trivial and they can throw away their commits in favour of yours, but it's still something that they probably need to check manually.

Personally, I would recommend merging their commits directly. If you feel that they are based on too old a version of master, so that the merge would be unnecessarily complex, then get them to rebase their master and fetch it again. Then at least they know what you are merging, and they resolve any conflicts in their own code.
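As a rough sketch of the straight-merge alternative, the script below builds two throwaway repositories and integrates the coworker's commit with fetch and merge only. The directory names, file names, and identities are invented for the demo; only the git commands themselves would carry over to a real setup.

```shell
#!/bin/sh
# Sketch only: a sandbox showing a plain merge of a coworker's work.
# All names (integrator, coworker, *.txt) are made up for this demo.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Integrator's repo with one commit (M0) on master.
git init -q integrator
cd integrator
git symbolic-ref HEAD refs/heads/master
git config user.name "Integrator"
git config user.email "integrator@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "M0"

# The coworker clones, then both sides move on independently.
cd "$tmp"
git clone -q integrator coworker
cd coworker
git config user.name "Coworker"
git config user.email "coworker@example.com"
echo feature > feature.txt
git add feature.txt
git commit -q -m "B1"
coworker_tip=$(git rev-parse HEAD)

cd "$tmp/integrator"
echo more > more.txt
git add more.txt
git commit -q -m "M1"

# The integration step: fetch and merge, no rebase.
git remote add coworker ../coworker
git fetch -q coworker
git merge -q --no-edit coworker/master

# The coworker's commit is reachable from master under its original hash.
git merge-base --is-ancestor "$coworker_tip" master && merged=yes
git log --graph --oneline
```

Because the coworker's commit keeps its hash, their next pull from the blessed repo has nothing to reconcile on their side.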

Also, I would caution against aiming for unnecessarily linear history. Merging in developers' branches that were developed in parallel gives you a truer representation of history. If you rebase a developer's commit before merging, you no longer have a commit that records exactly the state of the code that the developer fixed and submitted. This may not matter very often, but two commits can interact to produce a bug without producing a merge conflict. If you don't rebase, you get a more accurate (and fairer!) 'blame'.

OTHER TIPS

The vast majority of the vast amount of documentation and tutorials about git make it clear that rebase should be used only on private branches, never something that someone else can see. Under your model I would be very afraid of inexplicable failures or having to repeat work at other replicas. Avoid!

As mentioned in the "A truce in the merge vs. rebase war?" article, (emphasis mine)

Perhaps the worst problem with traditional rebasing is that it prevents collaboration.
Somebody who pulls from a repository before and after a rebasing will experience conflicts because the two histories contradict each other. Thus the standard caveat "don't rebase within a published repository", which can be reworded to "don't collaborate on work that you might later want to rebase".

Even if it "works" because there are no conflicts, it can lead to trouble if you have to resolve any non-trivial merge during your rebase:

M0----M1----M2
\
 \
  \
   B1----B2

M0----M1----M2
            \
             \
              \
               B1'---B2'

Because the SHA-1s of your (previously published) branch have been rewritten, your colleagues will have a hard time merging that branch in their environment.
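To make the B1/B2 versus B1'/B2' picture concrete, here is a small sandbox sketch that recreates the diagram and prints the tip of the branch before and after the rebase. The repository name, branch name `topic`, and file names are assumptions made for the demo.

```shell
#!/bin/sh
# Sketch only: shows that rebasing gives the same changes new commit hashes.
# Repo, branch and file names are invented for this demo.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo m0 > m.txt; git add m.txt; git commit -q -m "M0"
git branch topic                      # B1 and B2 will start from M0
echo m1 >> m.txt; git commit -q -am "M1"
echo m2 >> m.txt; git commit -q -am "M2"

git checkout -q topic
echo b1 > b.txt; git add b.txt; git commit -q -m "B1"
echo b2 >> b.txt; git commit -q -am "B2"
before=$(git rev-parse topic)

git rebase -q master                  # replay B1 and B2 on top of M2
after=$(git rev-parse topic)

echo "B2  before rebase: $before"
echo "B2' after rebase:  $after"
```

The two printed hashes differ even though the file contents introduced by B2 and B2' are the same, which is why anyone who already has B1 and B2 sees the rebased branch as unrelated new work.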

This would be an acceptable workflow for trivial cases. When your coworkers do a git pull, it's really a git fetch followed by a git merge. Git is really great at doing merges and will be able to resolve simple cases without issue.

However, if you have to do any work to resolve conflicts at the git rebase step, then your coworkers may have to do that work again when they pull. That will happen because your sequence of commits looks a lot different from theirs after the rebase.
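The trivial case can be tried out in a sandbox. The sketch below plays both roles: the integrator rebases the coworker's commit and fast-forwards master, then the coworker does the equivalent of git pull, written out as fetch plus merge. Repository names, file names, and identities are assumptions for the demo, and the rebased commit is captured in a variable rather than through HEAD@{1}.

```shell
#!/bin/sh
# Sketch only: what a coworker's pull looks like after a conflict-free rebase.
# All names (blessed, coworker, *.txt) are made up for this demo.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Blessed repo, owned by the integrator, with M0.
git init -q blessed
cd blessed
git symbolic-ref HEAD refs/heads/master
git config user.name "Integrator"
git config user.email "integrator@example.com"
echo base > base.txt; git add base.txt; git commit -q -m "M0"

# The coworker clones and adds B1; the integrator adds M1 in the meantime.
cd "$tmp"
git clone -q blessed coworker
cd coworker
git config user.name "Coworker"
git config user.email "coworker@example.com"
echo feature > feature.txt; git add feature.txt; git commit -q -m "B1"
cd "$tmp/blessed"
echo more > more.txt; git add more.txt; git commit -q -m "M1"

# Integrator: fetch, rebase the coworker's work onto master, fast-forward.
git remote add coworker ../coworker
git fetch -q coworker
git checkout -q coworker/master
git rebase -q master
rebased=$(git rev-parse HEAD)
git checkout -q master
git merge -q --ff-only "$rebased"

# Coworker: "git pull" spelled out as fetch + merge.
cd "$tmp/coworker"
git fetch -q origin
git merge -q --no-edit origin/master
copies=$(git log --format=%s | grep -c '^B1$')
echo "commits titled B1 in the coworker's history: $copies"
git log --graph --oneline
```

In this conflict-free case the merge goes through on its own, but the coworker's history ends up with both the original commit and its rebased copy, joined by a merge commit.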

If you become comfortable with a nonlinear history, Git will probably be able to manage this workflow better (since that's what it is designed to handle).

I did some simple tests on this workflow. I agree with Charles' post, but wanted to add some other info.

Pros

The workflow will not break the users pulling from your public repo.
It gives you more control over the commits being pulled into your mainline.
It is easier to follow the feature history of the mainline branch. If you have to do a merge commit (standard workflow) to pull multiple changes in, then the merge commit will group modifications of all the new commits into a single commit. This breaks the "one commit one feature" idiom.

Cons

On the repo where you are pulling changes from, the "original" commits will still show up. This will likely add confusion for the contributor, unless they know what you are doing. I guess one way around this is to have the contributor throw away their dev-branch after you pull and rebase it.
If the remote repos don't throw away their dev branches after you rebase, then it makes the master branch history difficult to follow alongside the remote branch.
After the rebase, you become the committer of each rebased commit; the original author's name is kept only in the author field. Any view that shows just the committer makes it harder to track who wrote each change.
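On the last point, a quick check in a throwaway repo suggests that git rebase keeps the author field and changes only the committer. The sketch below switches the configured identity partway through to stand in for two people; the names, emails, and file names are assumptions for the demo.

```shell
#!/bin/sh
# Sketch only: compares author and committer before and after a rebase.
# Identities and file names are invented for this demo.
set -e
# Make sure identity comes from the repo config set below, not the environment.
unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master

# Commits made while these settings are active belong to the coworker.
git config user.name "Coworker"
git config user.email "coworker@example.com"
echo m0 > m.txt; git add m.txt; git commit -q -m "M0"
git checkout -q -b topic
echo b1 > b.txt; git add b.txt; git commit -q -m "B1"
before=$(git log -1 --format='%an/%cn' topic)

# Switch identity: from here on the integrator is at the keyboard.
git config user.name "Integrator"
git config user.email "integrator@example.com"
git checkout -q master
echo m1 >> m.txt; git commit -q -am "M1"
git checkout -q topic
git rebase -q master
after=$(git log -1 --format='%an/%cn' topic)

echo "author/committer before rebase: $before"
echo "author/committer after rebase:  $after"
```

With a log format that prints the author (`%an`), the contributor's name is still visible after the rebase; only the committer (`%cn`) changes to the person who ran it.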

Licensed under: CC-BY-SA with attribution
