Question

I have two repository urls, and I want to synchronise them such that they both contain the same thing. In Mercurial, what I'm trying to do would be:

hg pull {repo1}
hg pull {repo2}
hg push -f {repo1}
hg push -f {repo2}

This will result in two heads in both repos. (I know it's not common to have two heads, but I'm doing this for synchronisation and it needs to be non-interactive. The heads will be merged manually from one of the repos and then the sync run again.)

I'd like to do the same thing in Git. That is, with no user interaction, get all of the changes into both repos, with multiple branches/heads/whatever to be merged later. I'm trying to do this using URLs in the commands, rather than adding remotes(?), as there could be a number of repos involved, and having aliases for them all will just make my script more complicated.

I'm currently cloning the repo using git clone --bare {repo1}, but I'm struggling to "update" it. I've tried git fetch {repo1}, but that doesn't seem to pull my changes down; git log still doesn't show the changeset that has been added in repo1.

I also tried using --mirror in my push and clone, but that seemed to remove changesets from repo2 that didn't exist locally, whereas I need to keep changes from both repos :/

What's the best way to do this?

Edit: To make it a little clearer what I'm trying to do...

I have two repositories (eg. BitBucket and GitHub) and want people to be able to push to either (ultimately, one will be Git, one will be Mercurial, but let's assume they're both Git for now to simplify things). I need to be able to run a script that will "sync" the two repos in a way that they both contain both sets of changes, and may require merging manually later.

Eventually, this means I can just interact with one of the repos (eg. the Mercurial one), and my script will periodically pull in Git changes which I can merge in, and then they'll be pushed back.

In Mercurial this is trivial! I just pull from both repos, and push with -f/--force to allow pushing multiple heads. Then anybody can clone one of the repos, merge the heads, and push back. I want to know how to do the closest similar thing in Git. It must be 100% non-interactive, and must keep both repos in a state that the process can be repeated infinitely (that means no rewriting history/changing changesets etc).


Solution

Git branches do not have "heads" in the Mercurial sense. There is only one thing called HEAD, and it's effectively a symlink to the commit you currently have checked out. In the case of hosted repositories like GitHub, there is no commit checked out—there's just the repository history itself. (Called a "bare" repo.)

The reason for this difference is that Git branch names are completely arbitrary; they don't have to match between copies of a repository, and you can create and destroy them on a whim.[1] Git branches are like Python variable names, which can be shuffled around and stuck to any value as you like; Mercurial branches are like C variables, which refer to fixed preallocated memory locations you then fill with data.

So when you pull in Mercurial, you have two histories for the same branch, because the branch name is a fixed meaningful thing in both repositories. The leaf of each history is a "head", and you'd normally merge them to create a single head.

But in Git, fetching a remote branch doesn't actually affect your branch at all. If you fetch the master branch from origin, it just goes into a branch called origin/master.[2] git pull origin master is just thin sugar for two steps: fetching the remote branch into origin/master, and then merging that other branch into your current branch. But they don't have to have the same name; your branch could be called development or trunk or whatever else. You can pull or merge any other branch into it, and you can push it to any other branch. Git doesn't care.
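The fetch/merge split can be seen in a throwaway repository. This is a minimal sketch rather than anything from the original answer: the directory names upstream and clone and the demo identity are made up for illustration, and it assumes a Git new enough to support git -C.

```shell
#!/bin/sh
# Sketch: `git fetch` moves origin/master but leaves the local master alone;
# the merge is the second, separate half of `git pull origin master`.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A throwaway "upstream" repository with one commit on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "first"

# Clone it; the clone's remote is called origin by default.
git clone -q upstream clone

# Upstream gains a second commit after the clone was made.
git -C upstream commit -q --allow-empty -m "second"

cd clone
before=$(git rev-parse master)
git fetch -q origin
after_fetch=$(git rev-parse master)        # local master: untouched by the fetch
tracking=$(git rev-parse origin/master)    # remote-tracking branch: moved forward

# Merging the tracking branch is what `git pull` would have done next.
git merge -q origin/master
after_merge=$(git rev-parse master)

echo "before fetch: $before"
echo "after fetch:  $after_fetch"
echo "after merge:  $after_merge"
```

The first two lines printed show the same commit, and only the merge moves master to where origin/master points.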

Which brings me back to your problem: you can't push a "second" branch head to a remote Git repository, because the concept doesn't exist. You could push to branches with mangled names (bitbucket_master?), but as far as I'm aware, you can't update a remote's remotes remotely.
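As a rough illustration of the mangled-names idea, the sketch below exchanges master between two repositories addressed purely by URL, with no named remotes. The names a_master and b_master, the scratch repository sync.git, and the local paths standing in for the two URLs are all assumptions made for the demo. It only handles a single branch, and it assumes nobody rewrites history on either side.

```shell
#!/bin/sh
# Sketch: publish each repository's master in the other one under a mangled name.
set -e
export GIT_AUTHOR_NAME=sync GIT_AUTHOR_EMAIL=sync@example.com
export GIT_COMMITTER_NAME=sync GIT_COMMITTER_EMAIL=sync@example.com

demo=$(mktemp -d)
repo_a="$demo/a"    # stands in for {repo1}
repo_b="$demo/b"    # stands in for {repo2}

# Two repositories that share a base commit and have since diverged.
git init -q "$repo_a"
git -C "$repo_a" symbolic-ref HEAD refs/heads/master
git -C "$repo_a" commit -q --allow-empty -m "shared base"
git clone -q "$repo_a" "$repo_b"
git -C "$repo_a" commit -q --allow-empty -m "only in a"
git -C "$repo_b" commit -q --allow-empty -m "only in b"

# A bare scratch repository used only for shuttling commits around.
git init -q --bare "$demo/sync.git"
cd "$demo/sync.git"

# Fetch by URL, giving the refspec explicitly so each side's master lands
# in its own local branch instead of only in FETCH_HEAD.
git fetch -q "$repo_a" '+refs/heads/master:refs/heads/a_master'
git fetch -q "$repo_b" '+refs/heads/master:refs/heads/b_master'

# Push each side's history to the other under the mangled name.
# Neither master is touched, so nothing is rewritten.
git push -q "$repo_a" b_master
git push -q "$repo_b" a_master

git -C "$repo_a" branch
git -C "$repo_b" branch
```

Afterwards each repository holds its own master plus the other side's history as an ordinary branch, which someone can merge by hand before the script runs again.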

I don't think your plan makes a lot of sense, though, since with unmerged branches exposed to both repositories, you'd either have to merge them both, or you'd merge one and then mirror it on top of the other... in which case you left the second repository in a useless state for no reason.

Is there a reason you can't just do this:

  1. Pick a repository to be canonical—I assume BitBucket. Clone it. It becomes origin.

  2. Add the other repository as a remote called, say, github.

  3. Have a simple script periodically fetch both remotes and attempt to merge the github branch(es) into the origin branches. If the merge fails, abort and send you an email or whatever. If the merge is trivial, push the result to both remotes.
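The three steps above could be sketched roughly as follows. The function name sync_branch, the demo repositories (origin.git, github.git, seed, contributor, work) and the plain echo used as the notification are all placeholders invented for this example; a real script would run only the function, inside an existing clone, and would send mail or similar on failure.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=sync GIT_AUTHOR_EMAIL=sync@example.com
export GIT_COMMITTER_NAME=sync GIT_COMMITTER_EMAIL=sync@example.com

# sync_branch BRANCH: run inside a clone of the canonical repository (origin)
# that also has a second remote named github.
sync_branch() {
    branch=$1
    git fetch -q origin
    git fetch -q github
    git checkout -q "$branch"
    # Catch up with the canonical remote first; this fails if the local
    # branch has diverged from it.
    git merge -q --ff-only "origin/$branch"
    if git merge -q --no-edit "github/$branch"; then
        git push -q origin "$branch"
        git push -q github "$branch"
    else
        # Leave the working tree clean and hand the conflict to a human.
        git merge --abort
        echo "merging github/$branch needs manual attention" >&2
        return 1
    fi
}

# --- Demo setup: two bare "hosted" repositories seeded with the same commit ---
demo=$(mktemp -d)
cd "$demo"
git init -q --bare origin.git
git init -q --bare github.git
git -C origin.git symbolic-ref HEAD refs/heads/master
git -C github.git symbolic-ref HEAD refs/heads/master

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m "initial"
git -C seed push -q "$demo/origin.git" master
git -C seed push -q "$demo/github.git" master

# Steps 1 and 2: clone the canonical repository, add the other as a remote.
git clone -q origin.git work
git -C work remote add github "$demo/github.git"

# Each hosted repository then receives a commit the other one lacks.
git -C seed commit -q --allow-empty -m "pushed to origin only"
git -C seed push -q "$demo/origin.git" master
git clone -q github.git contributor
git -C contributor commit -q --allow-empty -m "pushed to github only"
git -C contributor push -q origin master

# Step 3: the periodic job.
cd work
sync_branch master
origin_tip=$(git ls-remote origin refs/heads/master | cut -f1)
github_tip=$(git ls-remote github refs/heads/master | cut -f1)
echo "origin: $origin_tip"
echo "github: $github_tip"
```

Both pushes are plain fast-forwards because the merge commit has each remote's previous tip as a parent, so no force push is needed and the job can be repeated indefinitely.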

Of course, if you just do all your work on feature branches, this all becomes much less of a problem. :)


[1] It gets even better: you can merge together branches from different repositories that have no history whatsoever in common. I've done this to consolidate projects that were started separately; they used different directory structures, so it works fine. GitHub uses a similar trick for its Pages feature: the history of your Pages is stored in a branch called gh-pages that lives in the same repository but has absolutely no history in common with the rest of your project.

[2] This is a white lie. The branch is still called master, but it belongs to the remote called origin, and the slash is syntax for referring to it. The distinction can matter because Git has no qualms about slashes in branch names, so you could have a local branch named origin/master, and that would shadow the remote branch.

OTHER TIPS

For something similar, I use this simple code, triggered by a webhook in both repositories, to sync the master branch between GitLab and Bitbucket:

git pull origin master
git pull gitlab master
git push origin master
git push gitlab master

It probably isn't what the question asks for, but it could be helpful for somebody else who needs to sync just one branch.

Here's a tested solution for the issue: http://www.tikalk.com/devops/sync-remote-repositories/

The commands to run:

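#!/bin/bash

# Set these before running:
# REPO_NAME=<repo>.git
# ORIGIN_URL=git@<host>:<project>/$REPO_NAME
# REPO1_URL=git@<host>:<project>/$REPO_NAME

# Start from a fresh bare clone of origin on every run.
rm -rf "$REPO_NAME"
git clone --bare "$ORIGIN_URL"
cd "$REPO_NAME" || exit 1
# Register the second repository and fetch from both.
git remote add --mirror=fetch repo1 "$REPO1_URL"
git fetch origin --tags ; git fetch repo1 --tags
# Push all branches and tags back to both repositories.
git push origin --all ; git push origin --tags
git push repo1 --all ; git push repo1 --tags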

You might not have seen that the fetch did in fact work when you used git clone --mirror --bare, because by default git does not list its remote branches. You can list them with git branch -a.

I don't quite have the syntax worked out for unnamed remotes, but you could automatically add remotes based on some scheme derived from the URL. In any case, it'll probably work best if you choose a unique and consistent name for each repo, so you know which changes came from where.

However, you could try something like this:

git clone --bare --mirror --origin thing1 {repo1} repo.git
cd repo.git
# The clone only knows about thing1, so register the second repo first.
git remote add thing2 {repo2}
git fetch thing2
git push thing1 --mirror
git push thing2 --mirror

After this was done, thing1 would have all of thing2's branches available to merge at any time, as remote branches. You can list the remote branches with git branch -a.

On github or bitbucket, you will not be able to see these remote branches via the web interfaces, however you can see them if you clone with --mirror, so they do exist.

Try git reset --hard HEAD after git fetch. However, I'm not sure I understand exactly what your goal is. You will need to cd into the separate repository directories before running the fetch, reset, and push commands.

git-repo-sync
It synchronizes pairs of remote Git repositories exactly and is intended for ongoing team development from both remote sides.
It is like having two entry points to a single repository: your two remote Git repositories will behave almost like one.

I am the author of git-repo-sync, and my main idea during development was that you install it, have it run periodically, and forget that the tool exists.
And it actually does its job pretty well.

git-repo-sync has automatic conflict-resolution strategies, various disaster protections, and many other features. It's best to look at the project's README.

The only thing is, it doesn't sync Git tags, but that's intentional.

Sorry, I can't help with the Mercurial side of this SO question. But my tool could be helpful for those who want to solve this problem for Git remotes only.

Licensed under: CC-BY-SA with attribution