Question

We currently use SourceAnywhere Hosted as our version control server. I'm looking to migrate over to GitHub, and would really like to preserve our 8+ year history.

Has anyone else successfully completed this migration and care to share their tools/process?

Now, assuming that this hasn't been done before, I suppose I'm looking at writing a git fast-import script using the SourceAnywhere SDK or command line client. Being new to git, are there any existing scripts or resources you could direct me towards as a starting point?


Solution

I've finally cleaned up my project and added it to GitHub. You can find it here: SAWHtoGit.

It does a pretty good job of exporting the history into logical changesets, with a few small limitations:

  • Any file that has been deleted in SourceAnywhere will not be imported into the history, at all, due to limitations in the SourceAnywhere API.
  • Any files that were 'moved' can be imported, as long as you provide a mapping for the old/new directories.

Other than that, it worked well for our purposes and we were able to successfully migrate our code and history to GitHub. I hope it will be useful to others, as well!
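How the tool builds its "logical changesets" is not described in this answer. As an illustration only, a common heuristic for file-level history is to merge consecutive file revisions that share an author and check-in comment and fall within a short time window. The `FileRevision` type, the `group_into_changesets` function and the 5-minute window below are all hypothetical names and values, not part of SAWHtoGit or the SourceAnywhere SDK.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FileRevision:
    """One file-level check-in (hypothetical shape, for illustration)."""
    path: str
    author: str
    comment: str
    timestamp: int  # seconds since the Unix epoch


def group_into_changesets(
    revisions: Iterable[FileRevision], window_seconds: int = 300
) -> List[List[FileRevision]]:
    """Merge consecutive revisions with the same author and comment.

    A revision joins the most recent changeset only if it is close in time
    to that changeset's last revision and its path is not already present
    (two revisions of one file cannot belong to the same commit).
    """
    changesets: List[List[FileRevision]] = []
    for rev in sorted(revisions, key=lambda r: r.timestamp):
        current = changesets[-1] if changesets else None
        if (
            current is not None
            and current[-1].author == rev.author
            and current[-1].comment == rev.comment
            and rev.timestamp - current[-1].timestamp <= window_seconds
            and all(r.path != rev.path for r in current)
        ):
            current.append(rev)
        else:
            changesets.append([rev])
    return changesets
```

This sketch only compares against the most recent changeset, so check-ins interleaved between two authors would end up split into more commits than strictly necessary.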

OTHER TIPS

The import part is easy:
Once you have extracted a coherent set of files from your initial repo, you can add it to the Git repo, which will detect any modification, addition, or removal.

"Coherent" means a set of files that represents a stable state, for instance one that compiles. Those points in time are usually marked by a label, especially in a repository that works at the file level like SAW (as opposed to Git, which works at the repository level, with each revision representing the content of the full repo).

Adding a set of files to Git is as simple as:

git --work-tree=/path/to/extracted/file --git-dir=/path/to/git/repo/.git add -A
git --work-tree=/path/to/extracted/file --git-dir=/path/to/git/repo/.git commit -m "new revision from SAW import"

The difficulty is determining what to import.
I would recommend listing all labels and using them to get each project, as in GetProject -label (using the SAW CLI).
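The label-by-label import described above could be scripted roughly as follows. This is a minimal sketch, not a tested migration tool: `import_labels` and `git_argv` are hypothetical names, and the `extract` callback stands in for whatever SAW CLI or SDK call you use to fetch a label, since its exact invocation is not shown here. Only the git commands are taken from the answer; `--allow-empty` is added so that two labels with identical content do not abort the run.

```python
import subprocess
from typing import Callable, Iterable, List


def git_argv(work_tree: str, git_dir: str, *args: str) -> List[str]:
    """Build a git command line that targets an external work tree."""
    return ["git", f"--work-tree={work_tree}", f"--git-dir={git_dir}", *args]


def _run(argv: List[str]) -> None:
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(argv, check=True)


def import_labels(
    labels: Iterable[str],
    extract: Callable[[str, str], None],
    work_tree: str,
    git_dir: str,
    run: Callable[[List[str]], object] = _run,
) -> None:
    """Commit one snapshot per label, in the order the labels are given.

    `extract(label, work_tree)` is supplied by the caller (hypothetical
    hook). It must leave exactly the files of that label in `work_tree`,
    removing leftovers from the previous label, so that `git add -A`
    can record deletions.
    """
    for label in labels:
        extract(label, work_tree)
        run(git_argv(work_tree, git_dir, "add", "-A"))
        run(git_argv(work_tree, git_dir, "commit", "--allow-empty",
                     "-m", f"SAW import: label {label}"))
```

Labels should be passed oldest first, so the resulting Git history runs in chronological order.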

Note that each project should be in its own Git repo: that avoids one large, bloated repo that is hard to clone, unlike the centralized model of SAW, where you can put all your projects in one repository.


The OP Dan comments:

I was able to use the SourceAnywhere COM SDK to write a small utility to extract my history (as best as the SDK would allow) and write out a fast-import script to load it all into git.
While not every intermediary changeset is necessarily "coherent", the end result matches our current state, and we preserved the bulk of our history.
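For readers unfamiliar with the format, here is a rough sketch of what writing such a fast-import script can look like. It is not Dan's utility: the `Changeset` type and `commit_block` function are hypothetical, the branch name `refs/heads/master` and the UTC offset `+0000` are assumptions, and paths are assumed to be plain forward-slash paths that need no quoting. The SourceAnywhere side (reading the history) is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Changeset:
    """One commit's worth of changes (hypothetical shape)."""
    author: str
    email: str
    timestamp: int  # seconds since the Unix epoch, treated as UTC
    message: str
    files: Dict[str, bytes] = field(default_factory=dict)  # path -> content
    deleted: List[str] = field(default_factory=list)       # paths to remove


def _data(payload: bytes) -> bytes:
    """Emit a fast-import 'data' command; the count is in bytes."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def commit_block(cs: Changeset, mark: int,
                 ref: str = "refs/heads/master") -> bytes:
    """Serialize one changeset as a fast-import 'commit' command.

    No 'from' line is written: fast-import then uses the current tip of
    `ref` as the parent, or creates a root commit if the branch is new.
    """
    out = [
        f"commit {ref}\n".encode("utf-8"),
        f"mark :{mark}\n".encode("utf-8"),
        f"committer {cs.author} <{cs.email}> {cs.timestamp} +0000\n"
        .encode("utf-8"),
        _data(cs.message.encode("utf-8")),
    ]
    for path in cs.deleted:
        out.append(f"D {path}\n".encode("utf-8"))
    for path, content in sorted(cs.files.items()):
        # 100644 is a regular, non-executable file; content follows inline.
        out.append(f"M 100644 inline {path}\n".encode("utf-8"))
        out.append(_data(content))
    return b"".join(out)
```

Writing the blocks for all changesets, oldest first, to a file opened in binary mode gives a stream that can be loaded by running `git fast-import < history.fi` inside the target repository (`history.fi` being whatever name you gave the file).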

You can look at how other git-over-XYZ implementations work. For example, here's the code for git-svn, and here's the code for git-cvsimport (both in Perl).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow