Question

I have an SVN repository, which has many directories inside. Now I would like to clone this repository, leaving only one particular directory in it. And of course I don't need any revisions related to other directories in this new repository. How can I do it? Thanks.

Solution

You will need to create a dump of your repository and filter it down to the directories and revisions you want. svndumpfilter is the all-purpose tool for this. See this chapter of the Subversion book for an example.
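
As a rough sketch of that dump, filter and load cycle: the helper name `split_repo` and all three arguments are placeholders, not something from the question. It assumes you have filesystem access to the repository, that the directory to keep sits at the top level of the repository, and that it was never renamed or copied from elsewhere. For a nested path, the parent directories would have to be created in the new repository before loading.

```shell
#!/bin/sh
# Sketch: copy the history of one top-level directory into a fresh repository.
# split_repo is a hypothetical helper; every argument is a placeholder.
split_repo() {
    old_repo=$1   # existing repository on disk
    new_repo=$2   # repository to create
    keep_dir=$3   # top-level directory to keep, e.g. mydir

    svnadmin create "$new_repo" || return 1

    # Dump every revision, keep only nodes under $keep_dir, drop revisions
    # that become empty, renumber the rest, and load the result.
    svnadmin dump --quiet "$old_repo" \
        | svndumpfilter include --drop-empty-revs --renumber-revs "$keep_dir" \
        | svnadmin load --quiet "$new_repo"
}

# Example call (placeholder paths):
# split_repo /path/to/repo /path/to/newrepo mydir
```

Dropping and renumbering the empty revisions is optional; leave those two flags out if you would rather keep the original revision numbers.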

OTHER TIPS

Another possibility is git svn clone, which pulls only the directory of interest. It can then be pushed to a new svn repo. Is it easy? No, but if you don't have access to the server, it's handy. I'm sure there's a better way - but this way makes sure you don't accidentally push to the original svn repo.

    # original svn path
    git svn clone http://server/path/to/clone orig
    # SVN to put just your branch in
    svnadmin create new
    # add one entry, or git svn errors out...
    svn co file:///path/to/new new.wd
    mkdir new.wd/null
    svn add new.wd/null
    svn ci -m"add null" new.wd/null
    # clone our clone
    git clone file:///path/to/orig abc
    cd abc
    # set up svn path to the svn repo
    git svn init file:///path/to/new
    # pull data in (our one commit)
    git svn fetch
    # show the branches
    git branch -a
    # prep the git repo for the push
    git rebase --onto remotes/git-svn --root master
    # finally push to the new svn repo.
    git svn dcommit

Nowhere could I find out how to do this, so I came up with this. You may need to adjust the svn conf files like so:

    svn_repo="$(pwd)/new"
    user="myusername"
    echo '[/]' >> "$svn_repo/conf/authz"
    echo "$user = rw" >> "$svn_repo/conf/authz"
    echo '[users]' >> "$svn_repo/conf/passwd"
    echo "$user = test" >> "$svn_repo/conf/passwd"
    echo 'password-db = passwd' >> "$svn_repo/conf/svnserve.conf"
    echo "svn repo is file://$svn_repo"
    svn co "file://$svn_repo" svn.wd

In the above example, myusername would have a password of test.

The traditional way to do what you are trying to do is to dump your repository out to a dump file, use svndumpfilter to include or exclude files according to your needs and then to load the filtered dump file into a new repository. This approach works well for simple changes such as removing a couple of files from a repository.

Things get a bit more difficult when the repository includes file moves and copies.

Let's take the simplest example: a project folder called MyProject that was renamed to TheProject at some point in its history. A file, copiedfile.txt, was subsequently copied into TheProject from AnotherLocation. A rudimentary tree structure could look like this:

    ...
    + AnotherLocation
    |--- copiedfile.txt
    |--- unwantedfile.txt
    + TheProject
    |--- copiedfile.txt
    |--- otherfile.txt
    ...

You would like TheProject to have its own new repository. So you dump your repository to a file and use svndumpfilter to only include TheProject since this is the name of the project you see in the HEAD revision.

    svndumpfilter include /TheProject < input.dump > output.dump

You unfortunately get this error from svndumpfilter:

    svndumpfilter: E200003: Invalid copy source path '/MyProject'

That's because TheProject used to be called MyProject and was renamed at some revision in the past. Since a rename is essentially a delete plus a copy, svndumpfilter cannot find the source of the copy that creates TheProject and rightly reports an error. So we try again with the following command, which includes MyProject as well:

    svndumpfilter include /TheProject /MyProject < input.dump > output.dump

svndumpfilter now comes up with another error:

    svndumpfilter: E200003: Invalid copy source path '/AnotherLocation/copiedfile.txt'

Yes, this is because copiedfile.txt was copied from AnotherLocation to TheProject. We have to include this file as well, since otherwise it could not possibly be copied to TheProject. Let's try again.

    svndumpfilter include /TheProject /MyProject /AnotherLocation/copiedfile.txt < input.dump > output.dump

The operation succeeds! Third time lucky it seems!

Let's try loading our filtered dump file into a new repository.

    svnadmin create newrepo
    svnadmin load newrepo < output.dump

Not so lucky after all! The following error comes up during loading:

    * editing path : AnotherLocation/copiedfile.txt ...svnadmin: E160013: File not found: transaction '1-1', path '/AnotherLocation/copiedfile.txt'

Ah! This is because we forgot to include AnotherLocation, which is required as it is the parent folder of copiedfile.txt.

    svndumpfilter include /TheProject /MyProject /AnotherLocation < input.dump > output.dump

OK, this command works and the load works too. Unfortunately, we have now included /AnotherLocation/unwantedfile.txt as well. The conclusion is that svndumpfilter include does not give us the granularity we are after. The alternative is to do everything with svndumpfilter exclude, removing everything we don't need until only the files we want remain. Suffice it to say that this approach is riddled with its own problems; for example, it is quite easy to exclude files that are actually required in the repository. If people want an example of this I can extend this answer.
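
Rather than discovering copy sources one error at a time, as in the walkthrough above, they can be listed up front by scanning the dump file for its `Node-copyfrom-path` headers. This is only a sketch: `list_copy_sources` is a hypothetical helper name, and a plain text search like this can report false positives if a versioned file happens to contain a line that looks like such a header.

```shell
#!/bin/sh
# Sketch: list every path that some node in a dump file was copied from.
# list_copy_sources is a hypothetical helper name.
list_copy_sources() {
    # -a makes grep treat the dump as text even if it holds binary file data.
    grep -a '^Node-copyfrom-path: ' "$1" \
        | sed 's/^Node-copyfrom-path: //' \
        | sort -u
}

# Example call (input.dump as in the walkthrough above):
# list_copy_sources input.dump
```

The output shows which paths outside the wanted folder take part in copies, but it does not remove the granularity problem: their parent folders still have to exist when the filtered dump is loaded.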

There must be a better way. It turns out there is, but it's a commercial offering. We developed a tool called Subdivision that specializes in extracting files and folders from a Subversion repository. It can also delete (or obliterate) files from a Subversion repository, and split a repository in two while guaranteeing that no files are missed out from either of the two repositories. What makes Subdivision shine is that it holds an in-memory view of the whole repository and runs the algorithms required to solve all the problems we experienced in the example above. This gives you the granularity needed to extract only the right files, and saves time since the operation completes in one pass.

Licensed under: CC-BY-SA with attribution