Question

I currently have a big git repository that contains many projects, each one in its own subdirectory. I need to split it into individual repositories, each project in its own repo.

I tried git filter-branch --prune-empty --subdirectory-filter PROJECT master

However, many project directories went through several renames in their lives, and git filter-branch does not follow renames, so effectively the extracted repo does not have any history prior to the last rename.

How can I effectively extract a subdirectory from one big git repo, and follow all that directory's renames back into the past?


Solution

Thanks to @Chronial, I was able to cook up a script that massages my git repo according to my needs:

git filter-branch --prune-empty --index-filter '
    # Delete files which are NOT needed. The trailing / keeps similarly
    # named directories such as NAME10 from matching.
    # (grep -z and xargs -r are GNU extensions.)
    git ls-files -z | grep -Ezv "^(NAME1|NAME2|NAME3)/" |
        xargs -0 -r git rm --cached -q
    # Move files to root directory by rewriting the paths in a new index.
    # (\t and \| in sed are GNU extensions.)
    git ls-files -s | sed -e "s-\t\(NAME1\|NAME2\|NAME3\)/-\t-" |
        GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
        git update-index --index-info &&
        ( test ! -f "$GIT_INDEX_FILE.new" \
            || mv -f "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE" )
'

Basically what this does is this:

  1. Deletes all files outside of the three directories NAME1, NAME2 or NAME3 that I need (one project was renamed NAME1 -> NAME2 -> NAME3 during its lifetime).

  2. Moves everything inside these three directories to the root of the repository.

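  3. Tests whether "$GIT_INDEX_FILE.new" exists before moving it into place. I needed this because importing svn into git creates commits without any files (directory-only commits), and for those commits there is no new index file to move. This is needed only if the repo was initially created with 'git svn clone'.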
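
To make step 2 more concrete, here is a minimal sketch of the path rewrite on its own, applied to text in the format that git ls-files -s prints (mode, object, stage, a tab, then the path). The function name strip_project_prefix is made up for this illustration, and it uses sed -E with a literal tab instead of the GNU-only \t and \| from the script above; treat it as a way to try the substitution on sample lines, not as a replacement for the filter.

```shell
# Hypothetical helper: reads `git ls-files -s` style lines on stdin and
# removes a leading NAME1/, NAME2/ or NAME3/ from the path after the tab.
strip_project_prefix() {
    tab=$(printf '\t')   # literal tab, so the pattern does not rely on \t
    sed -E "s-${tab}(NAME1|NAME2|NAME3)/-${tab}-"
}

# Example: feed it one sample index entry (the object name is a placeholder).
# printf '100644 0000000 0\tNAME2/src/main.c\n' | strip_project_prefix
```

Because the pattern is tied to the tab, only the first path component is stripped; a path such as docs/NAME1/readme would be left alone.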

Other tips

I had a very large repository from which I needed to extract a single folder; even --index-filter was predicted to take 8 hours to finish. Here's what I did instead:

  1. Obtain a list of all the past names of the folder. In my case there were only two, old-name and new-name.
  2. For each name:

    $ git checkout master
    $ git checkout -b filter-old-name
    $ git filter-branch --subdirectory-filter old-name
    

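    This will give you several disconnected branches, each containing history for one of the names. (Any later filter-branch run in the same repository refuses to start while a backup from an earlier run exists under refs/original/; pass -f to overwrite that backup.)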

  3. The filter-old-name branch should end with the commit that renamed the folder, and the filter-new-name branch should begin with that same commit. (The same applies if there was more than one rename: you'll wind up with one branch per name, each sharing a commit with the next one along.) On the old-name branch this commit deletes everything; on the new-name branch it recreates everything. Make sure the contents deleted and the contents recreated are identical; if they aren't, files were modified in addition to being renamed, and you will need to merge the changes. (In my case I didn't have this problem so I don't know how to solve it.)

    An easy way to check this is to try rebasing filter-new-name on top of filter-old-name and then squashing the two commits together: git should complain that this produces an empty commit. (Note that you will want to do this on a spare branch and then delete it: rebasing overwrites the committer information on the commits, thus losing some of the history you want to keep.)

  4. The next step is to graft the two branches together, skipping the two commits which renamed the folder. (Otherwise there will be a weird jump where everything is deleted and recreated.) This involves finding the full SHA-1 (all 40 characters!) of two commits, the second commit of the new-name branch and the penultimate commit of the old-name branch, and appending them to .git/info/grafts on one line, with the new-name commit first and the old-name commit second.

    $ echo $NEW_NAME_SECOND_COMMIT_SHA1 $OLD_NAME_PENULTIMATE_COMMIT_SHA1 >> .git/info/grafts
    

    If you've done this right, git log --graph should now show a single line of history in which the oldest remaining commit of the new-name branch is followed directly by the last remaining commit of the old-name branch.

  5. This graft is currently temporary: it is not yet part of the history, and won't follow along with clones or pushes. To make it permanent:

    $ git filter-branch
    

    This will refilter the branch without trying to make any further changes, making the graft permanent (changing all of the commits in the filter-new-name branch). You should now be able to delete the .git/info/grafts file.
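
Since step 4 depends on getting both full hashes onto one line in the right order, here is a small sketch that builds the grafts line and rejects abbreviated hashes. The names is_full_sha1 and graft_line are hypothetical helpers, not part of git. Note also that Git 2.18 and later mark .git/info/grafts as deprecated in favour of replace refs, so on a recent Git something like git replace --graft <commit> <parent> may be the better way to set up the same temporary link; filter-branch honours both.

```shell
# Hypothetical helper: succeeds only for a full 40-digit lowercase hex hash.
is_full_sha1() {
    case $1 in
        *[!0123456789abcdef]*) return 1 ;;   # contains a non-hex character
    esac
    [ "${#1}" -eq 40 ]
}

# Hypothetical helper: graft_line CHILD PARENT
# Prints "CHILD PARENT", the format of one line in .git/info/grafts.
graft_line() {
    if ! is_full_sha1 "$1" || ! is_full_sha1 "$2"; then
        echo "graft_line: need two full 40-character SHA-1 hashes" >&2
        return 1
    fi
    printf '%s %s\n' "$1" "$2"
}

# Usage, with the variables from step 4:
# graft_line "$NEW_NAME_SECOND_COMMIT_SHA1" "$OLD_NAME_PENULTIMATE_COMMIT_SHA1" \
#     >> .git/info/grafts
```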

At the end of all of this, you should now have on the filter-new-name branch all of the history from both names for the folder. You can then use this separate repository, or merge it into another one, or whatever you'd like to do with this history.
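
Step 1 of this recipe assumes you already know every past name of the folder. If you don't, one possible way to collect them is to scan the rename records in the log. The sketch below assumes the renames were clean enough for git's rename detection to notice them; dir_renames is a made-up helper that reads git log --name-status output and prints each distinct change of the top-level directory.

```shell
# Hypothetical helper: reads `git log --name-status` lines on stdin
# (status, tab, old path, tab, new path for renames) and prints every
# distinct "old-dir -> new-dir" pair where the first path component changed.
dir_renames() {
    awk -F '\t' '
        $1 ~ /^R/ && index($2, "/") && index($3, "/") {
            split($2, src, "/")
            split($3, dst, "/")
            key = src[1] SUBSEP dst[1]
            if (src[1] != dst[1] && !(key in seen)) {
                seen[key] = 1
                print src[1] " -> " dst[1]
            }
        }
    '
}

# Usage, run inside the big repository:
# git log -M --diff-filter=R --name-status --format= | dir_renames
```

This only looks at the first path component, so it fits the layout described in the question, where each project sits in its own top-level directory.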

I don’t think git has a built-in feature for that. You will have to build your own filter. Just use git filter-branch --prune-empty --tree-filter YOURSCRIPT. Your script will then have to identify the correct folder (maybe by the name of a specific file in it or maybe you have a list of all the names this project had in the past), remove everything else and move the folder contents up a level.
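
As a rough sketch of what such a script could look like, assuming you have a list of all past names: the function keep_project below is hypothetical, runs in the current directory (which is the checked-out tree while --tree-filter is running), keeps the first listed directory that exists, deletes every other top-level entry, and moves the directory's contents up one level.

```shell
# Hypothetical tree filter: keep_project NAME...
keep_project() {
    found=
    for name in "$@"; do
        if [ -d "$name" ]; then found=$name; break; fi
    done
    # Remove every top-level entry except the project directory (and .git,
    # should the function ever be run in a normal working tree).
    for entry in * .[!.]* ..?*; do
        if [ ! -e "$entry" ] && [ ! -L "$entry" ]; then continue; fi
        if [ "$entry" = ".git" ] || [ "$entry" = "$found" ]; then continue; fi
        rm -rf -- "$entry"
    done
    # No project directory in this commit: leave the tree empty.
    if [ -z "$found" ]; then return 0; fi
    # Rename first so an inner entry with the same name cannot collide.
    tmp=".keep_project.$$"
    mv -- "$found" "$tmp"
    for entry in "$tmp"/* "$tmp"/.[!.]* "$tmp"/..?*; do
        if [ ! -e "$entry" ] && [ ! -L "$entry" ]; then continue; fi
        mv -- "$entry" .
    done
    rmdir -- "$tmp"
}
```

If the function were saved in a file, say /path/to/keep_project.sh (a placeholder path), it could be used as git filter-branch --prune-empty --tree-filter '. /path/to/keep_project.sh; keep_project NAME1 NAME2 NAME3'. Commits in which none of the names exist end up empty and are dropped by --prune-empty.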

If your repo is really big and you don’t have all night to run this script, you can achieve the same effect a lot faster with --index-filter, but writing that script will be more complicated. You will have to use the git commands for modifying the index instead of file system modification commands.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow