Question

I followed "efficiently rewriting (rebase -i) a lot of history with git" to replace file text (Copyright <author> with Copyright <company>) across the entire commit history of a Git repository. It works, but the old commit history is still visible.

In the screenshot below, the 'new' history in the blue branch comes after the unwanted 'old' history in the purple branch. How do I get rid of the 'old' history? (Why it doesn't 'merge' into one continuous history beats me!)

[Screenshot: commit graph after the git filter-branch command]


Solution

So that GUI is SourceTree, by Atlassian. And it was correct. The issue was that I had other branches and tags referencing the "old history". So naturally, it sticks around!

Think in terms of Back to the Future :)

So, "you can't rewrite history" is partly correct; it just depends on your perspective.

Doc Brown: Obviously the time continuum has been disrupted, creating a new temporal event sequence resulting in this alternate reality.

Marty: English, Doc!

Doc Brown: Here. Here, let me demonstrate. Let's say that this line represents time. [draws straight line and points to places] Here's the present 1985, the future and the past. Obviously, somewhere in the past the timeline skewed down into this tangent (branch) [draws new line and writes 1985A] creating an alternate 1985 (master). Alternate to you, me, and Einstein, but reality for everyone else. Recognize this? [shows Blast from the Past bag (tag)] It’s the bag the sports book came in; I know because the receipt (tag) was still inside. I found them in the time machine...along with this! (more tags)

Yay! Git is BTTF!

Moral: Delete the references and it will cease to exist!
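As a rough sketch of that moral, the script below builds a throwaway repository, fakes a rewrite with git commit --amend (a stand-in for filter-branch, not what the question actually ran), lists the refs that still reach the old commit, and deletes them. The names old-branch and v1.0 are hypothetical, and git for-each-ref --contains assumes a reasonably recent Git.

```shell
#!/bin/sh
# Sketch only: everything happens in a temporary repo, nothing real is touched.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "Copyright Author" > file.txt
git add file.txt
git commit -q -m "initial"
old_commit=$(git rev-parse HEAD)

# Hypothetical refs that keep pointing at the pre-rewrite commit
git branch old-branch
git tag v1.0

# Stand-in for the rewrite: amending creates a new commit with a new id
echo "Copyright Company" > file.txt
git commit -q -a --amend -m "initial"
new_commit=$(git rev-parse HEAD)

# Show every ref that still reaches the old commit
echo "Refs keeping the old history alive:"
git for-each-ref --contains "$old_commit" --format='%(refname)'

# Delete them (-D because the old branch is not merged into the new history)
git branch -q -D old-branch
git tag -d v1.0 >/dev/null

refs_left=$(git for-each-ref --contains "$old_commit" --format='%(refname)')
echo "Refs left on the old history: ${refs_left:-none}"

# Optional: drop reflog entries and prune the now-unreachable objects
git reflog expire --expire=now --all
git gc -q --prune=now
```

In a real repository that went through git filter-branch, the backup refs under refs/original/ also point at the old commits, so they are worth checking with git for-each-ref refs/original/ before expecting the old history to vanish from the graph.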

OTHER TIPS

I'd recommend using the BFG Repo-Cleaner, a simpler, faster alternative to git-filter-branch specifically designed for rewriting files in Git history. It makes your life easier here because it handles all references by default (all tags, branches, etc.), and it's also 10-50x faster.

You should carefully follow these steps here: http://rtyley.github.com/bfg-repo-cleaner/#usage - but the core bit is just this: download the BFG's jar (requires Java 6 or above) and run this command:

$ java -jar bfg.jar  --replace-text replacements.txt  my-repo.git

The replacements.txt file should contain all the substitutions you want to do, in a format like this (one entry per line - note the comments shouldn't be included):

pa$$word   # Replace literal string 'pa$$word' - with '***REMOVED***' by default
Copyright Volte==>Copyright MegaCorp     # Replace, specifying replacement text
regex:Copyright \w+==>Copyright MegaCorp                # Replace, using a regex
regex:Copyright (\d{4}) \w+==>Copyright $1 MegaCorp     # Replace with reference

Your entire repository history will be scanned, and any non-binary files (under 1MB in size) will have the substitutions performed: any matching string (that isn't in your latest commit) will be replaced.
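A small sketch of preparing that file from a shell, assuming a scratch directory: the quoted heredoc delimiter matters because an unquoted one would let the shell expand $$ and $1 before the text reaches the file. The sed line is only a stand-in preview of the last rule's capture-group behaviour on one sample string; it is not BFG and does not touch any repository.

```shell
#!/bin/sh
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Quoted 'EOF' stops the shell from expanding $$ (its PID) and $1 in the rules.
# The explanatory comments from the listing above are left out on purpose.
cat > replacements.txt <<'EOF'
pa$$word
Copyright Volte==>Copyright MegaCorp
regex:Copyright \w+==>Copyright MegaCorp
regex:Copyright (\d{4}) \w+==>Copyright $1 MegaCorp
EOF

# Preview of the last rule with sed as a stand-in (POSIX ERE, so \d and \w
# are spelled out as character classes and $1 becomes \1)
preview=$(printf '%s\n' 'Copyright 2012 Volte' |
  sed -E 's/Copyright ([0-9]{4}) [A-Za-z0-9_]+/Copyright \1 MegaCorp/')
echo "$preview"
```

The four entries simply mirror the format examples above; a real replacements.txt would normally contain only the rules you actually need.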

You can't rewrite a git history. This isn't an authority thing, and nobody's refusing to implement it; it simply cannot be done. A commit id is a unique name for that specific commit. Any different commit, due to any change anywhere in its contents, its history, or its description, has a different id. Your 'master' tag (a branch is just a tag that git uses as an implicit parent and updates when you git checkout it and commit something new) now refers to the new history you made with rebase (edit: err, more likely with filter-branch here). Your 'svn/trunk' tag still refers to the original history. If you want to forget svn/trunk in your repo, run git branch -D svn/trunk (plain -d refuses to delete a branch that isn't merged into the current one; if svn/trunk is a remote-tracking ref, use git branch -r -d svn/trunk instead).
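To illustrate the last step, here is a minimal sketch in a throwaway repository. It assumes svn/trunk is a local branch (in a git-svn clone it may well be a remote-tracking ref instead), and it uses git commit --amend as a stand-in for the real rewrite. It shows the safe delete being refused for a branch left on the old history, and the forced delete going through.

```shell
#!/bin/sh
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "Copyright Author" > file.txt
git add file.txt
git commit -q -m "original history"
git branch svn/trunk    # hypothetical local branch left on the old commit

# Stand-in for the rewrite: the current branch now points at a new commit id
echo "Copyright Company" > file.txt
git commit -q -a --amend -m "rewritten history"

# -d is the safe delete: it checks that the branch is merged into HEAD
if git branch -d svn/trunk 2>/dev/null; then
  safe_delete="accepted"
else
  safe_delete="refused"
fi
echo "git branch -d was $safe_delete"

# -D skips the merge check and removes the reference
git branch -q -D svn/trunk
remaining=$(git branch --list svn/trunk)
echo "svn/trunk still listed: ${remaining:-no}"
```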

Licensed under: CC-BY-SA with attribution