Question

For some years now, I've been waiting for Subversion to feature a "delete permanently" (obliterate) function. I hesitate to make the transition to Subversion (coming from Visual SourceSafe :p), because I think this is an essential feature; without it I'd expect the repository to grow unstoppably. However, for one reason or another, the feature gets postponed over and over again. So I'm beginning to wonder whether there is some other feature or workaround that makes the obliterate function dispensable.

What do you do when you want to shrink the SVN central repository?

Example 1: I check in a large third party library, and after a few weeks I realize it is not suited for my needs. I don't want to store and back up that large amount of data forever.

Example 2: I have 10 versions of 10 big third party libraries in the repository, but I only use the latest versions.

Example 3: I accidentally checked in sensitive information (as suggested by John).

Example 4: I accidentally checked in some big files that were never meant to be put in the repository.

Solution

There is a fair amount of discussion of svn obliterate on the problem ticket at the Apache Subversion site, most of it ending about 2008. There seems to be general agreement that it's a good capability to have, although its use should be rare.

There are two main reasons to want it.

First, checking in confidential information can be a problem. Leaving it in there, deleted, is not necessarily an option, depending on the level of confidentiality and exposure of the repository.

Second, checking in a large amount of stuff that shouldn't be checked in can drastically increase the size of the repository. Disk space is generally cheap nowadays, but it isn't unlimited, and there are other ways file space can matter. If it's necessary to send a repository over a net connection, that's extra time which may or may not be important. There can be real advantages to being able to burn a CD-ROM or DVD-ROM that contains the whole repository.

So it's a useful capability, and the only way to get it today is to dump, filter, and reload the repository. According to reports I've seen, this is error-prone, can be slow, and requires shutting down the repository.
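For reference, the dump, filter, reload workaround can be sketched as a small Python wrapper around the stock svnadmin and svndumpfilter tools. This is only a sketch under assumptions: the function names and the repository and path arguments are hypothetical, the old repository is assumed to be offline while it runs, and the result should be verified before it replaces anything. One limitation I'm fairly sure of is that svndumpfilter can fail when a path you keep was copied from a path you exclude.

```python
import subprocess


def obliterate_commands(old_repo, new_repo, excluded_paths):
    """Build the argv lists for: create new repo, dump old, filter, load."""
    if not excluded_paths:
        raise ValueError("nothing to exclude")
    create = ["svnadmin", "create", new_repo]
    dump = ["svnadmin", "dump", "--quiet", old_repo]
    # Exclude the unwanted path prefixes, drop revisions that the filter
    # leaves empty, and renumber the revisions that remain.
    filt = ["svndumpfilter", "exclude", "--drop-empty-revs",
            "--renumber-revs", *excluded_paths]
    load = ["svnadmin", "load", "--quiet", new_repo]
    return create, dump, filt, load


def run_pipeline(old_repo, new_repo, excluded_paths):
    """Run dump | filter | load into a freshly created repository."""
    create, dump, filt, load = obliterate_commands(
        old_repo, new_repo, excluded_paths)
    subprocess.run(create, check=True)
    p_dump = subprocess.Popen(dump, stdout=subprocess.PIPE)
    p_filt = subprocess.Popen(filt, stdin=p_dump.stdout,
                              stdout=subprocess.PIPE)
    p_dump.stdout.close()  # so the dump notices if the filter exits early
    p_load = subprocess.Popen(load, stdin=p_filt.stdout)
    p_filt.stdout.close()
    codes = [p.wait() for p in (p_load, p_filt, p_dump)]
    if any(codes):
        raise RuntimeError(
            f"pipeline failed, exit codes (load, filter, dump): {codes}")
```

A call such as run_pipeline("/srv/svn/old", "/srv/svn/new", ["trunk/vendor/biglib"]) (all three values made up) would leave the old repository untouched and write the filtered history into the new one. Existing working copies would most likely need a fresh checkout afterwards.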

Evidently, it's not a high-priority feature for the Subversion team: what it has needed for quite a few years is somebody to do the work of coming up with a design and implementing it. After all, it should be needed very rarely, and there is a workaround. However, anybody who wants to do a whole lot of work on Subversion could provide a patch that would (if of good enough quality) probably be accepted.

OTHER TIPS

It violates the meaning of source control.
Source control is all about being able to restore a previous state. If you delete a file permanently you won't be able to.

OTOH, I do not know VSS, so I might have misunderstood "delete permanently".

The obvious reason against it is that the developers think it will, on balance, make SVN worse: the happiness you feel at being able to prune unneeded stuff will be dwarfed by your anger when you accidentally obliterate something and your /trunk goes missing.

FogBugz has exactly the same behavior, and in their case it's entirely by design I believe, protecting users from themselves.

Obliterate violates the version control principles that you'd want to have. Either you wouldn't save any space, or previous tags would become broken. You would not be able to go back to a true previous version if you had obliterated any files.

As for your comment about the repository growing... Any repository will grow linearly with the size of changes over time. That's the whole point of a source control system. If you don't need to be able to track prior versions, then why not just stick to a shared folder somewhere?

To borrow from the article "Subversion Obliterate, the forgotten feature", there are three components to the question: the problem, the reason, and the solution. Since you started by asking for the solution, I'll start with that.

Solution

As you noticed, there is no great solution, especially if you are dealing with a big corporate repository, since the solution becomes harder the bigger the repo gets. There's a dump / filter mechanism with which you can clean stuff you don't want out of your repo, but it is not easy to use, not fast, and not reliable.

There has been a small effort (follow the thread) on the svn team to get an obliterate feature in there after 2008, but the effort died a silent death.

The problem

The article I mentioned at the start actually has a good list of use cases where one would need an obliterate command and in the 516 issue thread the developers actually acknowledged its merit.

Alas, it seems too late for that now; the real reason it was never added later is that it is now nigh impossible to implement, as it hooks into the code at the most fundamental level (also see the small effort link under Solution).

From the FAQ entry:

Revisions are immutable trees which build upon one another. Removing a revision from history would cause a domino effect, creating chaos in all subsequent revisions and possibly invalidating all working copies.

The reason

The problem is that the obliterate feature was originally dismissed because it did not conform to the principle of true version control.

Again from the FAQ entry:

How do I completely remove a file from the repository’s history? There are special cases where you might want to destroy all evidence of a file or commit. (Perhaps somebody accidentally committed a confidential document.) This isn’t so easy, because Subversion is deliberately designed to never lose information.

However

I've worked with SVN for a lot of clients now, with larger teams and larger projects, and basically never had a real issue. Yes, the use cases mentioned warrant an obliterate feature, but so far I'm not convinced that this is a problem you run into over and over again everywhere you go. Of course, the nature of this particular problem is that you only have to make a mistake once and it can't be undone properly.

Because removing data from the repository breaks the basic premise of source control, that being that it is possible to reproduce all previous states and changes to the source tree. If you want to obliterate something from version control, you're probably "Doing It Wrong", as they say.

I have used various version control systems for about 15 years now and have never needed a feature like this.

I wonder what the reasons are that you want that feature:

  • disc space? Hard to believe considering the price of disc space
  • committed a password to version control? Well, that will teach you. Go and change the password
  • speed of the repository? Doesn't sound like it, but if so, I would consider a completely different system with supposedly better performance.

It is possible to reduce the size of an SVN repository by doing a dump and load. Essentially, if you decide that you never want to revert to anything more than a couple of years old, you can dump the repository, filter based on time, then reload the dump. Wanting to get rid of a single file due to size is probably an indication that the file didn't really belong in a source control system in the first place.
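The time-based pruning described above might look roughly like this, again as a Python sketch that only builds the svnadmin commands. The function name is hypothetical, and it assumes you have already worked out which revision number corresponds to your cutoff date (first_rev) and which is the youngest revision (last_rev). As far as I know, a non-incremental svnadmin dump of a revision range writes the first revision in the range as a full snapshot of the tree, so the older history is collapsed into that one revision.

```python
def truncate_commands(old_repo, new_repo, first_rev, last_rev):
    """Build argv lists that copy only revisions first_rev..last_rev."""
    if not (0 <= first_rev <= last_rev):
        raise ValueError("need 0 <= first_rev <= last_rev")
    create = ["svnadmin", "create", new_repo]
    # Without --incremental, the first dumped revision contains the
    # whole tree as it stood at first_rev, not a diff against its parent.
    dump = ["svnadmin", "dump", "--quiet",
            "-r", f"{first_rev}:{last_rev}", old_repo]
    load = ["svnadmin", "load", "--quiet", new_repo]
    return create, dump, load
```

The dump output would be piped into the load command after the new repository has been created. Expect revision numbers in the new repository to differ from the old ones, so anything that refers to old revision numbers (commit messages, bug tracker links) will no longer match.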

There is some scripting which helps you obliterate data. Follow this mailing list thread for more info.

It's a hard way to do it as the essence of version control is not losing data, as opposed to deleting it permanently. But if you prune once a year or something like that it can be done.

The entire point of source control is to have a complete history of what your repository looks like. The obliterate command defeats this purpose of source control, and it's a misfeature in all version control systems that have it.

SVN has cheap copying and cheap branching that doesn't require a full copy of the file--just the changed bits. Its central repository is usually very manageable in size, making this misfeature unnecessary.

Obliterate is not an essential feature of Subversion, because it actually breaks the basic principles of version control (which is: to record all history).

And it isn't an essential feature because there are workarounds to get this done anyway (using svnadmin and filtering).

Also, the feature is currently heavily worked on. See this post for details.

Last I checked, it was intended as an ADMIN feature, and the admin can already dump/filter/broken_workaround and remove history anyway. As far as the audit trail is concerned, this doesn't change the current situation. It would just make it less horrible if something absolutely must be removed.

Svnadmin obliterate is one of the most requested features; the devs finally admit it should exist (finally! after 8 years!!!). And the publicity around it not existing is chasing users away from SVN.

Unfortunately, I had to learn about this "missing feature" the hard way. Since when is basic functionality a feature? New users are starting to hear about this and avoid SVN. As for me, I now use Git.

Don't like my opinion? Linus referred to the SVN developers as morons, and to the whole centralized system as flawed. I trust Linus as a true expert, and specifically, he knows about source.

What I do - not use subversion. Sorry.

They (the developers) obviously don't agree with your assessment of that being a critical feature. That did not stop the company I currently work at from using it ;) I personally rule out Subversion for this exact reason.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow