SVN woes

I recently found out that the repository holding the code for multiple projects from the past 8+ years wasn’t being backed up properly. What’s worse, the reason isn’t that we have no backup in place, no, it is that the repo is corrupt. HOORAY!

This revelation has led me on a long, strange trip, and I have tried many things so far in an attempt to save some of our history (though it is not looking good). One of the interesting things I read was an idea to use svnsync:

Another approach I forgot to recommend before is to setup a user in the original repository that has read permission only to your branch, and then use that user to create a mirror of the repository using svnsync. svnsync will honor the permissions in your access file, and will simply omit the stuff that user can’t read, leaving you with a repository that has a bunch of empty revisions but contains all the changes made to your branch only.
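For anyone who wants to try the same trick, here is a rough Python sketch of the two svnsync calls involved. The URLs and the read-only user name are made up for illustration, and it assumes the empty mirror repo already exists and has a pre-revprop-change hook that exits 0, since svnsync needs to set revision properties on the mirror.

```python
import subprocess


def svnsync_commands(mirror_url, source_url, reader):
    """Return the svnsync invocations, in order, as argument lists.

    `reader` is the hypothetical account that can only read the branch
    you want to keep; svnsync mirrors only what that account can see.
    """
    return [
        # One-time setup: tie the empty mirror to the source repository.
        ["svnsync", "init", mirror_url, source_url, "--source-username", reader],
        # Copy revisions across; safe to re-run to pick up new commits.
        ["svnsync", "sync", mirror_url],
    ]


def run_all(commands):
    """Run each command in turn and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling run_all(svnsync_commands("file:///srv/svn/mirror", "http://svn.example.com/repo", "branch-reader")) would kick off the mirror. Those three values are placeholders, not our real setup.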

I found a great post on using svnsync and had a go. While this is a pretty slick trick, it unfortunately did not work. You see, long ago, when the repo was first corrupted, I was tasked with “fixing” it. My “fix” allowed us to keep on using the repo, but it effectively made the repo worthless, since the “fix” was to remove the corrupt db rev file, which was not a good idea…

Anyway, now we are stuck with a repo that is full of work and no easy way to export it with its history. Luckily, 3 years in code is like dog years, so much of the code from before the corruption occurred is not really used. The latest trick I am going to attempt is dumping the range of revisions after the corruption, then using svndumpfilter to try and save the history for the relevant projects. This post gives a nice outline of the steps involved to perform a task like this.
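Roughly, the plan looks like the sketch below: svnadmin dump for the revision range, piped through svndumpfilter, piped into svnadmin load on a fresh repo. The paths, revision numbers, and project names are placeholders, and I have not confirmed this survives our particular corruption, so treat it as an outline rather than a recipe.

```python
import subprocess


def dump_filter_load(old_repo, new_repo, first_rev, last_rev, keep_paths):
    """Build the three pipeline stages as argument lists."""
    # No --incremental here: without it the first dumped revision is a
    # full snapshot of the tree, which is what an empty repo needs.
    dump = ["svnadmin", "dump", old_repo, "-r", f"{first_rev}:{last_rev}"]
    # Keep only the listed paths, drop revisions left empty by the
    # filter, and renumber what remains.
    filt = ["svndumpfilter", "include", *keep_paths,
            "--drop-empty-revs", "--renumber-revs"]
    # The target must already exist (svnadmin create).
    load = ["svnadmin", "load", new_repo]
    return [dump, filt, load]


def run_pipeline(stages):
    """Chain the stages with pipes, like `a | b | c` in a shell."""
    procs = []
    prev_out = None
    for i, argv in enumerate(stages):
        is_last = i == len(stages) - 1
        proc = subprocess.Popen(
            argv,
            stdin=prev_out,
            stdout=None if is_last else subprocess.PIPE,
        )
        if prev_out is not None:
            prev_out.close()  # let the upstream stage see a closed pipe
        prev_out = proc.stdout
        procs.append(proc)
    return [proc.wait() for proc in procs]
```

Something like run_pipeline(dump_filter_load("/srv/svn/old", "/srv/svn/new", 5000, 9000, ["projectA", "projectB"])) would run the whole thing, with every value there being hypothetical. One thing to watch for: svndumpfilter can choke when a kept path was copied from a path that got filtered out, so some projects may need extra include paths.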

Unfortunately, some stuff is just going to have to be added to the new repo without its history, but we can keep the current repo running in a read-only state for looking that up. In the end, it turns out this is all for the better: we made the mistake of putting binaries into the repo, and it is out of control at over 14GB. Time to prune that sucker…