Splitting a Subversion repository into multiple repositories
Having multiple projects stored in one Subversion repository is a challenge if you want to move one of the projects to another repository. Also, over time, moves and deletions can bloat the size of your repository with obsolete, unused data. In this article, we will show you how to extract SVN projects to their own repositories, preserving full commit histories.
When Mugo first started many years ago, we had one SVN repository for all projects. It had a structure similar to the following:
/
|---project_1
|   |----trunk
|   |----branches
|   |----tags
|---project_2
|   |----trunk
|   |----branches
|   |----tags
|   .
|   .
|   .
|---project_n
    |----trunk
    |----branches
    |----tags
For various reasons, we needed to move and retire projects. It made more sense to give projects their own repositories so that we could, for example, take advantage of integrated project management systems such as Assembla. Or, we might want to migrate them to Git. Also, since SVN by its very nature keeps a history of all changes made, we didn't want to have to maintain and back up a mega-repository that would continue to grow even after we archived projects.
The task was thus to extract each project folder into its own separate repository. An easy way to do this is to simply create a new repository and add the current code base as revision 1. However, the history of code commits and comments is very important. Thankfully, this is relatively straightforward to preserve using the Subversion tools svnadmin and svndumpfilter.
Here are the steps to extract a project that exists in a sub-folder within an SVN repository, preserving all previous relevant commits:
- Create a dump file of the entire repository using the svnadmin tool on the Subversion server.
svnadmin dump Projects > projects.dump
- Use svndumpfilter with the "include" sub-command to extract only a specific path.
svndumpfilter include project_n/trunk --drop-empty-revs --renumber-revs < projects.dump > project_n.dump
- The options here are important to understand:
- --drop-empty-revs: To eliminate revisions belonging to a different project; otherwise these are created as empty revisions.
- --renumber-revs: To re-number the revisions to eliminate the gaps created by the --drop-empty-revs parameter.
- Optional: By default, the project will still reside in a "project_n" sub-folder in the new repository. If you run an "svn move" command, the new file paths don't retain the histories of the old file paths. If you want to move all of the files one level up to be at the root of the repository, you must modify the paths in the dump file. This can be done by hand, but it's easier and safer to use a tool such as sed:
sed -i 's/Node-path: project_n\/trunk/Node-path: trunk/g' project_n.dump
sed -i 's/Node-copyfrom-path: project_n\/trunk/Node-copyfrom-path: trunk/g' project_n.dump
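- These commands globally replace every occurrence of "Node-path: project_n/trunk" and "Node-copyfrom-path: project_n/trunk" in the dump file, including any that appear inside file contents. If your files contain such strings and you want them preserved, tighten the patterns, for example by anchoring them to the start of the line. Also note that the -i option behaves differently between sed versions, so you may have to adjust the command for yours.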
- If you change the path, the "relocate" command in the final step will not work to re-point existing repository checkouts. You would have to do a separate checkout and then merge any differences.
- Create a new repository for the project.
svnadmin create project_n
- Load the filtered repository.
svnadmin load project_n < project_n.dump
- Re-point any existing checkouts to the new repository using svn switch --relocate (in Subversion 1.7 and later, the equivalent command is svn relocate):
svn switch --relocate http://svn.example.com/projects http://svn.example.com/project_n
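For the optional path-rewriting step, one way to reduce the risk of touching file contents is to anchor the sed patterns to the start of the line. The following is a minimal sketch rather than a definitive recipe: the sample text is a hypothetical fragment shaped like dump-file node headers, not output from a real repository, and the file names in it are made up. It writes the result to a variable instead of editing in place, so it does not depend on how your sed handles -i.

```shell
#!/bin/sh
# Hypothetical sample: a tiny fragment shaped like SVN dump node headers,
# followed by a line of "file content" that happens to mention the same path.
dump_file=$(mktemp)
cat > "$dump_file" <<'EOF'
Node-path: project_n/trunk/README.txt
Node-kind: file
Node-action: add
Node-copyfrom-path: project_n/trunk/OLD.txt

See Node-path: project_n/trunk in the docs
EOF

# Anchor each pattern to the start of the line (^) so that only lines
# beginning with the header name are rewritten. Using | as the delimiter
# avoids having to escape the slash in the path.
rewritten=$(sed -e 's|^Node-path: project_n/trunk|Node-path: trunk|' \
                -e 's|^Node-copyfrom-path: project_n/trunk|Node-copyfrom-path: trunk|' \
                "$dump_file")

printf '%s\n' "$rewritten"
rm -f "$dump_file"
```

In this sketch the two header lines lose the "project_n/" prefix, while the last line is left untouched because the matching text is not at the start of the line. A content line that begins with exactly the same text would still be rewritten, so it is worth inspecting the result before loading it.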