Posts Tagged ‘subversion’

Migrating to Distributed Version Control

November 29, 2010

A few weeks ago I migrated two major projects to distributed version control systems (DVCS), leaving only one project in Subversion, the one hosted on Savannah. As you can read in my prior posts, I have resisted switching over to DVCS. However, recently I’ve understood the benefits propounded by DVCS adherents, and I’ve found that it has more features than most tutorials let on.

Why Did I Resist?

I resisted DVCS so strongly for a few reasons:

  1. Most arguments for DVCS I encountered were actually anti-Subversion arguments, many of them based on incorrect information about Subversion and CVS
  2. Much of what I read sounded like knee-jerk trendiness: it sounded like people were doing it just because Linus Torvalds says Subversion is stupid
  3. I had an important project (my dissertation!) in Subversion, managed with Trac. I didn’t want to lose all that history by doing a crappy conversion.

When the anti-Subversion arguments didn’t hold up, I ignored them. I thought maybe my working conditions were just different or other people just weren’t reading the manual. Those are still possibilities, but the harder thing to examine was my second reason for dismissal: I assumed that anyone who said these things was a total newbie, who had just been told that DVCS was better. I’ve talked about how object-oriented programming proponents often just sound inexperienced with programming. I figured the same was true of DVCS proponents.

However, two things happened that really changed my mind. The first was that I realized that the most annoying thing about somebody questioning my decisions is the feeling that they think my decision is poorly considered when it is deliberate, careful and took me weeks of preparation. It’s very easy to take that attitude with people online: when I don’t hear or see people, I don’t have that mirror held up to me. It’s very easy to just brush something off and say that the other person “just isn’t thinking about it.” Realizing how much that pisses me off when people take that attitude with me, I’ve thought a little more about how I consider people’s attitudes online.

Many experienced hackers have switched

The second thing was realizing that people whose opinions I know I can value, people who definitely have done their homework, have switched major projects to DVCS. Emacs, my favorite piece of software that I am using right now to write this, is kept in Bazaar now. I know the people who made that decision were doing their homework, not going by knee-jerk reaction, certainly not just to copy Linus Torvalds. Bazaar is also part of the GNU Project.

What about my revisions?

svn2bzr answered my third concern. svn2bzr is a featureful-enough tool that will create Bazaar branches or repositories from SVN repository dumps. It’s really freakin’ easy to create whatever configuration you want:

 > python ~/.bazaar/plugins/svn2bzr/svn2bzr.py --prefix=subdir svndump newrepo

This will create a new Bazaar repository in the directory `newrepo’ that contains all the revisions in the subdirectory `subdir’ of the svn repository. This is where Bazaar’s concept of repositories shows its difference.

In a Bazaar repository you can have many branches beneath the repository in the filesystem, and you import a branch by branching into a subdirectory. I didn’t get this for a few weeks, so let me give you an example. Suppose I have a branch called `branch’ located at `~/Public/src/branch’ and a repository called `repo’:

  > cd repo
  > bzr branch ~/Public/src/branch here

That creates a branch within the repository called `here’. Now I can create other branches, merge them, etc. The only tricky thing about getting my revisions into a place where Trac could use them was that I needed a repository hosted on HTTP. Then I used the TracBzr plugin to add the repository to Trac. I realized that changeset links are only used in Trac tickets, and since I had so few of those referencing current revisions, changes in the revision numbers wouldn’t matter that much.

Features of DVCS

I heard many, many anti-Subversion arguments and some really bogus arguments for DVCS. People have said “you can’t merge,” “you can’t make branches,” “Subversion causes brain damage” and on and on. The bogus pro-arguments I heard were that you can commit without a network connection, “forking is fundamental,” and that DVCS is “modern.” Answering these arguments is simple: committing without a network connection is not a big deal. On the other hand updating without a network connection is impossible, and it’s a situation I’ve found myself in more often, especially working with a laptop, instead of just two workstations. This is where DVCS was nice. Updating is a bigger problem than committing.

As to “you can’t merge” and “you can’t make branches,” we all know that’s baloney. However, what you can do much better with DVCS systems like git and Bazaar is edit directory structure and rename files. This is a huge advantage of DVCS systems. Bazaar, for instance, totally keeps track of all renames and copies in its history. Subversion, on the other hand, does renames with a DELETE operation and an ADD operation. Not so smooth. A good way to get something better than CVS, but not the best.

Furthermore, DVCS systems are very good at merging. That doesn’t mean you can’t merge with Subversion — I’ve been doing that for years. However, merging between two branches in Bazaar is much simpler than merging in Subversion. I don’t have to read the help when I’m merging with Bazaar; merging with Subversion is not hard, but it’s not as simple. Simplicity is the name of the game, baby.

A Stupid Git Realization

I had tried using git before and didn’t enjoy it. I’m glad to say I was using it wrong. I had tried using it to manage my webpages, but whenever I pushed my local changes to my remote webpage tree on UNC’s servers, I would get messages about not updating the local tree and stuff like that. It was just confusing. It didn’t really make sense. I wasn’t interested in trying git again, hence using Bazaar for some new projects.

I had a weird realization one night: I was working with the git tree of Guile, and someone on IRC had told me that the most updated git source had a known problem. I didn’t want to go get the tarball for Guile 1.9-13, so I thought “Wait, I have the git tree, so I should be able to generate whatever release version I want. How do I do that?”

  > git tag -l
  > git checkout release_1-9-13

and there I had it. Wow! That is cool.
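
Since every tag lives in the clone, the same tree can also produce a release tarball without disturbing the working directory. The following is only a sketch: the helper name `export_release’ and its arguments are made up for illustration, and it assumes it is run from inside the git working tree.

```shell
#!/bin/sh
# export_release TAG PREFIX OUTFILE  (hypothetical helper, not part of git)
# Writes the tree recorded at TAG to a gzipped tarball whose entries
# all start with PREFIX/.  Run it from inside the git working tree.
export_release () {
    tag=$1
    prefix=$2
    outfile=$3
    # "git archive" reads straight from the repository, so the files
    # currently checked out are left untouched
    git archive --format=tar --prefix="$prefix/" "$tag" | gzip > "$outfile"
}
```

For the Guile example, something like `export_release release_1-9-13 guile-1.9.13 /tmp/guile-1.9.13.tar.gz’ would do it, assuming that tag shows up in the output of `git tag -l’.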

I also followed a simple tutorial to get my webpages working with a hook that would update the local tree (the one served as my homepage) every time I push. It seems a simple idea: make a repository in a different directory, and check out to it every time I push to that repository. Why hadn’t that occurred to me before? Conversion from SVN to git was insanely simple:

> sudo yum install git-svn
> git svn clone http://path/to/repo webgit
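
The push-to-deploy hook can be sketched roughly as follows. This is not the tutorial’s exact code: `install_deploy_hook’ is a hypothetical helper, and it assumes a bare repository already exists on the server and that the served directory is a plain directory, not a repository itself.

```shell
#!/bin/sh
# install_deploy_hook BARE_REPO WORK_TREE  (hypothetical helper)
# Writes a post-receive hook into an existing bare repository so that
# every push checks the newest revision out into WORK_TREE.
install_deploy_hook () {
    bare=$1
    tree=$2
    hook="$bare/hooks/post-receive"
    # the hook itself is a two-line shell script; git runs it on the
    # receiving side after each push has been accepted
    cat > "$hook" <<EOF
#!/bin/sh
GIT_WORK_TREE="$tree" git checkout -f
EOF
    chmod +x "$hook"
}
```

Calling `install_deploy_hook ~/webgit.git ~/public_html’ (both paths are placeholders) on the server sets this up once; after that, pushing to the bare repository updates the served tree.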

Conclusions

I think I’m done with Subversion. DVCS, at least git and Bazaar, can do a hell of a lot and I really like their features. I wouldn’t mind using Subversion for an existing project, but I think I’m not going to start any new projects with it. I’m also going to take it easy on people who disagree with me online. I’ve seen that at least some of them were speaking from the same position I hope to.

The “D” in DVCS stands for “Different People”

September 6, 2010

Someone on Stack Overflow disagreed with me about using centralized version control for a solo project. As I predicted, of course; as I said in my last post, DVCS is a fad and it will have many converts who support it in a knee-jerk fashion. I think the person who disagreed may just be misinformed, or not thinking about this hard enough. I may not have made things clear in my last post, however, about why centralized version control makes more sense for a solo developer. Also, I admit that it’s paradoxical that someone would use centralized version control for a solo project. However, as I’ll show, centralized version control does make more sense (not only that, but I’m not the only one who thinks so).

Consider this: if you are a solo developer working on the same machine all the time (e.g. a laptop), a DVCS repository in your current working directory and a Subversion repository in your home directory are practically the same. The repository is always online and you can always make commits. This is what makes the “commit while offline” argument of DVCS proponents so weak: for certain situations you can do that with Subversion, or most of the time you won’t need to.

Now consider that if you are a solo developer working on multiple machines, DVCS only creates an extra step in your development. I always work on my big ol’ workstation during the day. If I were using git or hg for my most-frequently-worked-on projects, I would need to remember to push my changes from one repository to the other. With Subversion I just don’t have that step. All other steps are identical with the two workflows.

Throughout the arguments above, I am only one developer, and this is the crux of the idea. I think people often confuse the situation of multiple computers with that of multiple people. In the first argument, I proposed that with one computer there is no difference between using file URLs and using a DVCS repository in the current working directory. However, there would be a difference if there were multiple developers. Then you have to configure access to a single machine (either for cloning or checking out) for multiple users; this is when things get more complex.

The “D” in DVCS stands for “Different People.” What I mean by that is that the “forking is fundamental” argument posed by DVCS proponents doesn’t apply when there’s a single developer or author working on a project. If by myself I want to create branches and work on different features, that is perfectly easy to do with Subversion, and so is merging. What distributed version control is for is different people maintaining different features and seeing how they work. I completely reject the idea that centralized version control is obsolete and I will keep recommending it to solo developers.

Back up your Subversion and Mercurial repositories

June 28, 2010

This morning I cooked up two scripts to back up my VCS repositories. I have two Subversion repositories locally, one for my main research project and another for my homepage. I have three hg repositories at the moment, one for local shell scripts (like the ones I was writing), one for my configuration files and another for an extended book project I’m working on.

The svn-oriented script takes its repository paths either from command-line arguments or from a file (~/.svnbk), one path per line. It makes (not “takes”) an svnadmin dump of each repository to stdout, compresses it with the best available compression program, and writes the result to the specified directory.

#!/bin/bash

# svnbk.sh: A tool for backing up my SVN repositories
#
# Copyright (C) 2010 Joel James Adamson
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3, or (at your option)
# any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
# 02110-1301, USA.

# Usage:
#  svnbk.sh [repositories] bkupdir
#
# If `repositories' is empty then svnbk.sh will look in ~/.svnbk for a
# list of repositories to back up; repositories will be dumped to
# compressed files in the directory `bkupdir'

# If available, use xz compression; otherwise use bzip2; if
# unavailable use gzip
# ("command -v" succeeds only if the program is actually on the PATH)
if command -v xz >/dev/null 2>&1; then
    ZIP="xz"
    EXT=".xz"
elif command -v bzip2 >/dev/null 2>&1; then
    ZIP="bzip2"
    EXT=".bz2"
elif command -v gzip >/dev/null 2>&1; then
    ZIP="gzip"
    EXT=".gz"
else
    printf "Compression unavailable; dumping to uncompressed dump files\n" >&2
    # cat copies the dump through unchanged
    ZIP="cat"
    EXT=""
fi

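# Command-line variables
declare -a argv
argv=("$@")
# number of command-line args
argc="${#argv[@]}"
if [ "$argc" -lt 1 ]; then
    printf "Usage: svnbk.sh [repositories] bkupdir\n" >&2
    exit 1
fi
LAST=$((argc - 1))
BKDIR="${argv[$LAST]}"
# now to get a list of repositories from the command-line, we copy
# every element of argv except the last one:
declare -a REPOS
REPOS=("${argv[@]:0:$LAST}")

# dump one repository to a compressed file in the backup directory
dump_repo () {
    FILE="$BKDIR/svn_$(basename "$1").dump$EXT"
    svnadmin dump "$1" | $ZIP - > "$FILE"
}

if [[ ${#REPOS[@]} -eq 0 ]]; then
    # no repositories on the command line: read them from ~/.svnbk
    while read -r repo ; do
        dump_repo "$repo"
    done < "$HOME/.svnbk"
else
    for repo in "${REPOS[@]}"; do
        dump_repo "$repo"
    done
fi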
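
Restoring goes the other way: decompress the dump and feed it to svnadmin load. Here is a rough sketch, assuming dump files named the way the script above names them. The function names `decompressor_for’ and `restore_repo’ are invented here, and the target repository must already exist.

```shell
#!/bin/sh
# decompressor_for FILE: print the command that writes FILE's
# uncompressed contents to stdout, chosen by file extension
decompressor_for () {
    case $1 in
        *.xz)  echo "xz -dc" ;;
        *.bz2) echo "bzip2 -dc" ;;
        *.gz)  echo "gzip -dc" ;;
        *)     echo "cat" ;;
    esac
}

# restore_repo DUMPFILE REPO: load a dump into an existing, empty
# repository created beforehand with "svnadmin create REPO"
restore_repo () {
    # the command substitution is left unquoted on purpose so that
    # "xz -dc" is split into a command and its option
    $(decompressor_for "$1") "$1" | svnadmin load "$2"
}
```

It is worth running this against a scratch repository now and then; a backup that has never been restored is only a hope.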

The Mercurial backup requires you to initialize remote repositories, which is easy if you’re backing up to a mounted CIFS or NFS partition. Unfortunately, hg init does not create the directories on the CIFS partition I’m using, so I used the following:

joel@cifshost > mkdir hgbak
joel@cifshost > cd hgbak
joel@cifshost > mkdir bin
joel@cifshost > cd bin
joel@cifshost > hg init

After that the hg push should work just fine. The config file for this one has the local repository followed by the remote repository on each line. It does not use any compression:

#!/bin/bash
#
# Copyright (C) 2010 Joel J. Adamson
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3, or (at your option)
# any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
# 02110-1301, USA.

# hgbk.sh: A tool for backing up my hg repositories
#
# Usage:
#  hgbk.sh
#
# Read ~/.hgbk for a list of repositories to back up; the syntax of
# the file specifies
#
# [local repository] [remote (backup) repository]

# read each line into an array: repo[0] is the local repository,
# repo[1] is the remote (backup) repository
while read -r -a repo ; do
    cd "${repo[0]}" || continue		# full pathname!
    hg push "${repo[1]}"
done < "$HOME/.hgbk"
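
A malformed line in ~/.hgbk would make the loop push to the wrong place, or nowhere, so the file can be checked first. This is only a sketch: `check_hgbk’ is a hypothetical helper, and like the script it assumes paths without spaces.

```shell
#!/bin/sh
# check_hgbk FILE: verify that every non-empty line of FILE has exactly
# two whitespace-separated fields (local repository, backup repository).
# Reports offending line numbers on stderr and returns 1 if any are found.
check_hgbk () {
    status=0
    n=0
    while read -r local_repo remote_repo extra ; do
        n=$((n + 1))
        # skip blank lines
        [ -z "$local_repo" ] && continue
        if [ -z "$remote_repo" ] || [ -n "$extra" ]; then
            printf '%s: line %d: expected two fields\n' "$1" "$n" >&2
            status=1
        fi
    done < "$1"
    return $status
}
```

Running `check_hgbk ~/.hgbk’ before the push loop would catch a missing backup path before any push is attempted.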

The backup scripts are pretty easy, but they save a lot of time. The only non-automated part is initializing the remote repository. I then activate them with the following script, placed in ~/.cron.daily/:

#!/bin/sh

ERRLOG=$(mktemp)
/home/joel/bin/svnbk.sh /home/joel/bioark 2>| "$ERRLOG"
SVNRC=$?
# capture stderr as well, so that hg errors end up in the mailed log
/home/joel/bin/hgbk.sh >> "$ERRLOG" 2>&1
HGRC=$?
if [ $SVNRC -ne 0 ] || [ $HGRC -ne 0 ]; then
    mail joel < "$ERRLOG"
fi

rm "$ERRLOG"

You can download these from my homepage. I welcome comments and improvements.
