What keramida said…

Mercurial repository clones can have two parts:

  1. An .hg/ subdirectory, where all the repository metadata is stored
  2. A “working copy” area, where checked out files may live

The .hg/ subdirectory stores the repository metadata of the specific clone, including the history of all changesets stored in the specific clone, clone-specific hooks and scripts, information about local tags and bookmarks, and so on. This is the only part of a Mercurial repository that is actually mandatory for a functional repository.

The “working copy” area is everything under the clone that is not under the toplevel .hg/ subdirectory of the particular clone. The working area of each Mercurial repository may contain a snapshot of the files stored in the repository: either a clean snapshot, checked out from one of the changesets stored in the repository itself, or a locally modified version of a changeset.

One important detail that may not be apparent from the descriptions above is that:

Even if you have already checked out a particular version, you can delete everything except the .hg/ subdirectory and the Mercurial repository will still function normally.

Clones Without a Working Copy

An example is a good way to demonstrate how a clone still functions as a Mercurial repository without a working copy. Let’s assume that you have a tiny repository at /tmp/hgdemo that contains revisions of just a small hello.c program:

% pwd
/tmp/hgdemo
% hg root
/tmp/hgdemo
% hg log --style compact
1[tip]   c48ee3a9fd78   2010-01-11 08:33 +0200   keramida
  Use EXIT_SUCCESS instead of hard-coded zero.

0   041227edc91b   2010-01-11 08:32 +0200   keramida
  Add hello.c

% hg manifest tip
hello.c
%

You can check out the latest revision of hello.c with the “hg checkout” command:

% hg checkout --clean tip
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
% cat -n hello.c
     1  #include <stdio.h>
     2  #include <stdlib.h>
     3  
     4  int
     5  main(void)
     6  {
     7      printf("Hello world\n");
     8      return EXIT_SUCCESS;
     9  }
%

The repository does not need a checkout to function though. The fact that your working copy has been updated to a particular revision is independent of the way the repository machinery under .hg/ works. So you can remove the source of hello.c and still use the repository to browse the history of the project:

% rm -f hello.c
% hg log --style compact
1[tip]   c48ee3a9fd78   2010-01-11 08:33 +0200   keramida
  Use EXIT_SUCCESS instead of hard-coded zero.

0   041227edc91b   2010-01-11 08:32 +0200   keramida
  Add hello.c

%

With a clone like this it is still possible to use any Mercurial command that does not require a working copy, e.g. “hg diff” to look at the differences between two arbitrary revisions:

% hg diff -r 0:1
diff -r 041227edc91b -r c48ee3a9fd78 hello.c
--- a/hello.c   Mon Jan 11 08:32:59 2010 +0200
+++ b/hello.c   Mon Jan 11 08:33:28 2010 +0200
@@ -1,8 +1,9 @@
 #include <stdio.h>
+#include <stdlib.h>
 
 int
 main(void)
 {
     printf("Hello world\n");
-    return 0;
+    return EXIT_SUCCESS;
 }
%

You can even check out the “null” revision (a magic revision name which Mercurial treats as “not any revision stored in this repository”):

% hg checkout --clean null
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
% hg identify --id --branch
000000000000 default

When a Mercurial clone has checked out the null revision, all the tracked files of the working copy are removed. If the clone contains no untracked files (build-time artifacts, for example), you should only see the .hg/ subdirectory when you look at the clone:

% find . -maxdepth 2 -exec /bin/ls -1 -dF {} +
./
./.hg/
./.hg/00changelog.i
./.hg/branch
./.hg/dirstate
./.hg/last-message.txt
./.hg/requires
./.hg/store/
./.hg/tags.cache
./.hg/undo.branch
./.hg/undo.dirstate
%

The disk space such a clone requires is just the size of the history metadata stored under .hg/.

Why Would You Want Such a Clone?

For a small repository like the one shown in this example, it seems pretty useless to be able to have a Mercurial clone without a working copy. You don’t really gain much by deleting the source of a small 9-line C program. The space savings of doing that are quite insignificant.

If, however, you are hosting clones of large repositories on a web server somewhere, stripping the working copy of Mercurial clones may be very handy indeed, and it may save you a large part of the disk space you would need to keep working copies around. By “large repository” I mean something like a single clone with several hundreds or thousands of files, or a clone whose working copy requires tens or hundreds of megabytes of data.

The OpenSolaris onnv-gate repository is one of the large repositories that use Mercurial. My own Mercurial-based mirror of the FreeBSD head branch is another example for which I readily have size data. Size information for these two repositories is shown in the table below:

                          FreeBSD head/ branch    OpenSolaris
                          since 2008-01-01        onnv-gate repository
  Tracked files           41,807                  44,784
  Changesets              15,513                  11,462
  Size of .hg store       238 MB                  292 MB
  Size of working copy    385 MB                  543 MB

Both of these Mercurial repositories have a moderately large number of files. More importantly, the working copy is larger than the .hg/ repository store in both cases. In the onnv-gate repository of OpenSolaris the working copy needs almost twice as much disk space as the entire history of the project. That’s a lot of disk space to carry around in all your local clones of onnv-gate!
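To put a number on “almost twice as much”, the ratio of working copy size to store size can be worked out from the figures in the table. This is just arithmetic on the quoted sizes; the ratio helper is a name I made up for the sketch.

```shell
#!/bin/sh
# Working copy size divided by .hg store size, using the sizes (in MB)
# quoted in the table. "ratio" is a hypothetical helper for this sketch.
ratio() {
    LC_ALL=C awk -v wc="$1" -v store="$2" 'BEGIN { printf "%.2f\n", wc / store }'
}

freebsd=$(ratio 385 238)
onnv=$(ratio 543 292)

echo "FreeBSD head: working copy is ${freebsd} times the size of the store"
echo "onnv-gate:    working copy is ${onnv} times the size of the store"
```

This works out to roughly 1.62 for the FreeBSD mirror and 1.86 for onnv-gate.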

If all you are looking for is a local mirror of the project sources, so that you can look at the history of a project, browse the diffs committed over time, or search for interesting commit information (e.g. “when was bug 6801336 fixed in OpenSolaris?”), carrying around a full working copy is probably a waste of space. Updating the files of the working copy after every pull operation from the upstream master repository is a waste of time too.



Experimenting with Mercurial “named branches”


As an experiment with the “named branch” support of Mercurial (Hg hereafter), I’ve started updating the editors/emacs-devel port of FreeBSD, using an Hg repository with two branches:

  • HEAD is the main branch where history is imported from the official FreeBSD CVS repository
  • keramida is a named branch where my own, local changes are committed

The experiment seems to be going pretty well so far, and the port has been updated to a CVS snapshot of the GNU Emacs source tree taken on 1 Jan 2008, 21:19:17 UTC. You can see the Hg repository with the two named branches at:

http://hg.hellug.gr/keramida/ports/emacs-devel/

I’ll keep the converted port repository around, and see how future updates work. I’m really interested to see what happens with “merges” of upstream code, after the current “keramida” branch has been committed upstream, to the official FreeBSD ports/ repository :-)

Greek FreeBSD doc/ update


Another update of our base freebsd/doc Mercurial tree has been pushed to hg.hellug.gr.

The bundles for populating new Mercurial trees with the changes are being uploaded to freefall as I’m typing this post. They should be available in a short while from:

  • bsd.hg-2007.12.02.20 is a bundle of the clean imports done from FreeBSD doc/
  • el.hg-2007.12.02.20 is a bundle of the merged sources, including all the changesets I have pulled from other translators

Happy translating!