Mercurial repository clones can have two parts:
- An .hg/ subdirectory, where all the repository metadata is stored
- A “working copy” area, where checked out files may live
The .hg/ subdirectory stores the repository metadata of the specific clone, including the history of all changesets stored in the specific clone, clone-specific hooks and scripts, information about local tags and bookmarks, and so on. This is the only part of a Mercurial repository that is actually mandatory for a functional repository.
The “working copy” area is everything under the clone that is not under the toplevel .hg/ subdirectory of the particular clone. The working area of each Mercurial repository may contain a snapshot of the files stored in the repository: either a clean snapshot, checked out from one of the changesets stored in the repository itself, or a locally modified version of a changeset.
One important detail that may not be apparent from the descriptions above is this:
Even if you have already checked out a particular version, you can delete everything except the .hg/ subdirectory and the Mercurial repository will still function normally.
Clones Without a Working Copy
An example is a good way to demonstrate how a clone still functions as a Mercurial repository without a working copy. Let’s assume that you have a tiny repository at /tmp/hgdemo that contains revisions of just a small hello.c program:
% pwd
/tmp/hgdemo
% hg root
/tmp/hgdemo
% hg log --style compact
1[tip]   c48ee3a9fd78   2010-01-11 08:33 +0200   keramida
  Use EXIT_SUCCESS instead of hard-coded zero.

0   041227edc91b   2010-01-11 08:32 +0200   keramida
  Add hello.c

% hg manifest tip
hello.c
%
You can check out a copy of the latest revision of hello.c with the “hg checkout” command:
% hg checkout --clean tip
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
% cat -n hello.c
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int
5 main(void)
6 {
7 printf("Hello world\n");
8 return EXIT_SUCCESS;
9 }
%
The repository does not need a checkout to function though. The fact that your working copy has been updated to a particular revision is independent of the way the repository machinery under .hg/ works. So you can remove the source of hello.c and still use the repository to browse the history of the project:
% rm -f hello.c
% hg log --style compact
1[tip]   c48ee3a9fd78   2010-01-11 08:33 +0200   keramida
  Use EXIT_SUCCESS instead of hard-coded zero.

0   041227edc91b   2010-01-11 08:32 +0200   keramida
  Add hello.c

%
With a clone like this it is still possible to use any Mercurial command that does not require a working copy, e.g. “hg diff” to look at the differences between two arbitrary revisions:
% hg diff -r 0:1
diff -r 041227edc91b -r c48ee3a9fd78 hello.c
--- a/hello.c Mon Jan 11 08:32:59 2010 +0200
+++ b/hello.c Mon Jan 11 08:33:28 2010 +0200
@@ -1,8 +1,9 @@
 #include <stdio.h>
+#include <stdlib.h>
 
 int
 main(void)
 {
  printf("Hello world\n");
- return 0;
+ return EXIT_SUCCESS;
 }
%
You can even check out the “null” revision (a magic revision name which Mercurial treats as “not any revision stored in this repository”):
% hg checkout --clean null
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
% hg identify --id --branch
000000000000 default
When a Mercurial clone has checked out the null revision, all the tracked files of the working copy are removed. If the clone does not contain any untracked files, such as build-time artifacts, you should only see the .hg/ subdirectory when you look at the clone:
% find . -maxdepth 2 -exec /bin/ls -1 -dF {} +
./
./.hg/
./.hg/00changelog.i
./.hg/branch
./.hg/dirstate
./.hg/last-message.txt
./.hg/requires
./.hg/store/
./.hg/tags.cache
./.hg/undo.branch
./.hg/undo.dirstate
%
The disk space such a clone requires is bounded by the size of the history metadata stored under .hg/.
Why Would You Want Such a Clone?
For a small repository like the one shown in this example, it seems pretty useless to be able to have a Mercurial clone without a working copy. You don’t really gain much by deleting the source of a small 9-line C program. The space savings of doing that are quite insignificant.
If, however, you are hosting clones of large repositories on a web server somewhere, stripping the working copy of Mercurial clones can be very handy, and it may save a large part of the disk space you would otherwise need to keep working copies around. By “large repository” I mean something like a single clone with several hundreds or thousands of files, or a clone whose working copy requires tens or hundreds of megabytes of data.
The OpenSolaris onnv-gate repository is one of the large repositories that use Mercurial. My own Mercurial-based mirror of the FreeBSD head branch is another example for which I readily have size data. Size information for these two repositories is shown in the table below:
| | FreeBSD head/ branch since 2008-01-01 | OpenSolaris onnv-gate repository |
|---|---|---|
| Tracked files | 41,807 | 44,784 |
| Changesets | 15,513 | 11,462 |
| Size of .hg store | 238 MB | 292 MB |
| Size of working copy | 385 MB | 543 MB |
Both of these Mercurial repositories have a moderately large number of files. More importantly, the size of the working copy exceeds the size of the .hg/ repository store in both cases. In the onnv-gate repository of OpenSolaris the working copy needs almost twice as much space as the entire history of the project (543 MB versus 292 MB). That’s a lot of disk space to carry around in all your local clones of onnv-gate!
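To get comparable numbers for one of your own clones, something along these lines should work. It is only a sketch: clone_sizes is a made-up helper name, it measures the whole .hg/ directory rather than just the store inside it, and it reports disk usage as seen by `du`, which can differ from the sum of the file sizes.

```shell
# Sketch: print the size of the .hg/ directory and of the working copy, in KB.
# clone_sizes is a hypothetical helper; "$1" is the path of a clone.
clone_sizes() {
    clone=$1
    # du -sk reports disk usage in 1024-byte units.
    total_kb=$(du -sk "$clone" | awk '{ print $1 }')
    hg_kb=$(du -sk "$clone/.hg" | awk '{ print $1 }')
    # Everything in the clone that is not under .hg/ counts as working copy.
    echo "hg_dir_kb=$hg_kb working_copy_kb=$((total_kb - hg_kb))"
}

# Example call (the path is an assumption):
#   clone_sizes /tmp/hgdemo
```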
If all you are looking for is a local mirror of the project sources, so that you can look at the history of a project, browse the diffs committed over time, or search for interesting commit information (e.g. “when was bug 6801336 fixed in OpenSolaris?”), carrying around a full working copy is probably a waste of space. Updating the files of the working copy after every pull operation from the upstream master repository is a waste of time too.



