There are many times when I look back in FreeBSD’s source code control history to better understand a particular piece of code I am working on. I can better understand why a section of code is written the way it is by seeing the changes in that section of code across time along with the rationale for those changes in the log messages. Sometimes while doing this I run into a wall since FreeBSD’s history stops at the import of the 4.4BSD sources from CSRG. I have often longed to dive farther back into the code’s history.
The good news is that the folks at CSRG maintained their own source code repository using SCCS. Dr. Marshall Kirk McKusick sells a 4-disc CD set that contains the SCCS repository (along with several other bits). SCCS is a bit awkward to work with, however, so recently I derived a Subversion repository from the SCCS repository.
The main tool I used was sccs2svn.py. This script required a couple of FreeBSD ports to be installed: devel/py-subversion and devel/cssc. In addition, I had to patch sccs2svn.py to fix several things:
- Subversion 1.7 requires client.svn_client_create_context() to create a context rather than client.svn_client_ctx_t().
- Fixups to correct the timestamp on each commit in SVNInterface._commit() were applied to the previous revision rather than the revision being committed.
- One CSRG commit contained CR (‘\r’) characters in a log message that had to be converted to LF (‘\n’) to be accepted by Subversion.
- One SCCS file contained a corrupted log message that was not valid UTF-8. I used this patch to get past the failure so I could identify and fix the corrupted log message.
- By default, the sccs2svn.py script makes two passes over the entire repository at the end of the conversion process, applying fixups to SCCS keywords and ID strings. These passes are intended to allow for future development in the resulting Subversion repository. When converting a historical repository, however, they merely add noise and obfuscation. I added a new --pure option to disable these passes.
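The two log-message problems above (stray CR characters and invalid UTF-8) can be illustrated with a small sketch. This is not the actual patch to sccs2svn.py; the function name `normalize_log_message` is hypothetical, and the code assumes Python 3 and that the log message is available as raw bytes.

```python
def normalize_log_message(raw):
    """Return a log message as text with LF line endings.

    Hypothetical helper, not part of sccs2svn.py. Raises ValueError if the
    bytes are not valid UTF-8 so the offending SCCS file can be inspected.
    """
    # Collapse CRLF first so it does not turn into two newlines,
    # then convert any remaining bare CR to LF.
    cleaned = raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    try:
        return cleaned.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError(
            "log message is not valid UTF-8 at byte %d: %r" % (exc.start, cleaned)
        ) from exc


if __name__ == "__main__":
    print(repr(normalize_log_message(b"first line\rsecond line")))
    try:
        normalize_log_message(b"i\xee\xe9t()")
    except ValueError as err:
        print("rejected:", err)
```

Reporting the byte offset rather than silently replacing the bad bytes matches the approach described above: get past the failure far enough to find the corrupted message, then fix the source file by hand.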
The command line used to run the conversion was:
python sccs2svn.py -u sccs2svn -o /home/csrg/svn -i /home/csrg/sccs -p
A few of the SCCS files contained single-bit errors, and the repository also contained a few “bad” SCCS files that muddled the results. I simply removed the “bad” SCCS files from the copy of the SCCS repository I used for the conversion. I fixed the following errors by hand:
- usr.bin/pascal/src/SCCS/s.main.c had a corrupted status line for its initial revision where a ‘0’ (0x30) had been replaced with a ‘p’ (0x70).
- usr.bin/pascal/pdx/machine/SCCS/s.printerror.c had a corrupted status line for its initial revision where a ‘5’ (0x35) had been replaced by a ‘^U’ (0x15).
- usr.bin/pascal/pdx/machine/SCCS/s.printerror.c had a corrupted log message for its 1.2 revision where “init()” had been replaced by “i\xee\xe9t()”. That is, ‘n’ (0x6e) had been replaced by 0xee, and ‘i’ (0x69) had been replaced by 0xe9.
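Each of the byte substitutions listed above differs from the original in exactly one bit, which can be checked by XORing the two values. The following is a minimal sketch in Python 3; the helper name `flipped_bits` is made up for this illustration, and the byte pairs are the ones given in the list.

```python
def flipped_bits(original, corrupted):
    """Return the bit positions (0 = least significant) where two bytes differ."""
    diff = original ^ corrupted
    return [bit for bit in range(8) if diff & (1 << bit)]


# (expected byte, corrupted byte) pairs taken from the errors listed above.
PAIRS = [
    (0x30, 0x70),  # '0' became 'p'
    (0x35, 0x15),  # '5' became ^U
    (0x6E, 0xEE),  # 'n' became 0xee
    (0x69, 0xE9),  # 'i' became 0xe9
]

if __name__ == "__main__":
    for good, bad in PAIRS:
        print("%#04x -> %#04x: flipped bits %s" % (good, bad, flipped_bits(good, bad)))
```

Every pair yields a single bit position, so each corrupted byte is one bit flip away from the intended character.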