Copyright © 2016 by Thomas E. Dickey
I have been tracking my projects using RCS since the late 1980s. I used SCCS for a few years before then. Developers who use my programs are accustomed to using patches and tarballs to track changes to these files, using their own procedures.
On the other hand, there are other people who would like to casually browse the source history. Not all of them are developers.
While I was working with XFree86, occasionally someone would ask where they could find a source-history for xterm, and I would point to XFree86's CVS. I stopped committing to that in mid-2006, and the repository finally became defunct in 2016 (ten years later).
Providing my own CVS (or whatever) web-accessible repository would take away development time and be expensive. On rare occasions, someone else would offer to do some part of this:
Git is a little more flexible than tar-balls:
Those were helpful, but did not complete the task. In the case of ncurses, the repository made my weekly patches web-accessible, starting with ncurses 5.6 (in 2006). During a given week, I make several changes, and the result is the weekly patch. Since most of the information would be omitted, I disregarded that.
Mawk seemed more promising. However, I ran into a few problems:
Initially, I thought that I could replay my changes onto Nieder's initial git repository. I spent a few weeks on that effort, only to find that git's merge capability was not flexible enough to make a reliable/scriptable procedure.
Nieder had suggested rcs-fast-export as a tool.
My RCS wrappers (like CVS) can be used for lazy commits, i.e., no lock is needed for a file. Unlike CVS, I use the file's actual modification time for the identifiers. On checkout, the wrappers construct the identifiers.
rcs-fast-export.rb (a Ruby script) knows nothing about lazy commits. I modified the script to call my RCS wrappers to do the actual checkouts (which slows things down). If I were to replace those wrappers with additional scripting, another month or so would be needed to get the same resulting identifiers.
In case someone questions why I want the identifiers, remind them that git uses its own identifiers, and if those were removed, git would not be useful.
The Ruby script uses a lot of memory, and was written for Ruby 1.9. Using it with Ruby 2.0 did not work well.
Different versions of git did not interoperate. Really. That's a step backward from RCS, where I can work with my archives on machines across a wide range of operating systems and versions.
The Ruby script writes directly to git's internal data structures. Those are undocumented.
The Ruby script handles only RCS archives with no branches. I use RCS branches in several of my projects.
Each export of my RCS archive using the script generates a new set of hash codes, making it impossible to transfer updates to an older export of the same archive.
Because of these problems, the Ruby script (while interesting) was not useful. I could not use it on the larger projects, even for one-shot uses such as I did for mawk:
In May 2016, someone commented that there were several forks from my byacc tar-balls on github, and that was a problem. Actually, since none of the “forks” had been improved, there was little to discuss since there were no potential changes to merge back. In any case, I would merge into my RCS archives.
But that prompted me to think about just (as in ncurses) constructing a git-ball with the labeled revisions from my RCS archives. As of mid-November 2016, I have (using rcs2log):
Sedeño's git repository has about half of those labeled revisions: 48% in mid-November 2016.
I label things when I reach a milestone, whether or not that is when I decide to release a set of changes. By exporting the labeled revisions (rather than a complete archive), the result would still be useful, as well as being practical. But keep in mind that the number of labels is far smaller than the actual number of changes.
In my revised approach, it is possible to do incremental exports from the archive onto a git-ball. Exporting all of the labels in the ncurses archive takes hours; exporting the latest revision takes a minute or so.
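As a rough illustration of why an incremental export is so much cheaper, the selection step can be thought of as a set difference between the labels in the archive and those already present in the git-ball. This is a minimal sketch, not the actual scripts: the function name, the label names, and the assumption that each exported label becomes a git tag of the same name are all hypothetical.

```python
def pending_labels(archive_labels, exported_tags):
    """Return the archive labels (in their original order) not yet in the git-ball."""
    done = set(exported_tags)
    return [label for label in archive_labels if label not in done]


# Hypothetical label names, listed oldest first.
archive_labels = ["v1_0", "v1_1", "v1_2", "v1_3"]
exported_tags = ["v1_0", "v1_1", "v1_2"]

# Only the newest label still needs to be checked out and committed.
todo = pending_labels(archive_labels, exported_tags)
print(todo)
```

A full export walks every label; an incremental one touches only the entries returned here.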
I wrote a script to do most of the work, and run that from a script r2g which knows how to manage the git-balls which I create and update. In some projects (such as xterm), the MANIFEST file is generated using the manifest script. That and other special cases are handled by
Not all of my projects had labels. I wrote another script to label some of those using a cutoff-date. That worked for most, but not for the older vi-like-emacs archives which I received from Paul Fox in 1996. For those, I wrote a custom script (like tag-cutoff) and used it to label the files.
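Labeling by cutoff-date can be pictured as choosing, for each file, the newest revision dated on or before the cutoff. The sketch below shows that selection only; it is an assumption about the general idea, not the tag-cutoff script itself, and the file names, revision numbers, and dates are invented.

```python
from datetime import datetime


def revision_at_cutoff(revisions, cutoff):
    """Return the id of the newest revision dated on or before cutoff, or None.

    revisions is a list of (revision_id, datetime) pairs in any order.
    """
    eligible = [(when, rev) for rev, when in revisions if when <= cutoff]
    if not eligible:
        return None
    return max(eligible)[1]


def label_plan(archive, cutoff):
    """Map each file to the revision that a label at cutoff would mark.

    Files with no revision on or before the cutoff are left out.
    """
    plan = {}
    for name, revisions in archive.items():
        rev = revision_at_cutoff(revisions, cutoff)
        if rev is not None:
            plan[name] = rev
    return plan


# Hypothetical archive contents.
archive = {
    "main.c": [("1.1", datetime(1995, 1, 10)), ("1.2", datetime(1995, 6, 1)),
               ("1.3", datetime(1996, 3, 5))],
    "util.c": [("1.1", datetime(1995, 2, 1))],
    "new.c": [("1.1", datetime(1996, 4, 1))],
}

plan = label_plan(archive, datetime(1995, 12, 31))
print(plan)
```

A file created after the cutoff (new.c here) gets no label, which matches the intent of a snapshot as of that date.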
I have generated git-balls for all of the projects which I share with others (as well as a few which I do not). You can see the result here:
Scripts are a different matter. Those all live in common directories, from which I generate the tar-balls that you can download from my scripts page. I suppose that I could generate git-balls for those as well.
When I release changes to one of my projects, I do this:
r2g to generate an updated git-ball (the old one is renamed, keeping a backup copy), and
push2github to update Github's copy of my git-ball.
The two steps use different keys, just in case.
Just as a reminder: these are snapshots. If someone wants to make a change, that will be done the same way as other projects, by first integrating into the RCS archives, and then exporting, updating the snapshots.