Copyright © 2009-2012,2013 by Thomas E. Dickey
This is an overview of the guidelines which I use in maintaining change-logs and similar information for computer programs.
One of the things that the maintainer does (or used to) is to keep the change-log up-to-date. Though I've been developing software for some time, it wasn't until 1992 that a combination of circumstances (declining in-house development opportunities, and the Internet) prompted me to provide fixes for "free" software.
By 1994, I had contributed changes to about 65 programs. In that process, I had of course encountered various personalities. But the worst of those were simply slow to incorporate the changes.
Starting in 1994, I arranged to have the programs which I had been developing for my personal use excluded from my employee agreement. These included ded (the motivation for the resizeterm function), vile and tin as well as related programs. One of the related programs was ncurses.
The case with ncurses was ... different. Rather than a single developer, there were two. And they used a mailing list, unlike most. The nominal maintainer was Zeyd Ben-Halim, who was rather nonresponsive. The result of submitting patches was not good—it seems that they intended to copyright everything for themselves. That's a workable situation if they wrote everything themselves. They did not.
For instance, they incorporated Juergen Pfeifer's libraries in 1995, which greatly increased the size of ncurses. Incorporating them added 11,183 lines of code (pcurses was just under 10,000 lines of code before it became ncurses). In 1.9.7a, Juergen's name appeared in 3 places in those libraries (two pro-forma README's and one comment in a makefile noting that optimization did not work properly). Zeyd and Eric's copyright notice appeared in 36 places in the same files.
The NEWS file notes:
* integrated Juergen Pfeifer's forms library.
* integrated Juergen Pfeifer's menu code into the distribution.
I noticed that patches were sent to the mailing list (including my own) and that the NEWS file would include the change, but not mention the contributor. My name appears in the NEWS file twice, as well, for that time period, though—as I pointed out later—I had done about half of the work. Not all of my changes were mentioned, and most of them were unattributed. The casual reader would assume that Eric and Zeyd did almost all of the work.
Zeyd, being the nominal maintainer, appears to have done most of the edits to NEWS. However, ESR also sent changes to the mailing list incorporating changes from others without mentioning this in his announcements.
After I stopped sending patches to Eric and Zeyd in April 1996 (and began providing ncurses myself), I resolved to maintain the NEWS file with attribution for each contributor. That's the way we were doing it in vile and tin, for example. Philippe De Muyter suggested that I also note who reported the problem to be fixed. I did that.
Of course you're keeping your project in some type of revision control system. You can extract that information with various tools and render it as a change-log. Any idiot can do that.
Unfortunately, many change-logs are automatically generated, and indeed appear to have been generated by "any idiot".
What is missing in many automatically-generated change-logs is the information which is typically not supplied by developers:
One advantage of automatically-generated change-logs is that it is possible to get the dates on which changes were made. Not all automatically-generated logs show this, but it is a strong possibility.
Whether or not the change-logs are automatically-generated, there is an additional problem if changes are collected and applied by a project maintainer—recording the contributors consistently.
There are of course changes by primary contributors.
Often, for conciseness, the "patch by" is left out and only the name of the contributor given. They are equivalent.
As a rule, if I am applying a contributor's patch which (aside from formatting details) works properly, I use the rcs "-w" option to mark that revision as originating from that person. It is rare that patches good enough for this come from completely anonymous developers, so an appropriate string is seldom lacking.
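As a rough illustration of that practice, the sketch below builds a check-in command that records the contributor as the revision's author. It assumes the "-w" option meant here is the one accepted by the RCS `ci` command; the helper names `build_ci_command` and `check_in`, and the login and file names in the usage note, are hypothetical and not taken from any actual workflow.

```python
import subprocess


def build_ci_command(path, contributor_login, log_message):
    """Return an argv list for checking in `path` with the RCS `ci` command.

    -u  retrieves an unlocked (read-only) working copy after check-in
    -w  records `contributor_login` as the author of the new revision
    -m  supplies the log message non-interactively
    """
    # Assumption: the author field is a plain login-style word.
    if not contributor_login or any(ch.isspace() for ch in contributor_login):
        raise ValueError("contributor login must be a single word")
    return ["ci", "-u", f"-w{contributor_login}", f"-m{log_message}", path]


def check_in(path, contributor_login, log_message):
    # Runs the real `ci`; this needs RCS installed and a file ready to check in.
    subprocess.run(
        build_ci_command(path, contributor_login, log_message), check=True
    )
```

For example, `build_ci_command("foo.c", "jdoe", "fix typo")` yields the argument list for `ci -u -wjdoe "-mfix typo" foo.c`. Keeping the command construction separate from `check_in` makes it possible to inspect the command without touching any files.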
Most patches require rework or adaptation.
Occasionally their report and discussion is completely incorrect, but the "prompt" was useful. This does not apply to hostile or untruthful contributors of course.
These categories are oriented toward direct communication with the program's maintainers. Accounting for indirect contributions is not as straightforward.
There are a few basic problems to address:
Bug-tracking systems are a major source of indirect contributions.
If all of the report is within the bug-tracking system, and there is no analysis by other people, nor proposed (useful) fixes, then I'll cite only the bug-tracking system and its number for the bug.
On the other hand, if there are useful direct contributions toward the solution (reports without analysis are indirect), then I'll cite those individuals in addition to the bug-tracking information.
A few files (such as config.sub) are maintained by other developers. The changelog for these says "updated", and if the origin is volatile (the config.* scripts are a good example of this) or relatively obscure, says where it was found. Read their changelog for
Bear in mind that I'm not a public service.
I get some reports indirectly, via web-searches in various forums. Some of the comments are useful, others only partly so (because they point out details for an issue). However, it is not uncommon for those to be mixed in with secondhand comments. As is usual with hearsay, much of it is inaccurate, and much of the repetition in public forums is not intended to be constructive commentary.
Still, an occasional comment is useful.
Of course, in this case, I'll categorize it as "adapted from", etc., noting that its source automatically makes it an indirect contribution rather than a direct contribution.
If the information is from a discussion between different individuals, none of whom appears to be knowledgeable about the issue, I will simply cite the group where the information was given.
In general, we would assume that developers submit their own work. This is not always true.
When reviewing a change, I do take the time to scrutinize it and attempt to determine a proper attribution for the change. It happens that I may notice (or recall, if I'm subscribed to a given mailing list) that the change was originally developed by a different individual. In that case, I'll amend the description to cite the actual developer. If the code has a comment citing the developer, that suffices, though even that has been a matter of dispute on occasion, when the intermediary insists on sharing the credit.
Individuals who do this repeatedly (there are a few) will either be banned, or subject to scrutiny on every change. In either case, they generally go away and provide their services to a different project. Rather than leave, some of these use the public bug-tracking systems as a forum.
Change-logs should have dates, to establish when a change was made.
Not all of the change-logs are in the same textual format. To collect information about contributors, I wrote a script which handles the most common cases, and have massaged some change-logs to follow the format which it recognizes. Essentially, it reads the text, looking for the markers which I use to denote direct and indirect contributions, and gives totals and names for the direct contributions.
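A scan of that kind might look like the following minimal sketch. It is not the actual script: the parenthesized layout, the "report by" marker, and the grouping of markers into direct and indirect are assumptions made for illustration, as are the function name `tally_contributors` and the sample names.

```python
import re
from collections import Counter

# Assumed marker phrases. "patch by" and "adapted from" are treated as direct
# contributions here; "report by" is a hypothetical marker counted as indirect.
DIRECT_MARKERS = ("patch by", "adapted from")

MARKER_RE = re.compile(
    r"\((?P<marker>patch by|adapted from|report by)\s+(?P<name>[^()]+?)\)",
    re.IGNORECASE,
)


def tally_contributors(changelog_text):
    """Return (Counter of direct contributors by name, count of indirect markers)."""
    direct = Counter()
    indirect = 0
    for match in MARKER_RE.finditer(changelog_text):
        marker = match.group("marker").lower()
        # Collapse whitespace so a name wrapped across lines counts only once.
        name = " ".join(match.group("name").split())
        if marker in DIRECT_MARKERS:
            direct[name] += 1
        else:
            indirect += 1
    return direct, indirect


if __name__ == "__main__":
    sample = (
        "20100501\n"
        "\t+ fix a typo in the manpage (patch by Jane Doe).\n"
        "\t+ correct a buffer overrun (report by John Roe).\n"
        "\t+ improve a configure check (adapted from\n"
        "\t  Jane Doe).\n"
        "\t+ update config.sub\n"
    )
    direct, indirect = tally_contributors(sample)
    for name, count in direct.most_common():
        print(f"{name}: {count}")
    print(f"indirect: {indirect}")
```

Entries without a marker, such as the last line of the sample, are simply skipped. A change-log that uses a different layout would first need to be massaged into the form the pattern recognizes.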
For some (lynx and vile) I have not reformatted the older change-logs. In those cases, the dates below correspond to the beginning of the change-logs that I have reformatted.
As in ncurses, an attempt to give statistics for those changelogs would probably be unfair to the contributors whose work was not deemed a major change.
Here is a list (as of May 2010) of the change-logs for which I have useful metrics, noting the percentage for my own contributions, and the number of other contributors (disregarding "external", since there is no active involvement).
rcs2log for a few programs (ded, byacc, autoconf macros, etc.), which did not have a history of other contributors, and/or which are very stable.
There are other ways to measure contributions. Not all of them work as well as inspecting the change-log.
For instance, the Orbiten survey several years ago ignored the change-logs and RCS identifiers in my projects, and credited virtually all of my work to other people. Some of those credited were never contributors. Rather, Orbiten noted the mention of various individuals and organizations in README's and comments, and credited them with the entire work.
Other people have pointed out that Orbiten also did not factor out programs such as libtool, which are bundled with
Any metric requires inspection and tuning to validate the results. Lacking that step, the metric is worthless.
Unsurprisingly enough, my change-logs cite contributions from people who also maintain change-logs. They do not necessarily reciprocate, e.g., some developers who borrow from my work. I don't work with those people.