http://invisible-island.net/personal/
Copyright © 2009–2020,2023 by Thomas E. Dickey

Change-Logs...

Introduction
Background
Keep it Simple
Just the Facts
Dates, of course
Contribution Categories
Problems in Categorization
Examples
Other changelogs
Other metrics
Reciprocity

Introduction

This is an overview of the guidelines which I use in maintaining change-logs and similar information for computer programs.

Background

One of the things that the maintainer does (or used to) is to keep the change-log up-to-date. Though I have been developing software for some time, it wasn't until 1992 that a combination of circumstances (declining in-house development opportunities, and the Internet) prompted me to provide fixes for "free" software.

By 1994, I had contributed changes to about 65 programs. In that process, I had of course encountered various personalities. But the worst of those were simply slow to incorporate the changes.

Starting in 1994, I arranged to have the programs which I had been developing for my personal use excluded from my employee agreement. These included ded (the motivation for the resizeterm function), vile and tin as well as related programs. One of the related programs was ncurses.

The case with ncurses was ... different. Rather than a single developer, there were two. And they used a mailing list, unlike most. The nominal maintainer was Zeyd Ben-Halim, who was rather nonresponsive. The result of submitting patches was not good—it seems that they intended to copyright everything for themselves. That's a workable situation if they wrote everything themselves. They did not.

For instance, they incorporated Juergen Pfeifer's libraries in 1995, which greatly increased the size of ncurses. Incorporating them added 11,183 lines of code (pcurses was just under 10,000 lines of code before it became ncurses). In 1.9.7a, Juergen's name appeared in 3 places in those libraries (two pro-forma READMEs and one comment in a makefile noting that optimization did not work properly). Zeyd and Eric's copyright notice appeared in 36 places in the same files.

The NEWS file notes:

* integrated Juergen Pfeifer's forms library.
* integrated Juergen Pfeifer's menu code into the distribution.

I noticed that patches were sent to the mailing list (including my own) and that the NEWS file would include the change, but not mention the contributor. My name appears in the NEWS file twice, as well, for that time period, though—as I pointed out later—I had done about half of the work. Not all of my changes were mentioned, and most of them were unattributed. The casual reader would assume that Eric and Zeyd did almost all of the work.

Zeyd, being the nominal maintainer, appears to have done most of the edits to NEWS. However, Eric (ESR) also sent changes to the mailing list that incorporated changes from others, without mentioning this in his announcements.

After I stopped sending patches to Eric and Zeyd in April 1996 (and began providing ncurses myself), I resolved to maintain the NEWS file with attribution for each contributor. That's the way we were doing it in vile and tin, for example. Philippe De Muyter suggested that I also note who reported the problem to be fixed. I began doing that a few weeks later, in late April.

Keep it Simple

Of course you're keeping your project in some type of revision control system. You can extract that information with various tools and render it as a change-log. Any idiot can do that.

Unfortunately, many change-logs are automatically generated, and indeed appear to have been generated by "any idiot".

Just the Facts

What is missing in many automatically-generated change-logs is the information which is typically not supplied by developers:

why the change was made, and
who reported the problem

One advantage of automatically-generated change-logs is that it is possible to get the dates on which changes were made. Not all automatically-generated logs show this, but it is a strong possibility.

Whether or not the change-logs are automatically-generated, there is an additional problem if changes are collected and applied by a project maintainer—recording the contributors consistently.

Dates, of course

Change-logs should have dates, to establish when a change was made.

Developers who do not supply dates on their changelogs have been known to “fix” problems with a release without noting the fact. Besides that nuisance, developers who omit dates tend to be sloppy about facts in other ways.

Contribution Categories

There are of course changes by primary contributors.

patch by

The patch is usable without rework required.

Often, for conciseness, the "patch by" is left out and only the name of the contributor given. They are equivalent.

As a rule, if I am applying a contributor's patch which (aside from formatting details) works properly, I use the rcs "-w" option to mark that revision as originating from that person. It is rare that patches good enough for this come from completely anonymous developers, so an appropriate string is seldom lacking.

Most patches require rework or adaptation.

integrated patch

The patch requires work, e.g., it is not ifdef'd as required for all optional features.

adapted from patch

The patch has some logic flaw, or requires modification to build and work.

analysis by...

Someone explained how to go about fixing the problem, or else provided a report detailed enough that the solution was apparent to the developer. This may or may not be the same person who reported the problem.

discussion with...

A discussion with someone brought out an idea, but it is unclear who was the source.

prompted by discussion with...

Talking to someone prompted me to realize a bug or solution. Without their input, the idea/fix would not have been apparent.

Occasionally their report and discussion is completely incorrect, but the "prompt" was useful. This does not apply to hostile or untruthful contributors of course.

In some cases, someone provides a suggested patch, but if it is unsuitable beyond illustrating the problem which was being discussed, then the changelog may read “prompted by patch...” while the actual implementation is different.

reported by...

Someone reported the problem, but did not provide the solution. That is, most people would not regard these as contributors, but a source of information which has to be investigated. When computing metrics, I do not count these, nor the closely related "prompted by", etc.

These categories are oriented toward direct communication with the program's maintainers. Accounting for indirect contributions is not as straightforward.

Problems in Categorization

There are a few basic problems to address:

assigning credit for indirect contributions
ensuring that contributions are assigned accurately
ensuring that contributions can be distributed in the same license

Bug-tracking systems

Bug-tracking systems are a major source of indirect contributions.

If all of the report is within the bug-tracking system, and there is no analysis by other people, nor proposed (useful) fixes, then I will cite only the bug-tracking system and its number for the bug.

On the other hand, if there are useful direct contributions toward the solution (reports without analysis are indirect), then I will cite those individuals in addition to the bug-tracking information.

Updates of Bundled Sources

A few files (such as config.guess and config.sub) are maintained by other developers. The changelog for these says "updated", and if the origin is volatile (the config.* scripts are a good example of this) or relatively obscure, says where it was found. Read their changelog for credits.

Hostile/untruthful contributors

Bear in mind that I am not a public service.

I get some reports indirectly, via web-searches in various forums. Some of the comments are useful, others partly (because they point out details for an issue). However, it is not uncommon for those to be mixed in with secondhand comments. As is usual with hearsay, much of it is inaccurate, and much of the repetition in public forums is not intended to be constructive commentary.

Still, an occasional comment is useful.

Of course, in this case, I will categorize it as "adapted from", etc., noting that this automatically makes it an indirect contribution rather than a direct contribution.

If the information is from a discussion between different individuals, none of whom appears to be knowledgeable about the issue, I will simply cite the group where the information was given.

People who attempt to use bug-reporting systems as a soapbox fall into this category, of course. For those unfamiliar with the term, this refers to a variety of misbehavior, including:

insisting on raising the criticality of a bug report to attempt to bludgeon the developer into making some proposed change. Because I will not work on a bug report before agreeing on what the problem is, and how important it is, the report is dead right at that moment.
making speeches in the bug-tracking system to the effect that some aspect of the program's design should have been done differently. The speech (might be) ignored, provided that there is a workable patch provided by that individual which addresses the issue without impacting other users.

As a caveat, not all “bug-tracking” systems are equal. Granted, bug-reports are not always welcome. But the bug-tracking system has to be reliable:

publicly accessible, and
not readily revised to present a different story at a later time.

The issue-tracking systems provided with GitHub and GitLab (writing this in May 2019) are not reliable because changes to comments are not visible to others. In some cases, the project maintainers can (and some do) readily delete and modify comments to adjust a story to their advantage.

Anonymous contributors

Anonymous reports are not uncommon. Useful fixes from anonymous people are much less common (see discussion). When considering these, there are several factors to consider:

whether the alias is just a “pen name” for a readily identified developer, or
whether the alias is recent, or longstanding,
whether the fixes are obviously useful, or debatable.

For the pen names, I cite the actual (or apparent) name.

On occasion I get suggested fixes which neither come from a readily identified person nor fit into the design of whatever program is being discussed. For those, I may adapt the change.

Anonymous or “not”, I do not use bug reports containing information from Wikipedia:

where it is accurate, it has been copied from a more useful source (usually tweaked to slip under Wikipedia's low threshold on verbatim copying, a threshold meant to avoid copyright infringement rather than plagiarism), and
where it is not accurate, someone has spent time to make up facts, for whatever reason.

Examples

Not all of the change-logs are in the same textual format. I wrote a script which handles the most common cases, and have massaged some change-logs to follow the format which it recognizes, to collect information about contributors. Essentially, it reads the text, looking for the markers which I use to denote direct- and indirect-contributions, and gives totals and names for the direct contributions.
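A minimal sketch of what such a script might look like, not the actual script. It assumes that each credit appears in parentheses, such as "(patch by ...)", and uses the marker phrases from the categories above. The regular expression, the choice of which markers count as direct contributions, and the names in the sample log are all assumptions made for illustration.

```python
import re
from collections import Counter

# Marker phrases from the contribution categories. True means the credit is
# counted as a direct contribution; that split is an assumption of this sketch.
MARKERS = {
    "adapted from patch by": True,
    "integrated patch by": True,
    "patch by": True,
    "analysis by": True,
    "prompted by discussion with": False,
    "prompted by patch by": False,
    "discussion with": False,
    "reported by": False,
}

# Longest phrases first, so a longer marker is never cut short by a shorter one.
_PHRASES = sorted(MARKERS, key=len, reverse=True)
_CREDIT = re.compile(
    r"\((?P<marker>" + "|".join(re.escape(p) for p in _PHRASES) + r")"
    r"\s+(?P<name>[^()]+?)\)"
)


def tally(text):
    """Return (direct, indirect) Counters of the names credited in a change-log."""
    direct, indirect = Counter(), Counter()
    flat = " ".join(text.split())  # undo line wrapping inside entries
    for match in _CREDIT.finditer(flat):
        bucket = direct if MARKERS[match["marker"]] else indirect
        bucket[match["name"].strip()] += 1
    return direct, indirect


# Hypothetical change-log fragment; the names are invented.
SAMPLE = """\
20100501
    + fix a typo in the manual page (patch by Jane Doe).
    + correct a buffer overrun (reported by John Roe).
    + add a configure check for a missing header (adapted from patch by
      Ann Poe).
    + fix a typo in a comment (patch by Jane Doe).
"""

if __name__ == "__main__":
    direct, indirect = tally(SAMPLE)
    print("direct contributions:", sum(direct.values()))
    for name, count in direct.most_common():
        print(f"  {name}: {count}")
```

A real change-log needs more care than this, for instance several credits inside one pair of parentheses, or entries with no explicit marker at all.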

For some (lynx and vile) I have not reformatted the older change-logs. In those cases, the dates below correspond to the beginning of the change-logs that I have reformatted.

With vile, I may do this (reformat the logs) sometime, since I have software archives to its beginning in August 1990, and the changelogs identify all contributions.
Lynx is harder, since the changelogs for 2.4 through 2.6 have 5-20 percent of their entries without an identifiable author. Most of the entries in the 2.3 changelog are unattributed. Also, there was no software archive in use until Klaus Weide put one together in 1997, using PRCS.

As in ncurses, an attempt to give statistics for those changelogs would probably be unfair to the contributors whose work was not deemed a major change.

Here is a list (as of May 2010) of the change-logs for which I have useful metrics, noting the percentage for my own contributions, and the number of other contributors (disregarding "external", since there is no active involvement).

Program	Percent	Other	Date
diffstat	81	12	June 1994
xterm	83	150	January 1996
ncurses	76	176	April 1996
vttest	96	3	June 1996
lynx	45	136	February 1997
vile	76	36	November 1999
dialog	78	64	December 1997
cdk	85	24	May 1999
byacc	97	4	February 2002
luit	91	0	August 2006
mawk	73	6	September 2008

Other changelogs

I use rcs2log for a few programs (ded, byacc, autoconf macros, etc.), which did not have a history of other contributors, and/or which are very stable.

The number of changes shown by rcs2log is different from the conventional change-logs:

In practice, there are many minor changes which would be just clutter in a change-log.
Changes which are adapted or otherwise not accepted as is do not use the contributor's name on the check-in.
Of course, contributors keep their own records, which differ in granularity as well. A good change-log is a good compromise which tells the story.

Here is a more complete table, from May 2017, which shows both sets of data where applicable (and excluding programs such as ded which have no other contributors):

		Manually edited				rcs2log generated
Program	Log started	Changes	By me	Percent	Others	Changes	By me	Percent	Others
diffstat	1994/06	135	101	74.8%	17	584	566	96.9%	10
cproto	1994/08	153	139	90.8%	3	935	889	95.1%	4
xterm	1996/01	2271	1897	83.5%	185	12106	11848	97.9%	89
ncurses	1996/04	5045	3926	77.8%	235	19483	18365	94.3%	175
vttest	1996/06	215	203	94.4%	5	1338	1333	99.6%	3
bcpp	1996/10	175	163	93.1%	9	496	495	99.8%	1
lynx	1997/01	3565	1874	52.6%	135	4323	4185	96.8%	48
dialog	1997/12	830	645	77.7%	81	4068	3974	97.7%	42
cdk	1999/05	470	372	79.1%	38	2334	2261	96.9%	24
vile	1999/11	2889	2717	94.0%	48	15140	14008	92.5%	51
cdk-perl	2001/01	39	36	92.3%	2	189	185	97.9%	3
byacc	2002/02	N/A	N/A	N/A	N/A	826	737	89.2%	13
luit	2006/08	114	103	90.4%	2	960	957	99.7%	3
mawk	2008/09	234	179	76.5%	10	1534	1485	96.8%	7
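
The Percent columns are the changes by me divided by the total changes. A small sketch that reproduces two of the manually edited rows; the function name is only illustrative.

```python
def percent_by_me(changes, by_me):
    """Share of change-log entries attributed to the maintainer, in percent."""
    return round(100.0 * by_me / changes, 1)


# (program, changes, by me) taken from the manually edited columns above
rows = [("diffstat", 135, 101), ("lynx", 3565, 1874)]
for program, changes, by_me in rows:
    print(program, percent_by_me(changes, by_me))
# prints "diffstat 74.8" and "lynx 52.6", matching the table
```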

Other metrics

There are other ways to measure contributions. Not all of them work as well as inspecting the change-log.

For instance, the Orbiten survey several years ago ignored the change-logs and RCS identifiers in my projects, and credited virtually all of my work to other people. Some of those credited were never contributors. Rather, Orbiten noted the mention of various individuals and organizations in READMEs and comments, and credited them with the entire work.

Other people have pointed out that Orbiten also did not factor out programs such as libtool, which are bundled with other programs.

Any metric requires inspection and tuning to validate the results. Lacking that step, the metric is worthless.

Reciprocity

Unsurprisingly enough, my change-logs cite contributions from people who also maintain change-logs. Not everyone reciprocates, e.g., some developers who borrow from my work do not. I don't work with those people.