http://invisible-island.net/
Copyright © 1996-2013,2014 by Thomas E. Dickey


DIFFSTAT – make histogram from diff-output

Synopsis

diffstat reads the output of diff and displays a histogram of the insertions, deletions, and modifications per file. It is useful for reviewing large, complex patch files.
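To make the idea of counting changes concrete, here is a minimal C sketch that tallies inserted and deleted lines in unified-diff text. It is not diffstat's actual implementation: the names `change_counts` and `count_changes` are hypothetical, only the unified format is considered, and any line beginning with `+++` or `---` is assumed to be a file header (which would miscount a changed line whose content itself begins with `++` or `--`).

```c
#include <string.h>

/* Hypothetical structure holding the totals for one diff. */
struct change_counts {
    int insertions;
    int deletions;
};

/*
 * Count added and removed lines in unified-diff text.
 * Simplifying assumption: lines starting with "+++" or "---"
 * are file headers and are not counted as changes.
 */
struct change_counts count_changes(const char *diff)
{
    struct change_counts counts = {0, 0};
    const char *line = diff;
    const char *eol;

    while (*line != '\0') {
        if (strncmp(line, "+++", 3) == 0 || strncmp(line, "---", 3) == 0) {
            /* file header: skip */
        } else if (line[0] == '+') {
            counts.insertions++;
        } else if (line[0] == '-') {
            counts.deletions++;
        }

        /* advance to the start of the next line, if there is one */
        eol = strchr(line, '\n');
        if (eol == NULL) {
            break;
        }
        line = eol + 1;
    }
    return counts;
}
```

A histogram row for a file could then be drawn by printing one `+` or `-` per counted line, scaled down when the total exceeds the report width.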

History

I originally wrote this in 1992, along with an associated utility rcshist, to trace the change history of collections of files. Since then, I've found it most useful for summarizing source patches.

See the changelog for details:

Impact

Initially, I used diff and diffstat in a script named diff-patch. In 1994, I started using makepatch, which gave more consistent results.

It was not until early 1996 that the tool drew much attention from others. At that point, developers on both the XFree86 and ncurses mailing lists started using it.

One of those developers (Tony Nugent) pointed it out to Linus Torvalds in July 1996, on linux.dev.kernel. Much later (in 2002), it was documented as part of the process for submitting Linux kernel patches for BitKeeper (BK) in Linux 2.4.20. Linus commented on the process:

Ok, pulled. But _please_ do this the regular way next time. There's even a script to help you do it in linux/Documentation/BK-usage/bk-mak-sum, which does it all for you for BK patches.

(many people end up doing their own thing, you don't have to use that particular script, of course. But the important thing I want is that the _email_ should contain enough information to make a good first pass judgement on what the patch does, and in particular it is important for me to see what a "bk pull" will actually change.)

That's why the "diffstat" is important to me if I do a BK pull – and why I want to see the patches as plaintext if I apply stuff to generic files..

Later, in 2005, Linus wrote git, which can generate a diffstat, with some enhancements (git is able to track moves and renames of files).

Nuisances

Bash-dependency

A change (see Ubuntu #209537) introduced a misfeature. Briefly, it checks whether a COLUMNS environment variable is set, and uses whatever value atoi decodes to override the default of 80 columns for the report width. My advice was overruled (the bug report offers a disingenuous reason; see this for the context in which the remarks were made).
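The behavior described above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not the text of the patch itself: the names `DEFAULT_WIDTH`, `width_from_columns`, and `report_width` are hypothetical, and only the use of a COLUMNS variable, atoi, and the 80-column default come from the description.

```c
#include <stdlib.h>

#define DEFAULT_WIDTH 80 /* default report width named in the text */

/*
 * Hypothetical helper: given the value of COLUMNS (or NULL if unset),
 * return the report width. Whatever atoi decodes is used as-is.
 */
int width_from_columns(const char *value)
{
    int width = DEFAULT_WIDTH;

    if (value != NULL) {
        /* atoi signals no error; non-numeric text decodes to 0 */
        width = atoi(value);
    }
    return width;
}

/* Hypothetical wrapper that consults the process environment. */
int report_width(void)
{
    return width_from_columns(getenv("COLUMNS"));
}
```

With this sketch, an unset variable leaves the width at 80 and a value such as "132" yields 132. A value with no leading digits, such as "abc", yields 0, since atoi performs no validation.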

There is more than one reason why that is not a suitable change:

The change was applied to the Debian package two years later, without discussion, immediately after a change of package maintainers (see Debian #588876). A user pointed out part of the problem with the change in Debian #697696, but made no headway with yet another maintainer.

Licensing

I changed the copyright notice of diffstat to use MIT-X11 licensing at the beginning of 1998 (version 1.26). Before that, I had used the same wording as I did in other works distributed from 1994 onward, e.g., the resizeterm patch. This change was likely prompted by my work to relicense ncurses, and also took into account an old (October 1996) discussion with Joey Hess.

The license is (of course) given in full as a comment at the top of the files which comprise the program. Notwithstanding this, some packagers find it inconvenient to cite the license properly. Here are a few examples:

Documentation


Download

Packages for diffstat

Version control systems

Version control systems which have implemented diffstats include:

Some are slower:

A few tools extend one or more of the version control systems, enabling their diffstat features to be used via the tool:

Other Uses

Besides imitating diffstat, there are embedded uses of the original tool:

Other implementations