http://invisible-island.net/personal/
Copyright © 2009-2013,2014 by Thomas E. Dickey


Old, but Interesting Programs

Old, but Interesting Programs

Here is a brief description of some of the more interesting programs that I worked on, long ago. This does not count the programs that I was told to produce—some of those are interesting as well.

The source code is no longer available for these (and as was the practice, was then available only on a limited basis generally with peers). Others have claimed that source-code was just given away in the 1970s and 1980s. On the contrary, all of the vendor-related source code (DEC, Univac, IBM) which I saw was available on a limited basis — non-disclosure. Perhaps others' experience differ, or else they chose to ignore the agreement.

1970
three-dimensional tic-tac-toe.

I wrote this in Fortran on an IBM 1620 computer. It was not the first computer program that I wrote, but it was the first interesting one.

This used a 4x4x4 array.

In my initial programming class, we used a 1620 model I, with 12,000 characters of memory. Midway through that year, the college purchased a model II from a neighboring college which purchase an IBM 1130. We continued using the model II through my graduation.

The model II computer had 20,000 characters of memory (making this a low-end system, according to IBM's datasheet). I used a lookup table to store the association between positions and the ways to win, i.e., filling in a line of four positions. The combination of the program to fill the table and the game itself was too large to fit in the available memory. So I split it into two parts:

  • initialize the table, and then (keeping the table in memory), transfer control ("chain") to
  • run the game

The first program was generated (about 800 assignments) by a separate program. The game was about 200 statements. So a thousand-line program was too large for that computer.


1974
CSM – control-system modules
As a graduate student, I began programming full-time, i.e., 60 hours a week. I was supported by a project whose head said at the beginning, "We'd like to know about microprocessors".

In context, that meant in the realm of feedback control systems. But first, I needed more experience than I had at that point. So I set out to build software which could be used for constructing block-diagram modules useful for control systems. Like many first attempts, there was a lot of learning involved, both for the technology as well as learning what the conventions were. I did some odd things, before learning how valuable conventional behavior can be:

  • because I was developing using a teletype (continuous feed paper), I wanted to limit the amount of paper used. CSM decided when to advance the teletype's carriage to the next line, and treated any valid end of command the same. It did not use much paper, but the output was hard to read.
  • CSM used floating point. But (since PDP-11's tended to use octal), I did all of the numeric I/O in octal. That included octal floating point, which in turn provoked endless discussion.

Perhaps related, at that time I used to go grocery shopping once or twice a week. The cash amounts were small (about $10), and I could walk through the store, adding up the total in my head and come to the register with the correct amount (counting tax). I noticed that I was making errors occasionally, and realized that it was more likely when there was an 8 or a 9 in the result. Those are non-octal, of course.


1975
simic – cross-assemblers and simulators for more than a dozen microprocessors.

These were written in assembler on a PDP-11. My department acquired two PDP-11's, funded by research money. Initially they ran CAPS-11 (cassette tapes), but shortly after (early 1975), ran RT-11 (2.5Mb RK05 disks).

I named the cross assembler micro (a twist on macro, the name of DEC's macro assembler), and the simulator simodt (from simulated ODT, DEC's "octal debugging tool"). Combined, they were named simic. Besides making the names like DEC's, but different, I preferred to make the program's names match the names of the executable files. On RT-11 that meant no more than six characters (because filenames were stored in radix-50 format, which packed three characters into 16-bits). My initial choice for the simulation module's name (simula) was unsatisfactory since a peer pointed out that there was a well-known computer language with that name.

The cross assemblers supported macros, conditional assembly, all of the interesting features that DEC's assemblers did. Both were modular, most of the code being in common (about 80% for the simulators and above 95% for the cross-assemblers).

Using this modular approach, I wrote 11 simulators and 14 assemblers. Basically, if I was able to obtain a data sheet for it, an assembler took less than a day. Simulators took 3-4 days to write, since there was less opportunity for reusing functions.

Here is a list of microprocessors from that period for which I have a data sheet or manual:

  • Fairchild F8
  • Intel 4004
  • Intel 8008
  • Intel 8080
  • Intersil IM6100
  • MOS Technology 6502
  • Motorola 6800
  • National Semiconductor IMP-16
  • National Semiconductor IMP-8
  • Plessey miproc-16
  • Rockwell PPS-4
  • Rockwell PPS-8
  • Scientific Micro Systems MicroController
  • Texas Instruments TMS 9900

I did most of this work during summer and fall of 1975. There are a few which have no data sheets because they were not for commercially available microprocessors:

  • a processor which had been designed by Jack McDonald—my predecessor on the project—using DEC PDP-16 components aka "register transfer modules" (RTMs). I did this to help with a proposal for renewing the grant which was supporting me. The components had since been reclaimed for reuse in other projects, and what I had to work with was a summary of the instruction set, and a hand-generated listing of a sample program. I used that information to make an assembler and simulator, and fixed some errors in the sample program.
  • a wiring-list generator for RTMs. The idea was suggested by an associate, who pointed out that a wiring list generator is just a way of organizing information and annotating it, much like an assembler. He also said that it was similar to a suggested topic for independent research with an instructor who had acquired a poor reputation for several issues—one was the large fraction of students who ended up with an incomplete). Given that, I was curious to see how well I would do. I completed my proposed research.

PDP-11's had a 56Kb address space (plus 8Kb for I/O, for 16-bit addresses). That made them cheap research machines. Not all PDP-11's even had 56Kb of core memory (semiconductor memory came later). Programming them mostly in assembler was not unusual.

I redid the macro processing after a while (in 1977 or 1978), when I was working on txt. The initial version stored the parsed data (a series of tokens). However, that proved to be hard to maintain and extend. So I changed this to work purely with character strings. That made it simpler to concatenate names, perform substitutions, etc.


1976
Basic line renumbering.

I do not recall what I named this.

Someone in the lab brought in a book with a StarTrek program (think "minesweeper", on a printing terminal, at 30 characters/second). But the program was too large to run on our machine (in 24Kb). The Basic interpreter could load sections of a program at run-time, e.g., like overlays. To exploit this feature (and squeeze the program into roughly half the space), it was necessary to resequence the overlaid chunks so they reused the same line numbers.

I spent a couple of days getting the program to work – probably the first nontrivial Basic program I wrote. Then I spent a half hour or so playing with the StarTrek program. That was not as interesting, so I put StarTrek away.

dirsrt – directory sort

RT-11 was a single-user system without any frills. Hierarchical directories came later, in other systems. RT-11's directory program simply read the directory blocks and printed their contents. With a half-dozen people sharing a minicomputer, and a thousand files, it is hard to keep track. I wrote an improved directory program dirsrt which made an index to the directory entries and printed them, sorted by any combination of name, extension, size, date. It also had a switch for writing a blank line between groups, e.g., DIRSRT /A/X/J would show files sorted by name within extensions, skipping a line between distinct extensions.

a subdirectory mechanism

As noted, RT-11 had only a flat directory structure. But several researchers shared a few machines, and there was a lot of clutter. I wrote a program which simulated RT-11's directory structure inside a large file.

It was interesting, but did not really solve the problem, since RT-11 did not have a way to create virtual device drivers. I could copy files in/out, and do directory listings; but to use the files, they had to be outside the container.

At the time, an associate commented that the concept was something like interchange-format, a reference to a program which has been obscured by other uses.

Another person said it was like a partitioned dataset.

In either case, I did learn something.

idl – an interface description language

Users of my microprocessor simulators complained that they were missing a useful feature: the ability to do I/O. I designed a simple language to describe ports, data transfer, manipulation of bits (masking, shifting), and called that IDL (for "Interface Description Language").

Some of the bit-manipulation was inspired by the SMS Microcontroller, which addressed memory in bits.

The following year, I gave a short presentation on that at one of the Computer Architecture workshops, though the fact that the interpreter was written in assembly language was a shortcoming.

Larry Tesler happened to be at that meeting, though by then he was interested in things other than pub.

Discussing my presentation with an associate at the time, he commented that there was at that time another “IDL” (a commercial product called “Interactive Data Language” introduced in 1977). The two are unrelated (mine was “Interface Description Language”). On the other hand, my language might have been the first to use the acronym, because I developed it in August 1976 (yes, I am aware of the much later use by Sun in the 1990s—probably unrelated).

repaginator

There were other system programs – including one which was obsolete within a day after I wrote it. In our lab we had no line printer – just printing terminals. The RT-11 text editor (same syntax as teco, but simpler) could be made to repeat commands to read a buffer, add extra lines, delete those past line 58, print the page, delete the page from memory, etc., to print a listing file. The command to do this was most of a line, and annoying to type in. So I wrote a program to do this. While I was at it, I considered it a waste of paper to print only 58 lines (on a 66-line page), so I made the program smart enough to recognize the header lines on a listing and move them, making the real page size larger, e.g., 62 lines.

The lab manager (Gary Leive) found that solution unappealing, and (having the sources for the system), made the terminal driver recognize form-feeds. Just a small change was needed. I discarded the repaginator.

txt – pub clone

During much of 1975-1976, I sublet another researcher's PDP-10 account to write manuals for simic. That featured teco, sos – and pub (a text formatter written by Larry Tesler at Stanford AI Lab around 1973). That came to an abrupt end (it was an informal agreement), and I rescued my files.

But it was a year spent learning to appreciate a nice (if not entirely efficient or consistent) tool. So I walked back to my lab and spent 15 hours writing the first thousand lines of txt. While pub had libraries to format for a wide variety of output devices, txt would "only" go to printing terminals. But it was a workable clone of txt, recognizing all of the formatting controls, implementing macros, footnotes, headers, floating sections, table of contents, etc. Five years later, at ten thousand lines (of assembler, of course), I was done with txt.

At the time, the term "word processing" was not used much. I first encountered it a couple of years later, and was confused at first, supposing it to be related to natural language processing. Instead, we referred to it as "text compiling", which was misleading. (I once overheard a comment about that fellow over in the other building writing compilers, and realized it was about me).

bfor – Fortran preprocessor

In 1976, I started writing a program for my degree which would compute a complexity metric based on a description of a computer's instruction set. Unlike the previous work I'd done as a graduate student, this would be for publication (or review, whatever – I did not know how the process would work at that point). On the minicomputer, I had only three languages to choose from: assembler, Basic or Fortran. The first two were unsuitable for this purpose (perhaps all three, in retrospect). I started writing my program in Fortran.

The minicomputer was small (with more memory, then 32kb), and I found that even a moderate-sized Fortran program would exceed its memory. This was DEC's RT-11 Fortran, which had a few extensions over Fortran 66, The solution was to split the program into overlay sections, which would share memory in COMMON-blocks. Soon, I had more than a dozen overlay sections.

Here, I ran into a new problem: splitting the program up into pieces like that made it less maintainable. Each part of the program would have its own copy of the COMMON-blocks. I spent three days isolating a problem due to an editing error which made one of the parts of the program have a different array size.

However, in my day job (I had one by then, since student grants do not last forever), I had encountered a new variety of Fortran, on Univac 1100.

Univac Fortran had many of the features which later appeared in Fortran 77. Later, Univac's Fortran 77 compiler was known as ASCII Fortran. Of special interest were include and parameter statements. Fortran 77 added the latter.

At the end of 1976, I started writing a Fortran preprocessor bfor, which had these features:

  • include statements
  • parameter statements
  • listing-file, showing both line-numbers and statement numbers.
  • cross-reference (of symbols)

Parsing Fortran properly is difficult: it uses statements rather than tokens. Blanks in the middle of a statement (other than strings, e.g., for formatted I/O) are ignored. These two lines are equivalent:

DOI=10,20
DO I=10,20

Unless you read up to the comma, the first line looks like an assignment statement. But it is not; it is the first statement of a do-loop.

These two lines are also equivalent:

END
E N D

I wrote bfor in assembler, of course, to fit in the small machine.

systat – system statistics

bfor, txt and simic were large chunks of what I was developing. The end-application written in Fortran was larger than any of these.

Just to try to keep track of what I was developing, I wrote a metrics program which showed a report of what was on the disk. The report was broken down by file-type (one or more suffixes per group), and showed the number of files, total file-size and number of lines. Initially I called it lincnt (a six-character name meaning "line count"). Later I renamed it systat (system statistics), to reflect my objective of managing all of the files on my disk.

Later, when I began programming in C (see spasm), I made a simplified version and reused the name lincnt for that.


1977
axis2 – function for Calcomp plotter

At work, the computer center had a Calcomp. plotter. Paper was fed from a long roll; the plotter's pen could go on all directions. But the width of the paper was only a couple of feet.

I used it to plot data for a computer simulation of a three-phase. power system. Divided into three parts (plus margins), that sounds like enough room for ample resolution. However, the Calcomp plotting software's axis function was limited. The axis annotation showed numbers and tic-marks. However, it had no provision for scaling the range shown to make the best use of the space available.

I designed my own axis function which optimized the display of tic-marks and numbers. Each axis would end with a tic-mark. Depending on the range, there might be large tic-marks (1,2,5,10,...) or small tic-marks (1,2,3,4,5,...). The numbers would appear on the large tic-marks. There were also special cases where those rules did not apply, just to make the result look better.

In practice, this was much harder than it sounds, since there is roundoff to consider (5 decimal digits was "good enough") and many special cases. The function was about 300 lines of Fortran, and took many hours to perfect.

serial interfaces...
My research project was in the doldrums for several months due to printer problems. Eventually it was resolved, as detailed here.

1978
dis – pdp11 disassembler

My work on the complexity metric hit a snag: I ran into a mysterious problem with the Fortran compiler. There was no high-level debugger; the only useful tool was to write trace files. However, that gave no insight.

DEC's Fortran compiler generated threaded code. I "knew" that from reading the description. But there was no explanation of what that meant, and the compiler listings just showed something that suggested what the compiled code did. However, something was wrong with it.

The previous year, I'd spent one night disassembling a chunk of the macro assembler, to determine how to patch it to allow comments to be printed in mixed-case. That was manually, starting with an octal dump (in a text-file) and splitting, substituting pieces. I did get to my goal, but found it was hard.

Looking at the amount of code to analyze, I decided to write a disassembler to do the steps I'd done manually. That took about three weeks to write. It sounds like a long time, but it paid off in the first session that I used it. The problem was that the compiler omitted one of the instructions if it happened to be compiling an assignment between variables at the beginning of a COMMON-block.

Seeing that meant that I had to understand what the threaded code meant, and how it was used. There was a substantial amount of code in the thread interpreter to deal with. Just that part was a few thousand instructions.

My PDP-11 disassembler was interesting because it allowed for the machine's variable-length instruction set, and a mixture of inline data and instructions. That's essentially what the threaded code was.

Most disassemblers that I've seen are not interactive. I could assign symbols to memory addresses, and choose how to display the current location, stepping through a sequence until there was a branch instruction. On a branch (or jump), dis could jump to the branch destination, or fall-through to the next instruction. It also had a go-back command.

Incidentally, the last time I used dis was in 1980, to determine how to apply the form-feed modification to a MINC-11 system, driving a Daisy-wheel printer.


1979
equate – plot mathematical formulae on calcomp

With the Calcomp plotter constantly present, something jogged my mind, and I realized that mathematical formulae (as printed in books) were naturally block-structured. Both the final form and a description of them could be expressed as nested blocks.

I designed and implemented an interpreter (in Fortran...) which could read a description, and made the Calcomp plotter show radicals, integrals, summations, fractions, Greek letters (I made my own font, using graph paper).

All very nice, but a digression from power electronics.

software metrics

This was a different topic that I got interested in then, and start writing programs to collect.

clipping for calcomp plotter

One of the pitfalls in modifying the plotting portion of the simulation that I worked on, was if I misjudged the size of the plot. If this happened, then the pen would run up against the side of the drum, and (not allowing for the motion which was not successful when it returned, the plot would be drawn in the wrong place. Seeing the original error was hard, since the result was very confusing.

I decided to fix this by changing the Calcomp plotter.

The computer center had the source for the Calcomp library. But it was provided under a non-disclosure agreement, and the person in charge did not permit me to view it.

Talking to Henry Bowlden, he explained that I could use the Univac linker to rename entrypoints in an object file, and use that to intercept calls to the Calcomp library.

I used that approach to write a wrapper for the Calcomp plotting library, to analyze the parameters of the calls which it made to raise, lower, move the pen.

Then, I devised a replacement for the plotting function, which could limit the coordinates of the pen as it moved around. A few months later, one of my associates pointed out that this sort of thing was already done, e.g., showing me a book by Sutherland which had a simpler algorithm.

floating point interpreter

Meanwhile, my research project was progressing. But I ran into a serious problem: the computer that I was using was beginning to fail. Even "small" computers (then...) required a cool environment, to get rid of waste heat. This was running in a room around 60 degrees Fahrenheit, rather uncomfortable for us humans. Anytime the temperature got comfortable, the machine started to malfunction. Finally, its power supply broke. The department that owned it didn't have funds (and didn't want my help). So I was forced to move.

There was another machine available, in a different department. However, it was not as capable as the one I'd been using. The main deficiency was that it lacked the extended instruction set (e.g., shifting, and floating point operations). The Fortran compiler on the broken computer used those instructions. The Fortran compiler on the working computer was an older revision, with too many bugs to be useful.

I decided to move the better compiler to the working machine by writing a trap emulator for the extended instruction set. Working at my desk at lunchtime, I wrote this in about three weeks. That was slightly over a thousand lines of PDP-11 assembler, and it worked properly when I finished transcribing my notes into the computer. (I did spend some time making a test driver as well).


1980
directory sort, revisited

I spent part of that year working on a project to control a power inverter with an 8086-based single-board computer. This was actually an embedded application: I wrote a monitoring application (displaying on an ADM-3A) to show its status information. The monitor, which ran on the development system, used shared memory to obtain results, polling for keyboard input (to accept commands to show different information). To get reasonably fast updates, I used cursor-addressing.

But the controller was only part of it.

The development system which I used was running ISIS-II, and provided a compiler for PL/M as well as both a native (8080) and cross-assembler. That sounded good at the time.

However (just like RT-11) that's like moving into a house which has a stove and refrigerator and a couple of chairs. If you want it nice, you'll have to add some parts.

I wrote 18 utility programs for that system, of which all but 1-2 were in PL/M. One, for instance, was a program to list (floppy) disk directories sorted by name. The ISIS-II directory utility would only list files in the order they were stored on the disk. With a few dozen files, this was not easy to read the listing. So I wrote a program that would read the directory, sort it, and show the result.

You're thinking: so what?

Well, those computers were slow. It took twenty seconds before my program started showing the directory. The ISIS-II program started almost at once. However, even with that head start, my program finished showing the directory first (something like 35 seconds total, compared to 40 seconds).

EDIT clone

Before those PL/M programs, I wrote my longest 8080 assembly-language program: a text-editor. The motivation for this was that the ISIS-II had two choices for a text editor:

  • a fancy editor (named "CREDIT") which did not work with the ADM-3A terminal (because its cursor movement was by using control characters), and would use almost all of the memory on the development system, and
  • a crude editor, with only eight commands. One of those quit the editor. The other seven are less memorable.

At the beginning of that project, my organization was being reorganized, and I had no direction (sic) for almost a month.

On the other hand, I had an idea, from the style of assembly language programming that I was doing in my research project. I'd found a useful way to embed a jumptable inline in the program. It made decoding and parsing much simpler. That worked nicely with the PDP-11, whose registers could be used as index registers. I'd also experimented with making the skeleton of a text editor in my simulators for the 8080. So I wrote a text editor for the 8080, modeling it after the editor for RT-11 (which in turn was modeled after TECO).

The initial version of the editor was 4Kb, which left most of the 40Kb memory free for holding the file. (This was smaller than Intel's dumb editor, which I assumed must be written in PL/M). Like the PDP-11 editor, mine stored a "page" (up to a form-feed). Eventually, the program grew near 8Kb (still much smaller than CREDIT, which used 32Kb).

I improved on the design, by adding the ability to see some of the editor's state. For instance, a "v" command would show a "^" under the current cursor position. Another command would show the contents of an internal buffer (used for copying text from one point to another).

About twelve years later, I became involved in developing vi like emacs, and set about extending that.

A comment about software patents...

The research center where I worked then was much like any other company. They spent money, and were always looking for sources of revenue. One of the hurdles we had on a yearly basis was a quota of patent disclosures for each department. These were essentially material from which the legal department could select and decide which to use for filing patents.

One of my coworkers pointed to some of the features of the programs which I'd written for the controller and suggested those would be suitable grist for this mill. In particular, there was a feature that I'd developed to achieve a performance goal:

  • the controller was synchronized using a timer that interrupted the 8086 shortly before the point when a control action was needed.
  • the desired timing resolution for switching was 10 microseconds, which is not much larger than the instruction time.
  • no-ops had a much smaller instruction time (2 microseconds).

I came up with a table-scheme which I said was like a fireman's pole, visualizing someone running to the pole, and sliding down (with some horizontal oscillation). By jumping into the table at a given point (computed with fixed timing cost, using the difference between the desired and actual time as a factor), the program would exit from the table at a precise time.

Of course this was software. For a patent, the disclosure had to recast the description as implemented by a machine. I submitted a description for this. (Later, I found that my coworker had also submitted a disclosure for the same "invention").

Actually, I thought the approach was reasonably obvious, and probably not novel enough for a patent. I did not hear anything more about that.

The reason why this is interesting is that it is essentially the same thing later used in C, called Duff's device. According to some sources, there was prior art. My interrupt handler was an example of that.

unover – backspace/underline filters for Univac and PDP-11

I used txt for my graduate research. At that time, I was also using the Univac system at the research center for producing memos and reports. After I had been there about a year, my manager commented that the way to get ahead there was to write reports. They need not be long. In practice, memos would work just as well. The main difference between the two was that memos would be up to ten pages (typically 5), while reports would be longer. I took the hint, and wrote memos and reports. By the time I left the research center, I had written 45 memos and reports – more than one a month.

The completion of my research (and need to publish the results) came around the time that I became involved with DPS. I wrote programs for the Univac system and the PDP-11 (actually MINC-11 at that time) to solve these problems:

  • Getting my memos and reports typed was glacially slow until I started using the doc program on the Univac to make working drafts (on the line printer). After that (probably 1979) I found there were a few high-quality printing terminals, which I could use to print the final version – except for the cover sheet.
  • By itself, txt was not complete. I needed a program to make nice printouts (no DECwriter dot-matrix stuff). Nice printouts were available using AJ-832 (Anderson Jacobsen) daisy wheel terminals, using individual sheets, hand-fed.

That was the initial plan: to make a program that would wait for a carriage return, then print a page (down to the next form feed), and wait for a carriage return.

The cover sheet (a requirement for memos and reports) was a special problem, because the document number had to be printed on the very top of the page. When printing from the Univac, I was unable to print on that line because I had to enter a carriage return to initiate the printing. So I usually relied on the department secretary for cover sheets (but typed my own during her absence).

Almost immediately, I found that there were problems printing underlined text. Like nroff, txt printed underlined and bold text using overstriking. overstriking). On the systems that I used, however:

  • Univac (originally using its bundled document formatter, and later DPS), was limited to printing 255 characters before it forced a carriage return (and line feed) into the output.
  • Printing overstruck text with the AJ-832 was slow. Writing a succession of underline, backspace, character (repeat) was much slower than printing a non-overstruck line.

Solving the first problem took some investigation. I consulted with Henry Bowlden, who pointed out that the APL interpreter was capable of printing overstruck text wider than 80 columns. But he (having access to the Univac source) discovered that Univac handled this by setting a global variable in the terminal driver rather than making the feature usable by ordinary programs. Ultimately, he suggested a solution: use an escape sequence which would stop output to the AJ-832's just before the operating system sent a carriage return. I wrote a program to do this. That worked, but was not very interesting, since it was useful only for the AJ-832.

The second problem was interesting. By altering the output to reduce the number of changes of direction, I could make the AJ-832 print much faster.

Unlike nroff, txt could produce multiple levels of overstriking, corresponding to bold, bolder, etc. On the PDP-11, I wrote a program “unover” to collect each line with its overstrikes into a matrix. The program minimized the number of switches between forward and backward movement of the daisy wheel carriage by changing sequences such as

_<bs>H_<bs>e_<bs>l_<bs>l_<bs>o

to

_____<bs><bs><bs><bs><bs>Hello

The program col (which seems to have been written later), does something like this. Unlike col, my program also handled the multiple levels of overstrikes emitted by txt. On finding an overstruck segment, it would recur looking for overstrikes on top of that. (I have a mental image of the program zig-zagging up and down hills which is hard to depict in HTML).

While I was the only user for the PDP-11 program, the Univac program was useful to other people. I wrote a memo describing it (most of my memos described useful or innovative programs which I had written). While my memo was still in review, one of the computer center staff wrote a guide to using the program, omitting my role. After some discussion, he sent out a correction to the user guide—as a one-line strip of paper citing me as the author.


1981
macros for DPS
DPS (Document Processing System) was one of the features of a tape from the University of Maryland that was installed on the research center's Univac system late in 1980. Here is a page which mentions that program giving a few details.

At the time, I was working on a project which kept me waiting for output for long periods of time (as long as two or three days). The computing center, to allow themselves to use the program, made it free to use for everyone in the research center (an overhead cost). Given that my project's computing bill was comparable to my salary, this was a good deal.

I investigated DPS, and quickly found that it was much more powerful than Univac's bundled document formatter. Among other nice features, it had associative arrays which I used to make macros to generate a keyword index for the research reports that I was writing. I also wrote macros to generate table contents, list of tables, footnotes, etc. DPS had a feature which was like a queue (or FIFO). I used that to store footnotes until they were needed.

At the time, I was also working on a fairly long report (for a different project).

There was no way to make pictures. My reports had pictures on about 10% of the pages. But I wrote macros which printed the captions on the page. One picture which had two parts, e.g., 2.5a and 2.5b, came out on two successive pages on my first version. This incited a loud argument with an associate who said that it was one picture. I pointed out that it was two parts and that I would change it, but found that simply walking away and changing the macro made the argument go away-—like too many arguments no amount of polite discussion had any effect on my associate.

I finished the report, and moved to a different project. While it was in review, it came back for a minor correction and was given to another person who took it to a secretary to apply scissors and glue to modify the first page. We've come a long way since that point, technically speaking.


1982
XEDIT macros
Late in 1981, I was introduced to XEDIT. This was before REXX was widely available; our system never had it installed. Rather, we did have the older EXEC2.

Our system used CMS for text-editing and printing, as a front end for SPF. SPF had its own text editor. At the time, I remarked that

the advantage of SPF was that one could learn everything about it in one day;
the disadvantage of SPF was that one could learn everything about it in one day.

Initially, XEDIT was interesting because it was not part of SPF. It was less structured, and could be used for ad hoc editing.

But XEDIT was programmable. This was interesting, since the IBM 3270 terminals were awkward to use:

  • those are synchronous terminals; you would edit "on" the screen and periodically press "Enter" to send the updated screen to the host.
  • if you inserted text on the screen, the text which was shifted off to the right was lost.
  • program sources were ragged in appearance because there were no tab characters. The editors supported tabbing, but always inserted spaces.

I wrote some macros to do the shifting:

  • first, I wrote a macro to shift a comment and format it (if it happened to wrap to the next line). Comments were as in C, delimited by a "/*" and "*/".
  • later, I wrote a less specific pair of macros which would search left (or right) on the line from the cursor position. The macro shifted the first word to line up with the cursor.

Editing comments was cumbersome in SPF. We used boxed comments, e.g.,

/*************************
 *                       *
 * This is a comment box *
 *                       *
 *************************/

SPF had a text-formatting command which could wrap long lines. But there were problems with it:

  • it had no way to recognize boxed comments; the user had to set wrap-margins to prevent the box from being wrapped with the text.
  • there was no visual feedback to show the margins currently in use.
  • changing the margins was cumbersome.

After trying the feature, finding that occasional errors wiped out hours of work, I went back to XEDIT to make a macro:

  • the macro checked the line where the cursor was.
  • if the current line was the upper-left corner of a boxed comment, the macro removed all of the asterisks except for the upper-left and lower-right corners.
  • if the current line contained a "/*" without other asterisks, the macro drew a comment-box.

XEDIT had no text formatter. I wrote a macro to do this, again using the cursor position as a hint to the macro, telling it the starting line. Later, I rewrote this into an assembler module using XEDIT's feature for loading small programs to process text in a FIFO. Incidentally, CMS (which supported the FIFO/LIFO feature) was developed before Unix. You can think of it as Unix pipes, however. But the LIFO aspect is something extra.

By the time I was done, I had several dozen editing macros. All of the XEDIT macros could be assigned to a function key. But the 3270 terminals had (only) 12 function keys. I solved that problem by making a macro to page through the sets of useful macros.

Aside from a handful of modules written in assembler, all of these macros were written in EXEC2. When I started writing these, we had both EXEC and EXEC2. EXEC2 was nicer – initially because it could hold 255-character words compared to EXEC's 8-character tokens – and about the same level as XEDIT's commands.

add – an adding machine in EXEC2

What made it interesting is that EXEC2 has only integers and strings. Variables held strings, and could be used in integer calculations.

Like the curses-based add, this was screen-oriented. But it only displayed the last four lines entered. Previous lines could not be edited, but could be cancelled.

By the way, the calculator was used for adding expenses. It displayed the results not as an integer (the underlying representation), but in dollars and cents. I used EXEC2's string operations for doing the conversion.

IBM disassembler

I took over maintenance of a set of development utilities written (or acquired) by an associate who moved to a different division. Our development environment used IBM VM/SP CMS for editing, and MVS/TSO for batch compiles. There were no add-on tools for the front-end CMS; it had EXEC, EXEC2, XEDIT – and assembler.

One of the utility programs was just an executable which the other developer happened to have: prtrcvr (print-receiver). This was originally from one of the IBM user group tapes, and those were in the domain of the system support staff. I wanted to make some improvement to the utility – and the keeper of the tape denied the request.

Recalling the PDP-11 disassembler, I made a hexadecimal (text) dump of the utility and started examining it. Then I wrote some macros for XEDIT to select the text at the current position and (using another new program), lookup and disassemble the file, replacing the hexadecimal strings with assembler mnemonics. Piece by piece, I disassembled the 4Kb of the utility, made my improvements.


1983
freec – free-format compare
Pronounced like "freek", of course. At least one of my associates insisted on pronouncing it "free-cee". It stood for "free-format compare".

I wrote this early in 1983. Someone told me about superc (super compare) program which had just been installed on the VM/CMS system. One of the interesting features of superc was that it could be told to ignore specific columns of the files which were compared. I used that feature to make a preprocessor for it which parsed CHILL program text into tokens with line- and column-values, and a postprocessor which reassembled the tokens into a listing of the differences between two program sources. That let it ignore whitespace changes (including split lines).

The current documentation for super compare states that it can now do word comparisons, which sounds like free-format. (It also has parsers for various comment-styles). At the time that I wrote my program, these features were unavailable in superc, and were novel to my associates.

In mid-1983, there was an internal corporate symposium. I sent abstracts for this and a few other programs. The reviewer rejected this one, stating that it had been done before. I have not found an earlier free-format compare. There are later, more well-known examples (spiff in 1988 and wdiff in 1992).

As an aside, I did present at the symposium. It was a discussion of usage monitoring that I had done for a script which made the prtrcvr user-friendly. I had modified the script to send a copy of command-lines and error/success codes to another account, so that I could collect frequency statistics. While I was maintaining that account, one of my coworkers was having trouble remembering to supply the "(" which delimited options from parameters on the CMS command-line. Seeing that he was repeating the same command many times, I sent a message back from the monitoring account telling him to add the "(". Shortly after, he came by my desk, telling me "That's some program you have there".

spasm – meta-assembler
In mid-1983, I transferred to a different department. The requisition which enabled the transfer pointed to another group in the department which was developing a computer chip which would be microcoded. The instruction set itself was not determined until later, but they wanted an assembler to start programming the chip "soon".

The project lead had done some initial investigation (perhaps motivated by E. Skordalakis' recent paper in IEEE Micro) but had found none which appeared satisfactory.

Given that vague need, and my previous work on assemblers, I proposed to construct a program that would handle whatever this group came up with for an instruction set. The program read a description of the instruction set, and interpreted its input according to the description. Nowadays, one might use a tool such as yacc to do this.

I made descriptions and testcases for a half-dozen computer types. The program worked well enough, but ran about ten times more slowly than I had expected.

Like "freec", there was a given acronym to explain it:

system of programs for assembler generation.

One of my co-workers suggested that "asmgen" would be more apt, but no one had a nicer name.

Another co-worker suggested that rather than attempting to construct a parse-tree for different assembler syntax, that I should just use Lisp syntax. He pointed to someone's dissertation as a starting point, in which the author had used Lisp as an assembly language.

This was the first large program (about 20,000 SLOCs, 4 months of work) which I wrote in C. I started writing it on a VMS system with DEC C version 1.0, then updated to version 2.0. After the program was working, I ported it to the department's BSD VAX system.

The group in which I was working had other interests besides software development; they spent their time writing emails discussing a better "design system" (whether for software or some other topic was never elucidated).


1984
font editor
In addition to sending emails, my group printed memos on a recently-acquired Image laser printer. But there was a problem with the memos: they did not use the corporate logo. Lacking that, there was a strong chance that the printer's output would be disqualified for use with memos.

When I heard about that, I discussed it with some of the people who used the printer. It seemed simple (just a few letters). But the available fonts were wrong. The Imagen had only Times-Roman available. The logo used Univers 57, which was noticeably different (a sans-serif font). There were no free fonts available; Univers 57 would have been expensive. However, the Imagen printer could accept a font defined by a text file.

It seemed simple. And my working terminal was the BBN Bitgraph, which was essentially a vt100 with graphical capability. I started by writing a program which could read an Imagen font file, and display its glyphs using curses, e.g., as an array of asterisks and dots. Then I improved it by using the Bitgraph's escape sequences to show the actual character (as bits, of course). Curses does not do graphics, but with some care, it is possible to modify the display in an "unused" area of the screen. After that, I transformed the display-only program into an editor, with straight lines, circle, fill-operation, etc.

Drawing a font with a tool like that is time-consuming. Rather than draw just the characters I needed, I drew the entire font, as Univers 58 (the unitalicized form), and then added a translation feature to the font-editor which would slant the letters.

The task was not complete until I met with the local graphic arts person (the head of the print shop), and got him to agree that the printed logos looked good enough.

describe – remove Scribe markup.
Shortly after, my organization acquired a license for Scribe (since troff was deemed too hard for some staff to use). I made the logo work with that.

Thinking of deroff, I wrote a filter for Scribe, to strip its markup. Someone did comment that Scribe could write to dumb devices, but removing markup was a different thing, showing text that Scribe might discard.

flist – file-list manager for VMS
This is covered in its own page here. In late 1983, I was introduced to dired on a BSD 4.2 system. I modified it, and found that it was not only useful (I had been using flist on IBM VM/CMS for the previous two years), but put the idea into my mind that I could write one too. In the spring of 1984, I was drafted for a new department, focusing on CAD software development. The new department used VMS, Prime and Apollo workstations.

I did most of my work on an Apollo DN300; the VMS and PrimeOS systems were where the larger CAD programs might run.

Imagen previewer
Even after moving to CAD software development, I revisited the Imagen printer. The CAD department bought one, for printing memos and reports. However, the application that I was working on (a VLSI layout editor) needed hardcopy as well, to satisfy its users.

It was relatively simple to render the lines using the Imagen's graphics, but wasted a lot of paper (since the users wanted to select an area on the screen and print that).

I solved the problem by making a graphics program (on the Apollo workstation) which would display the file to be sent to the printer. It allowed the user to zoom in and out (by factors of two) and pan.

Thinking that the raster display would act something like grayscale, I made an elaborate scaling algorithm based on that assumption. But I found that it was not as legible for scaled-down pictures as just filling in a cell where any pixel in the original was filled in.

Scaling by factors of two sounds unambitious. However, my users were very demanding, and would complain about lines misplaced by a single pixel. Numerical rounding was forbidden. In the layout editor, I devised a scaling algorithm which avoided rounding errors, by restricting the possible viewing sizes. In turn, that provoked a lot of discussion.


1985
futran – Fortran porting tool
While most of my CAD work was on the Apollo workstation, there were some associated data-processing applications on the VMS or PrimeOS system. Most of our programs were written in Fortran 77. DEC Fortran supported include, and PrimeOS Fortran did not. That lack made development clumsier than needed. I did not do the ports, but heard one of my coworkers complaining about problems getting our (Apollo-based) Fortran programs to work.

There was a simple solution (about 3 days of work, 2700 lines of code). I wrote a preprocessor (pronounced "foo-tran") in Fortran which would perform the include feature. What was interesting about it was that it did not use an extra file for the included text, but commented/uncommented the included text. Since an included "file" might appear in more than one location, futran would check, report and repair inconsistencies. It used the first occurrence of included text as the master.

Oddly, that short program gained more recognition from my coworkers and management than the work I was doing on flist.

ded
This is covered in its own page here.

1986
wyse50 function keys
I was introduced to the Wyse-50 terminal early in 1986. It was an inexpensive terminal which had features unlike the DEC terminals that I was familiar with:
  • programmable function keys and
  • programmable status-line.

Contrasted with the SVr2 development systems we used for our application, it was rather elegant. I used the feature to store frequently-used commands.

I wrote a program that would read a set of key assignments from a data file, and send the appropriate escape sequences to program the function keys and make the status reflect those settings.

Also (because both unshifted and shifted function keys had separate definitions), I stored frequently-used passwords on the shifted keys. We do not do that now (though ssh private keys are not that different). But it was useful at the time.


1987
m68kdis – Motorola 68k disassembler
At that time, I was working on a networking card, which used a Motorola 68000-series processor. Our project's configuration control was weak. Rather than continue to get meaningless assurances from the person who burned EPROMs that we had built a specific version, I wrote a disassembler so that I could spot-check the result.

The disassembler was about a thousand lines of C. What made it interesting was that we had only limited access to the board's memory. Its application programming interface provided a view of only a dozen or so bytes, given a starting address. The starting address had to be word-aligned as well, due to addressing limitations of the models of the 68000 which we used (some were 68010's for instance, others were not).

ldcmp – loader-compare
The same issues applied to other phases of software development. Most of the program had been developed by other people. We were making improvements (bug-fixes and enhancements, new features).

The original developers were consultants (read as hackers), who were untidy in their habits. Before making serious changes, I preferred to add comments and cleanup the formatting. This phase would supposedly not change the program, but there were no tools which could be used to ensure the absence of change.

The development environment (SVr2) used COFF object files. Their content would include line-numbering and other debugging information. I wrote a utility named ldcmp which read files, and compared the sections for program and data. Debugging information was ignored.


1988
cm_tools – Configuration Managment tools
This is covered in its own page here.
sccs_tools – SCCS tools
This is covered in its own page here.
1989
mdiff – miller/myer diff
I worked with John Chludzinski to develop a set of prototype applications which included text-differencing. We used the algorithm presented in A File Comparison Program, by Webb Miller and Eugene W. Myers published in Software – Practice and Experience, Vol 15 (11, 1025-1040 November 1983). I made this into a library function and in that way reused it in other programs such as a.diff.

As part of that project, we ported the programs to VMS. One of the problems we encountered was the apparent lack of a sed program. John pointed out that DEC provided an analogous program; I countered by noting that the Gould server had sources would might be adaptible. In studying the sources, I noticed two odd features:

  • Gould had routinely applied their copyright notice to every file—even at the end of shell scripts which did not exit before the notice. There must have been some subtle problems due to that treatment.
  • there were few comments (my recollection says none). Perhaps they were stripped—DEC sources distributed on paper tape used rubouts to obscure comments. Other media can be more flexible.
1990
ldcmp – loader-compare, revisited
I revisited ldcmp in a later project. This used custom X widgets to display numeric results from performance analyses.

A separate development group had provided graphical widgets, e.g., for line- and bar-charts. Then, as an afterthought, one of their developers had written a widget to display a table of numbers. It was not in the project plan, was not documented, and had not been reviewed.

I needed the table widget for my application. It was about 5,000 lines of C, and needed work to be usable. The developer had used indentation levels of 1 or 2 characters throughout. So it needed some initial cleanup.

I wrote a new version of ldcmp, for the a.out format used on our workstations. Using this tool, I was able to cleanup the widget without modifying the object code. I have used this approach in other ways (and on systems with more tolerant object formats), e.g., in ansification of C programs.

1991
Ada PITS
Later in the same project, I encountered lex and yacc.

Most of the project used Ada. We developed on VMS (using DEC Ada), and on Sun- and Apollo workstations using Verdix Ada.

Eric Marshall wrote some simple tools (in C of course) to help with productivity. I recall that they did things such as list the package names, list the with'd packages, etc. They were named according to their use, e.g., a.list, to list package names. There were perhaps four of these utilities (he does not recall exactly either), which used the same parser, which he named "Ada PITS":

Ada Programming In The Small

Given that starting point, I extended it by adding five more utility programs and (of course) cleaning up compiler warnings, makefiles, etc.:

a.compare

This compared Ada source files, ignoring whitespace. The program simply reported (like the Unix cmp utility) if the two files were equivalent.

a.count

This counted lines and statements for Ada source files. I made it produce a report much like my C counter.

a.diff

I wrote this last, not being satisfied with a.compare. This was a more ambitious free-format diff of Ada source files, which attempted (like freec) to show differences ignoring both comments and whitespace.

It did work, but I was not satisfied with the way the results were shown. Reconstructing the sources for display in the lex/yacc configuration was much harder than using superc.

a.makegen

This generated a makefile which could be used to build an Ada program.

Verdix Ada's build system (which most of us used) was slow. The workstation took about 3 minutes to build a program. Most of that time was spent analyzing the source files. On the other hand, bypassing the build system and using the compiler directly took one tenth of the time.

My lex/yacc program took only a few seconds to generate a makefile. Even with the performance advantage, my associates did not approve of the approach, saying that it would not work with generics (though I did that), and that there were some types of programs that it would not work with (I did not find any in our project).

a.trim

This trimmed comments from Ada source, which is fairly simple to do. I wrote this first.

I maintained Ada PITS for a few years, before it was taken up by another group and made into a deliverable.


1992
It was around this time that I changed the focus of my non-directed software development. Up til October 1992, I had always been in an environment where my primary tasks were software development (from analysis through testing). The tide went out, so to speak, and there were no more large development efforts in that company.

At that point, I was doing software prototyping and evaluations to support a variety of actitivies. I used my spare time to start helping external developers with tools which would be useful if they resolved code quality issues (i.e., portability and maintainability). By the time I moved on in 1994, I had evaluated and suggested improvements for about 65 programs. The first large program that I became involved with is the text editor vile. My involvement with cproto and mawk also dates from this period. Other programs that I touched then, but have not since include c2man and par.

Not all of the improvements were successful; the GNU maintainers were (aside from one who incorporated my changes without credit) singularly unreceptive. I may elaborate on those in another venue.


1993
diffstat
This is covered in its own page here. I got involved with the tin newsreader because the computer center people could/would not help with a problem with too-long newsgroup names.

1994
Since mid-1994, most of the non-directed programs that I've written are available on the Internet.

As noted in my page on change-logs, I have become the maintainer for some programs where I was not the original developer. In that process, I have also become the principal developer.

There are other programs to which I have contributed changes. For example, I helped with the sharutils program during 1994-1996.

Glossary

There are several terms used in this page, for which I have not found a reliable source.