Copyright 2009,2010 by Thomas E. Dickey

Old, but Interesting Programs

Here is a brief description of some of the more interesting programs that I worked on, long ago. This does not count the programs that I was told to produce - some of those are interesting as well.

The source code is no longer available for these (and as was the practice, was then available only on a limited basis generally with peers). Others have claimed that source-code was just given away in the 1970s and 1980s. On the contrary, all of the vendor-related source code (DEC, Univac, IBM) which I saw was available on a limited basis — non-disclosure. Perhaps others' experience differ, or else they chose to ignore the agreement.

1970
three-dimensional tic-tac-toe.

I wrote this in Fortran on an IBM 1620 computer. It was not the first computer program that I wrote, but it was the first interesting one.

This used a 4x4x4 array.

The computer had only 20,000 characters of memory. I used a lookup table to store the association between positions and the ways to win, i.e., filling in a line of four positions. The combination of the program to fill the table and the game itself was too large to fit in the available memory. So I split it into two parts:

  • initialize the table, and then (keeping the table in memory), transfer control ("chain") to
  • run the game
The first program was generated (about 800 assignments) by a separate program. The game was about 200 statements. So a thousand-line program was too large for that computer.

1975
cross-assemblers and simulators for more than a dozen microprocessors.

These were written in assembler on PDP-11 (running RT-11). The cross assembler was named micro (a twist on macro, the name of DEC's macro assembler), and the simulator was named simodt (from simulated ODT, DEC's "octal debugging tool"). Combined, they were named simic.

The cross assemblers supported macros, conditional assembly, all of the interesting features that DEC's assemblers did. Both were modular, most of the code being in common (about 80% for the simulators and above 95% for the cross-assemblers).

PDP-11's had a 56Kb address space (plus 8Kb for I/O, for 16-bit addresses). That made them cheap research machines. Not all PDP-11's even had 56Kb of core memory (semiconductor memory came later). Programming them mostly in assembler was not unusual.

I redid the macro processing after a while (in 1977 or 1978), when I was working on txt. The initial version stored the parsed data (a series of tokens). However, that proved to be hard to maintain and extend. So I changed this to work purely with character strings. That made it simpler to concatenate names, perform substitutions, etc.


1976
Basic line renumbering.

I do not recall what I named this.

Someone in the lab brought in a book with a StarTrek program (think "minesweeper", on a printing terminal, at 30 characters/second). But the program was too large to run on our machine (in 24Kb). The Basic interpreter could load sections of a program at run-time, e.g., like overlays. To exploit this feature (and squeeze the program into roughly half the space), it was necessary to resequence the overlaid chunks so they reused the same line numbers.

I spent a couple of days getting the program to work – probably the first nontrivial Basic program I wrote. Then I spent a half hour or so playing with the StarTrek program. That was not as interesting, so I put StarTrek away.

dirsrt – directory sort

RT-11 was a single-user system without any frills. Hierarchical directories came later, in other systems. RT-11's directory program simply read the directory blocks and printed their contents. With a half-dozen people sharing a minicomputer, and a thousand files, it is hard to keep track. I wrote an improved directory program dirsrt which made an index to the directory entries and printed them, sorted by any combination of name, extension, size, date. It also had a switch for writing a blank line between groups, e.g., DIRSRT /A/X/J would show files sorted by name within extensions, skipping a line between distinct extensions.

a subdirectory mechanism

As noted, RT-11 had only a flat directory structure. But several researchers shared a few machines, and there was a lot of clutter. I wrote a program which simulated RT-11's directory structure inside a large file.

It was interesting, but did not really solve the problem, since RT-11 did not have a way to create virtual device drivers. I could copy files in/out, and do directory listings; but to use the files, they had to be outside the container.

At the time, an associate commented that the concept was something like interchange-format, a reference to a program which has been obscured by other uses.

Another person said it was like a partitioned dataset.

In either case, I did learn something.

an interface description language

Users of my microprocessor simulators complained that they were missing a useful feature: the ability to do I/O. I designed a simple language to describe ports, data transfer, manipulation of bits (masking, shifting), and called that IDL (for "Interface Description Language").

Some of the bit-manipulation was inspired by the SMS Microcontroller, which addressed memory in bits.

The following year, I gave a short presentation on that at one of the Computer Architecture workshop, though the fact that the interpreter was written in assembly language was a shortcoming.

Larry Tesler happened to be at that meeting, though by then he was interested in things other than pub.

repaginator

There were other system programs – including one which was obsolete within a day after I wrote it. In our lab we had no line printer – just printing terminals. The RT-11 text editor (same syntax as teco, but simpler) could be made to repeat commands to read a buffer, add extra lines, delete those past line 58, print the page, delete the page from memory, etc., to print a listing file. The command to do this was most of a line, and annoying to type in. So I wrote a program to do this. While I was at it, I considered it a waste of paper to print only 58 lines (on a 66-line page), so I made the program smart enough to recognize the header lines on a listing and move them, making the real page size larger, e.g., 62 lines.

The lab manager found that solution unappealing, and (having the sources for the system), made the terminal driver recognize form-feeds. Just a small change was needed. I discarded the repaginator.

txt – pub clone

During much of 1975-1976, I sublet another researcher's PDP-10 account to write manuals for simic. That featured teco, sos – and pub (a text formatter written by Larry Tesler at Stanford AI Lab around 1973). That came to an abrupt end (it was an informal agreement), and I rescued my files.

But it was a year spent learning to appreciate a nice (if not entirely efficient or consistent) tool. So I walked back to my lab and spent 15 hours writing the first thousand lines of txt. While pub had libraries to format for a wide variety of output devices, txt would "only" go to printing terminals. But it was a workable clone of txt, recognizing all of the formatting controls, implementing macros, footnotes, headers, floating sections, table of contents, etc. Five years later, at ten thousand lines (of assembler, of course), I was done with txt.

At the time, the term "word processing" was not used much. I first encountered it a couple of years later, and was confused at first, supposing it to be related to natural language processing. Instead, we referred to it as "text compiling", which was misleading. (I once overheard a comment about that fellow over in the other building writing compilers, and realized it was about me).

bfor – Fortran preprocessor

In 1976, I started writing a program for my degree which would compute a complexity metric based on a description of a computer's instruction set. Unlike the previous work I'd done as a graduate student, this would be for publication (or review, whatever – I did not know how the process would work at that point). On the minicomputer, I had only three languages to choose from: assembler, Basic or Fortran. The first two were unsuitable for this purpose (perhaps all three, in retrospect). I started writing my program in Fortran.

The minicomputer was small (with more memory, then 32kb), and I found that even a moderate-sized Fortran program would exceed its memory. This was DEC's RT-11 Fortran, which had a few extensions over Fortran 66, The solution was to split the program into overlay sections, which would share memory in COMMON-blocks. Soon, I had more than a dozen overlay sections.

Here, I ran into a new problem: splitting the program up into pieces like that made it less maintainable. Each part of the program would have its own copy of the COMMON-blocks. I spent three days isolating a problem due to an editing error which made one of the parts of the program have a different array size.

However, in my day job (I had one by then, since student grants do not last forever), I had encountered a new variety of Fortran, on Univac 1100.

Univac Fortran had many of the features which later appeared in Fortran 77. Later, Univac's Fortran 77 compiler was known as ASCII Fortran. Of special interest were include and parameter statements. Fortran 77 added the latter.

At the end of 1976, I started writing a Fortran preprocessor bfor, which had these features:

  • include statements
  • parameter statements
  • listing-file, showing both line-numbers and statement numbers.
  • cross-reference (of symbols)

Parsing Fortran properly is difficult: it uses statements rather than tokens. Blanks in the middle of a statement (other than strings, e.g., for formatted I/O) are ignored. These two lines are equivalent:


DOI=10,20
DO I=10,20
Unless you read up to the comma, the first line looks like an assignment statement. But it is not; it is the first statement of a do-loop.

These two lines are also equivalent:


END
E N D

I wrote bfor in assembler, of course, to fit in the small machine.


1977
axis function for calcomp plotter

At work, the computer center had a Calcomp. plotter. Paper was fed from a long roll; the plotter's pen could go on all directions. But the width of the paper was only a couple of feet.

I used it to plot data for a computer simulation of a three-phase. power system. Divided into three parts (plus margins), that sounds like enough room for ample resolution. However, the Calcomp plotting software's axis function was limited. The axis annotation showed numbers and tic-marks. However, it had no provision for scaling the range shown to make the best use of the space available.

I designed my own axis function which optimized the display of tic-marks and numbers. Each axis would end with a tic-mark. Depending on the range, there might be large tic-marks (1,2,5,10,...) or small tic-marks (1,2,3,4,5,...). The numbers would appear on the large tic-marks. There were also special cases where those rules did not apply, just to make the result look better.

In practice, this was much harder than it sounds, since there is roundoff to consider (5 decimal digits was "good enough") and many special cases. The function was about 300 lines of Fortran, and took many hours to perfect.


1978
dis – pdp11 disassembler

My work on the complexity metric hit a snag: I ran into a mysterious problem with the Fortran compiler. There was no high-level debugger; the only useful tool was to write trace files. However, that gave no insight.

DEC's Fortran compiler generated threaded code. I "knew" that from reading the description. But there was no explanation of what that meant, and the compiler listings just showed something that suggested what the compiled code did. However, something was wrong with it.

The previous year, I'd spent one night disassembling a chunk of the macro assembler, to determine how to patch it to allow comments to be printed in mixed-case. That was manually, starting with an octal dump (in a text-file) and splitting, substituting pieces. I did get to my goal, but found it was hard.

Looking at the amount of code to analyze, I decided to write a disassembler to do the steps I'd done manually. That took about three weeks to write. It sounds like a long time, but it paid off in the first session that I used it. The problem was that the compiler omitted one of the instructions if it happened to be compiling an assignment between variables at the beginning of a COMMON-block.

Seeing that meant that I had to understand what the threaded code meant, and how it was used. There was a substantial amount of code in the thread interpreter to deal with. Just that part was a few thousand instructions.

My PDP-11 disassembler was interesting because it allowed for the machine's variable-length instruction set, and a mixture of inline data and instructions. That's essentially what the threaded code was.

Most disassemblers that I've seen are not interactive. I could assign symbols to memory addresses, and choose how to display the current location, stepping through a sequence until there was a branch instruction. On a branch (or jump), dis could jump to the branch destination, or fall-through to the next instruction. It also had a go-back command.


1979
equate – plot mathematical formulae on calcomp

With the Calcomp plotter constantly present, something jogged my mind, and I realized that mathematical formulae (as printed in books) were naturally block-structured. Both the final form and a description of them could be expressed as nested blocks.

I designed and implemented an interpreter (in Fortran...) which could read a description, and made the Calcomp plotter show radicals, integrals, summations, fractions, Greek letters (I made my own font, using graph paper).

All very nice, but a digression from power electronics.

software metrics

This was a different topic that I got interested in then, and start writing programs to collect.

clipping for calcomp plotter

One of the pitfalls in modifying the plotting portion of the simulation that I worked on, was if I misjudged the size of the plot. If this happened, then the pen would run up against the side of the drum, and (not allowing for the motion which was not successful when it returned, the plot would be drawn in the wrong place. Seeing the original error was hard, since the result was very confusing.

I decided to fix this by changing the Calcomp plotter.

The computer center had the source for the Calcomp library. But it was provided under a non-disclosure agreement, and the person in charge did not permit me to view it.

Talking to a different person, he explained that I could use the Univac linker to rename entrypoints in an object file, and use that to intercept calls to the Calcomp library.

I used that approach to write a wrapper for the Calcomp plotting library, to analyze the parameters of the calls which it made to raise, lower, move the pen.

Then, I devised a replacement for the plotting function, which could limit the coordinates of the pen as it moved around. A few months later, one of my associates pointed out that this sort of thing was already done, e.g., showing me a book by Sutherland which had a simpler algorithm.

floating point interpreter

Meanwhile, my research project was progressing. But I ran into a serious problem: the computer that I was using was beginning to fail. Even "small" computers (then...) required a cool environment, to get rid of waste heat. This was running in a room around 60 degrees Fahreheit, rather uncomfortable for us humans. Anytime the temperature got comfortable, the machine started to malfunction. Finally, its power supply broke. The department that owned it didn't have funds (and didn't want my help). So I was forced to move.

There was another machine available, in a different department. However, it was not as capable as the one I'd been using. The main deficiency was that it lacked the extended instruction set (e.g., shifting, and floating point operations). The Fortran compiler on the broken computer used those instructions. The Fortran compiler on the working computer was an older revision, with too many bugs to be useful.

I decided to move the better compiler to the working machine by writing a trap emulator for the extended instruction set. Working at my desk at lunchtime, I wrote this in about three weeks. That was slightly over a thousand lines of PDP-11 assembler, and it worked properly when I finished transcribing my notes into the computer. (I did spend some time making a test driver as well).


1980
directory sort, revisited

I spent part of that year working on a project to control a power inverter with an 8086-based single-board computer. This was actually an embedded application, but I wrote a monitoring application (displaying on an ADM-3A) to show its status information. But the controller was only part of it.

The development system which I used was running ISIS-II, and provided a compiler for PL/M as well as both a native (8080) and cross-assembler. That sounded good at the time.

However (just like RT-11) that's like moving into a house which has a stove and refrigerator and a couple of chairs. If you want it nice, you'll have to add some parts.

I wrote 18 utility programs for that system, of which all but 1-2 were in PL/M. One, for instance, was a program to list (floppy) disk directories sorted by name. The ISIS-II directory utility would only list files in the order they were stored on the disk. With a few dozen files, this was not easy to read the listing. So I wrote a program that would read the directory, sort it, and show the result.

You're thinking: so what?

Well, those computers were slow. It took twenty seconds before my program started showing the directory. The ISIS-II program started almost at once. However, even with that head start, my program finished showing the directory first (something like 35 seconds total, compared to 40 seconds).

edit clone

Before those PL/M programs, I wrote my longest 8080 assembly-language program: a text-editor. The motivation for this was that the ISIS-II had two choices for a text editor:

  • a fancy editor (named "CREDIT") which did not work with the ADM-3A terminal (because its cursor movement was by using control characters), and would use almost all of the memory on the development system, and
  • a crude editor, with only eight commands. One of those quit the editor. The other seven are less memorable.
At the beginning of that project, my organization was being reorganized, and I had no direction (sic) for almost a month.

On the other hand, I had an idea, from the style of assembly language programming that I was doing in my research project. I'd found a useful way to embed a jumptable inline in the program. It made decoding and parsing much simpler. That worked nicely with the PDP-11, whose registers could be used as index registers. I'd also experimented with making the skeleton of a text editor in my simulators for the 8080. So I wrote a text editor for the 8080, modeling it after the editor for RT-11 (which in turn was modeled after TECO).

The initial version of the editor was 4Kb, which left most of the 40Kb memory free for holding the file. (This was smaller than Intel's dumb editor, which I assumed must be written in PL/M). Like the PDP-11 editor, mine stored a "page" (up to a form-feed). Eventually, the program grew near 8Kb (still much smaller than CREDIT, which used 32Kb).

I improved on the design, by adding the ability to see some of the editor's state. For instance, a "v" command would show a "^" under the current cursor position. Another command would show the contents of an internal buffer (used for copying text from one point to another).

About twelve years later, I became involved in developing vi like emacs, and set about extending that.

A comment about software patents...

The research center where I worked then was much like any other company. They spent money, and were always looking for sources of revenue. One of the hurdles we had on a yearly basis was a quota of patent disclosures for each department. These were essentially material from which the legal department could select and decide which to use for filing patents.

One of my coworkers pointed to some of the features of the programs which I'd written for the controller and suggested those would be suitable grist for this mill. In particular, there was a feature that I'd developed to achieve a performance goal:

  • the controller was synchronized using a timer that interrupted the 8086 shortly before the point when a control action was needed.
  • the desired timing resolution for switching was 10 microseconds, which is not much larger than the instruction time.
  • no-ops had a much smaller instruction time (2 microseconds).

I came up with a table-scheme which I said was like a fireman's pole, visualizing someone running to the pole, and sliding down (with some horizontal oscillation). By jumping into the table at a given point (computed with fixed timing cost, using the difference between the desired and actual time as a factor), the program would exit from the table at a precise time.

Of course this was software. For a patent, the disclosure had to recast the description as implemented by a machine. I submitted a description for this. (Later, I found that my coworker had also submitted a disclosure for the same "invention").

Actually, I thought the approach was reasonably obvious, and probably not novel enough for a patent. I did not hear anything more about that.

The reason why this is interesting is that it is essentially the same thing later used in C, called Duff's device. According to some sources, there was prior art. My interrupt handler was an example of that.

backspace/underline filters for Univac and PDP-11
By itself, txt was not complete. I needed a program to make nice printouts (no DecWriter dot-matrix stuff). Nice printouts were available using AJ-832 (Anderson Jacobsen) daisy wheel terminals, using individual sheets, hand-fed.

That was the initial plan: to make a program that would wait for a carriage return, then print a page (down to the next form feed), and wait for a carriage return.

Almost immediately, I found that there were problems printing underlined text. Like nroff, txt printed underlined and bold text using overstriking. overstriking). On the systems that I used, however:

  • Univac (originally using its bundled document formatter, and later DPS), was limited to printing 255 characters before it forced a carriage return (and line feed) into the output.
  • Printing overstruck text with the AJ-832 was slow. Writing a succession of underline, backspace, character (repeat) was much slower than printing a non-overstruck line.

Solving the first problem took some investigation. I consulted with an associate, who pointed out that the APL interpreter was capable of printing overstruck text wider than 80 columns. But he (having access to the Univac source) discovered that Univac handled this by setting a global variable in the terminal driver rather than making the feature usable by ordinary programs. Ultimately, he suggested a solution: use an escape sequence which would stop output to the AJ-832's just before the operating system sent a carriage return. I wrote a program to do this. That worked, but was not very interesting, since it was useful only for the AJ-832.

The second problem was interesting. Unlike nroff, txt could produce multiple levels of overstriking, corresponding to bold, bolder, etc. On the PDP-11, I wrote a program to collect each line with its overstrikes into a matrix. The program minimized the number of switches between forward and backward movement of the daisy wheel carriage by changing sequences such as

_<bs>H_<bs>e_<bs>l_<bs>l_<bs>o
to
_____<bs><bs><bs><bs><bs>Hello

The program col (which seems to have been written later), does something like this. Unlike col, my program also handled the multiple levels of overstrikes emitted by txt. On finding an overstruck segment, it would recur looking for overstrikes on top of that. (I have a mental image of the program zig-zagging up and down hills which is hard to depict in HTML).


1981
macros for DPS
DPS (Document Processing System) was one of the features of a tape from the University of Maryland that was installed on the research center's Univac system late in 1980. Here is a page which mentions that program giving a few details.

At the time, I was working on a project which kept me waiting for output for long periods of time (as long as two or three days). The computing center, to allow themselves to use the program, make it free to use for everyone in the research center. Given that my project's computing bill was comparable to my salary, this was a good deal.

I investigated DPS, and quickly found that it was much more powerful than Univac's bundled document formatter. Among other nice features, it had associative arrays which I used to make macros to generate a keyword index for the research reports that I was writing. I also wrote macros to generate table contents, list of tables, footnotes, etc. DPS had a feature which was like a queue (or FIFO). I used that to store footnotes until they were needed.

At the time, I was working on a fairly long report.

There was no way to make pictures. My reports had pictures on about 10% of the pages. But I wrote macros which printed the captions on the page. One picture which had two parts, e.g., 2.5a and 2.5b, came out on two successive pages on my first version. This incited a loud argument with an associate who said that it was one picture. I pointed out that it was two parts and that I would change it, but found that simply walking away and changing the macro made the argument go away-—like too many arguments no amount of polite discussion had any effect on my associate.

I finished the report, and moved to a different project. While it was in review, it came back for a minor correction and was given to another person who took it to a secretary to apply scissors and glue to modify the first page. We've come a long way since that point, technically speaking.


1982
editing macros
Late in 1981, I was introduced to xedit). This was before rexx) was widely available; our system never had it installed. Rather, we did have the older exec2).

Our system used CMS for text-editing and printing, as a front end for SPF). SPF had its own text editor. At the time, I remarked that

the advantage of SPF was that one could learn everything about it in one day;
the disadvantage of SPF was that one could learn everything about it in one day.
xedit macros for comment block/unblock/align as well as general align, text-justification and function-key banks exec2 was nicer – initially because it could hold 255-character words compared to exec's 8-character tokens – and about the same level as xedit's commands.
add – an adding machine in EXEC2

What made it interesting is that EXEC2 has only integers and strings. Variables held strings, and could be used in integer calculations.

ibm disassembler

I took over maintenance of a set of development utilities written (or acquired) by an associate who moved to a different division. Our development environment used IBM VM/SP CMS for editing, and MVS/TSO for batch compiles. There were no add-on tools for the front-end CMS; it had EXEC, EXEC2, XEDIT – and assembler.

One of the utility programs was just an executable which the other developer happened to have: prtrcvr (print-receiver). This was originally from one of the IBM user group tapes, and those were in the domain of the system support staff. I wanted to make some improvement to the utility – and the keeper of the tape denied the request.

Recalling the PDP-11 disassembler, I made a hexadecimal (text) dump of the utility and started examining it. Then I wrote some macros for XEDIT to select the text at the current position and (using another new program), lookup and disassemble the file, replacing the hexadecimal strings with assembler mnemonics. Piece by piece, I disassembled the 4Kb of the utility, made my improvements.


1983
freec – free-format compare
Pronounced like "freek", of course. At least one of my associates insisted on pronouncing it "free-cee".

I wrote this early in 1983. Someone told me about superc (super compare) program which had just been installed on the VM/CMS system. One of the interesting features of superc was that it could be told to ignore specific columns of the files which were compared. I used that feature to make a preprocessor for it which parsed CHILL program text into tokens with line- and column-values, and a postprocessor which reassembled the tokens into a listing of the differences between two program sources. That let it ignore whitespace changes (including split lines).

The current documentation for super compare states that it can now do word comparisons, which sounds like free-format. (It also has parsers for various comment-styles). At the time that I wrote my program, these features were unavailable in superc, and were novel to my associates.

In mid-1983, there was an internal corporate symposium. I sent abstracts for this and a few other programs. The reviewer rejected this one, stating that it had been done before. I have not found an earlier free-format compare. There are later, more well-known examples (spiff in 1988 and wdiff in 1992).

As an aside, I did present at the symposium. It was a discussion of usage monitoring that I had done for a script which made the prtrcvr user-friendly. I had modified the script to send a copy of command-lines and error/success codes to another account, so that I could collect frequency statistics. While I was maintaining that account, one of my coworkers was having trouble remembering to supply the "(" which delimited options from parameters on the CMS command-line. Seeing that he was repeating the same command many times, I sent a message back from the monitoring account telling him to add the "(". Shortly after, he came by my desk, telling me "That's some program you have there".

spasm – meta-assembler

1984
flist – file-list manager for VMS
This is covered in its own page here. In late 1983, I was introduced to dired on a BSD 4.2 system. I modified it, and found that it was not only useful (I had been using flist on IBM VM/CMS for the previous two years), but put the idea into my mind that I could write one too. In the spring of 1984, I was drafted for a new department, focusing on CAD software development. The new department used VMS, Prime and Apollo workstations.
font editor
imagen previewer
futran – Fortran porting tool
(pronounced "foo-tran")

1985
ded
This is covered in its own page here.

1986
wyse50 function keys

1987
m68k disassembler
ldcmp

1991
Ada PITS
make-generator trim comments compare ada source diff ada source
1992
odd...
1993
diffstat
This is covered in its own page here. I got involved with the tin newsreader because the computer center people could/would not help with a problem with too-long newsgroup names.
1994
Since mid-1994, most of the non-directed programs that I've written are available on the Internet.