Overview of pat2sf.c Source Code

David S. Lawyer

Nov. 1997

1. INTRODUCTION

The main program code for BitFontEdit is pat2sf. This is a program which creates bit-mapped soft-fonts for terminals and printers (printer support is not yet working). It is written in C and works for WYSE and VT220 type terminals (and a few others which can emulate these). For printers, only some unfinished code for HP PCL is included, and it cannot be used yet. To use this program one "draws" characters in an ordinary ASCII file using *'s for pixels. Any editor or word processor may be used for this purpose. This file with all the *'s (pixels) in it is called the "pattern" file. The program pat2sf (= PATtern to SoftFont) then scans this file and generates the soft-font code needed for downloading to a terminal or printer. The source code files used to generate pat2sf currently are: scan.c, encode.c, and pat2sf.h. When (and if) printer support is added there will likely be more files.

There are various "languages" used by printers and terminals. Examples for printers are ESC/P2 for Epson printers (and compatibles) and PCL5 for HP printers (and compatibles). Examples for terminals are WYSE and VT200. Each language has its own method of encoding. See the file font_langs.html.

My program operates on the pattern file in two stages. The first stage scans this file and puts the *'s (pixels) from this file into matrices in the computer memory. This is done by the code in the source file scan.c. The second stage inspects these matrices and encodes the *'s of the matrices (using the appropriate language such as WYSE) into softfont. This is done by C functions in the file encode.c. pat2sf.h is a header file for defining data types and symbolic constants, especially those common to both source code files. In order to understand the rest of this document you should have at least an elementary knowledge of the C programming language.
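As a rough illustration of the two stages, the sketch below scans one row of a pattern into a 0/1 vector and then packs that vector into a byte. The names scan_row() and pack_row(), the 8-pixel limit, and the "leftmost pixel is the high bit" packing are assumptions made for this example only. The real WYSE and VT220 encodings live in encode.c and differ in detail.

```c
#define ROW_MAX 8   /* assumed maximum row width for this sketch */

/* Stage 1: copy the pixels of one pattern row into a 0/1 vector.
 * '*' is a set pixel; anything else ('.' or ' ') is blank.
 * Returns the number of pixels stored. */
int scan_row(const char *text, int width, unsigned char row[ROW_MAX])
{
    int col;

    if (width > ROW_MAX)
        width = ROW_MAX;
    for (col = 0; col < width && text[col] != '\0'; col++)
        row[col] = (text[col] == '*');
    return col;
}

/* Stage 2: pack the 0/1 vector into one byte, leftmost pixel in the
 * high bit.  Unused low bits are left as zero. */
unsigned pack_row(const unsigned char row[ROW_MAX], int width)
{
    unsigned bits = 0;
    int col;

    for (col = 0; col < width; col++)
        bits = (bits << 1) | row[col];
    return bits << (ROW_MAX - width);
}
```

For the row "*..*.*.." this gives the bit pattern 10010100. Keeping the two stages apart means the scanner needs no knowledge of any device language.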

The source code contains copious comments and header bars dividing the code up into sections of various lengths. Each header bar has a title in it such as "parse & print header lines". You probably should study the source code as you read this document. Header bar titles are put in quotes in this document.

Since my font program is not expected to be used by a large number of people, it is not worthwhile to provide it with an "idiot proof" user interface or an extensive help system. Thus it uses a simple command line interface with some interactive dialog in case of certain errors.

This document is often not changed when changes are made to the source code, so sometimes you may see obsolete variable names, etc. Once changes pile up (every year or so?) it is, hopefully, revised.

2. SCAN.C main()

First let's examine the scan.c file. This contains the main() function which gets things going, the fill_band() function which reads the pixels and puts them into matrices, and a number of functions to report errors in the layout of the pattern file. main() then calls on functions in encode.c to do the actual encoding. You might think that only a few pages of source code are required for scan.c. This might be possible if one were to omit extensive documentation and error checking. The actual size of scan.c is now about 13 pages of paper and growing.

Before main() starts, about 2 pages (100 lines) of code are used for declaring global and static variables, and also for forward declarations (function prototypes). Next comes main(), starting off with a half-page of local (automatic) variables.

The program keeps track of which line number (line_no) of the pattern file it is scanning. It does this with the foffset[] vector, which maps line numbers to locations in the file. At various places in scan.c you will see line_no being bumped and foffset[] being set by calls to ftell(), which returns (tells) the current location in the file. The location is needed in err_show_and_tell(), where lines from the pattern file that are incorrectly formatted (err => error) are displayed to the user.
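The line-to-offset idea can be sketched as follows. This is not the code from scan.c: the function names index_lines() and reread_line(), the 0-based line numbering, and the 256-byte read buffer are assumptions for the example. Only ftell(), fseek(), fgets() and rewind() from the standard C library are used.

```c
#include <stdio.h>
#include <string.h>

/* Record in offsets[] the file position at which each line starts.
 * Lines are numbered from 0 here.  Returns the number of lines found
 * (at most max_lines). */
int index_lines(FILE *fp, long offsets[], int max_lines)
{
    char buf[256];
    int line_no = 0;
    int at_line_start = 1;
    long pos;

    rewind(fp);
    pos = ftell(fp);
    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (at_line_start) {
            if (line_no == max_lines)
                break;
            offsets[line_no++] = pos;
        }
        /* A chunk without '\n' means the line continues in the next read. */
        at_line_start = (strchr(buf, '\n') != NULL);
        pos = ftell(fp);
    }
    return line_no;
}

/* Seek back to a recorded line and read it again, e.g. to show an
 * offending line in an error message. */
char *reread_line(FILE *fp, const long offsets[], int line_no,
                  char *buf, int size)
{
    if (fseek(fp, offsets[line_no], SEEK_SET) != 0)
        return NULL;
    return fgets(buf, size, fp);
}
```

Storing offsets rather than the lines themselves keeps memory use small, at the cost of a seek when an error has to be displayed.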

The first task of main() is to parse the command line using the C getopt_long() function. It does not parse all the options, however. The options specific to the particular device language are left for parsing later by functions in encode.c. These device-specific options are placed after the name of the pattern_file on the command line, while the general options are placed before the pattern_file. Actually, the device-specific options just remain where they were (pointed to by argv[]), and the index optind (which comes with the getopt_long() function) is set to point to them. Thus getopt_long() may be called again from encode.c and it will resume parsing the command line at the location where it left off.
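A minimal sketch of such two-pass parsing is given below. The option names (--pad, --stdin, --bank), the struct general_opts, and both function names are hypothetical. The leading '+' in the option string is a GNU convention which makes getopt_long() stop at the first non-option argument; whether pat2sf relies on it is an assumption, not something taken from the source.

```c
#include <getopt.h>
#include <stdlib.h>

struct general_opts {
    int pad_level;
    int use_stdin;
    const char *pattern_file;
};

/* First pass: general options only.  Parsing stops at the first
 * non-option argument, taken here to be the pattern file name. */
int parse_general(int argc, char **argv, struct general_opts *g)
{
    static const struct option longopts[] = {
        {"pad",   required_argument, NULL, 'p'},
        {"stdin", no_argument,       NULL, 's'},
        {NULL, 0, NULL, 0}
    };
    int c;

    g->pad_level = 0;
    g->use_stdin = 0;
    g->pattern_file = NULL;
    while ((c = getopt_long(argc, argv, "+p:s", longopts, NULL)) != -1) {
        switch (c) {
        case 'p': g->pad_level = atoi(optarg); break;
        case 's': g->use_stdin = 1;            break;
        default:  return -1;                   /* unknown option */
        }
    }
    if (!g->use_stdin && optind < argc)
        g->pattern_file = argv[optind++];      /* step past the file name */
    return 0;
}

/* Second pass: language-specific options.  optind already indexes the
 * first of them, so parsing resumes where the first pass stopped. */
int parse_lang(int argc, char **argv, int *bank)
{
    static const struct option longopts[] = {
        {"bank", required_argument, NULL, 'b'},
        {NULL, 0, NULL, 0}
    };
    int c;

    *bank = 0;
    while ((c = getopt_long(argc, argv, "+b:", longopts, NULL)) != -1) {
        if (c != 'b')
            return -1;
        *bank = atoi(optarg);
    }
    return 0;
}
```

With a command line such as "pat2sf --pad 3 font.pat --bank 2", the first pass would see only --pad and the file name, and the second pass only --bank.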

Then the options are examined. One case is "caseH (help)". If one asks for help or wants to display the version number, no attempt should be made to run the program. This is tested by: if i == 0, all is OK (AOK). If no lang parameter was given to help, the usage message is displayed and the program exits. If lang == "list", the list of languages is displayed by calling ListLangs() in encode.c. "get & test lang" gets a language from the command line but exits if it is null.

There are a number of function pointers in scan.c, the most important of which is used to encode a character matrix of *'s into soft-font. It is named encodeChP(). These function pointers serve any font language and must be set (in "set funct ptrs") to point to the language-specific functions defined in encode.c. This is done by passing the addresses of the function pointers to Set_Fptrs(), found in encode.c. This function puts the addresses of the actual functions to be used into the function pointers of scan.c. If the language is invalid (or perhaps out of place on the command line), Set_Fptrs() will catch it and exit.
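The mechanism can be sketched roughly as below. The typedef encode_fn, the (argc, argv, flag) signature, the lookup table, the function Vt220Encode() and the name set_fptrs_sketch() are all assumptions for illustration. The sketch handles only the encode pointer and returns -1 for an unknown language, whereas the text above says the real function exits.

```c
#include <stddef.h>
#include <string.h>

/* Assumed signature of an encode function. */
typedef char *(*encode_fn)(int argc, char **argv, int flag);

/* Stand-ins for the language-specific functions in encode.c. */
static char *WyseEncode(int argc, char **argv, int flag)
{
    static char name[] = "wyse";
    (void)argc; (void)argv; (void)flag;
    return name;
}

static char *Vt220Encode(int argc, char **argv, int flag)
{
    static char name[] = "vt220";
    (void)argc; (void)argv; (void)flag;
    return name;
}

/* Store the encoder for `lang` through the pointer-to-pointer
 * encodeCh.  Returns 0 on success, -1 if the language is unknown. */
int set_fptrs_sketch(const char *lang, encode_fn *encodeCh)
{
    static const struct {
        const char *name;
        encode_fn fn;
    } table[] = {
        {"wyse",  WyseEncode},
        {"vt220", Vt220Encode}
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(lang, table[i].name) == 0) {
            *encodeCh = table[i].fn;
            return 0;
        }
    }
    return -1;
}
```

The caller would declare "encode_fn encodeChP;", pass &encodeChP, and afterwards call (*encodeChP)(argc, argv, flag) without knowing which language was chosen.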

Next the options specific to the particular font language (from the command line) are read, set and returned (to the string lang_opts) by calling encodeChP() with the flag SETOPTS. The options set are static variables in encodeChP() which remain until it is called again to do the actual encoding. We can't call on it to encode now since the pattern file has not been scanned yet. The options are checked this early because it would be pointless to do the work of scanning the pattern file first, only to find that either the language or the language-specific options were invalid.

Next (unless we are using standard input) we "open pattern file" and look for "garbage at end of cmd line". Some input data is provided by the pattern_file besides just the pixels. The Cell_Height, Cell_Width and Chars_Per_Band must be specified by numbers typed into the pattern file. Chars_Per_Band is the number of character matrices which are in a single row in the pattern file (often 8 for low resolution terminal font). In "parse & print header lines" the above information is obtained from header_1. header_2 is a comment which is read and echoed to the screen. For details, see the manual section: "pattern input file format".

In "pack & print Font_info", the info from header_1 and the Pad_level (the number of nulls to put into each line of soft-font, for dumb terminals only) are packed and saved for future use in the string Font_info. header_1 and header_2 are also put into Font_info. Font_info is sent to the screen at the start of running pat2sf, at the end of running pat2sf, and is often added as a comment to the soft-font itself in order to help identify the soft-font in case other documentation or notes about the soft-font are missing.

For dumb terminals, these 3 strings are simply appended to the softfont. Downloading the softfont then also downloads (hopefully) harmless comment-garbage at the end of it. All softfont languages should provide for comments.

In "check for bad size data", the 3 pieces of information obtained from the pattern_file are checked to make sure they are within reasonable bounds. Symbolic constants such as CELL_HEIGHT_MAX are found in pat2sf.h and may be changed to permit larger size characters. At present, array size is determined by these constants. A future project is to make array size dynamic so one will not need to recompile the code to use large size characters. However, ANSI C doesn't support dynamic arrays (but GNU C does).
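A bounds check of this kind might look like the sketch below. Only the name CELL_HEIGHT_MAX comes from the text above. The other two constants, all three limit values, the "three integers on one line" layout of header_1, and the function name are assumptions for the example (the real header format is described in the manual).

```c
#include <stdio.h>

#define CELL_HEIGHT_MAX     32   /* assumed value */
#define CELL_WIDTH_MAX      32   /* hypothetical constant */
#define CHARS_PER_BAND_MAX  16   /* hypothetical constant */

/* Read height, width and chars-per-band from header_1 and check them.
 * Returns 0 if all is well, -1 if three numbers were not found,
 * -2 if a number is out of bounds. */
int parse_size_header(const char *header_1,
                      int *height, int *width, int *per_band)
{
    if (sscanf(header_1, "%d %d %d", height, width, per_band) != 3)
        return -1;
    if (*height < 1 || *height > CELL_HEIGHT_MAX)
        return -2;
    if (*width < 1 || *width > CELL_WIDTH_MAX)
        return -2;
    if (*per_band < 1 || *per_band > CHARS_PER_BAND_MAX)
        return -2;
    return 0;
}
```

Since the arrays are sized by these constants, rejecting oversize values here is what keeps the later scanning code from writing outside its arrays.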

Two types of blank (background) pixels are allowed: dots and spaces. Spaces look better because they let one see only the * pixels. Dots show where the empty pixels reside. A pattern_file may use only one type of blank pixel. "get blank_pixel type" probes the pattern file to determine the type.

"fill & encode band" calls encodeChP() to encode each character. First the header for each band (in the pattern file) is read and echoed. Then function fill_band() is called to scan a band in the pattern file and put it into the band[][][] array. For each band we cycle thru the characters (Char_no) in that band giving encodeChP() a Char_matrix (presently passed not as a par. but as a global pointer) of *'s for each character. If a bounding box is needed for the character (used in X-windows) get_BB() is called.

The final section, "print ending information", calls another function pointer defined in encode.c, addEndP(Font_info), to append ending code to the soft-font. For some printers there will also be beginning info, but that is not in the code yet. Font_info may be added to the soft-font as a comment. For some terminals it is just added raw to the end of the code and may get displayed on the screen when the soft-font is downloaded. Font_info is also printed to the display again.

3. SCAN.C functions

Besides main() there are a number of other functions in scan.c. The function that actually scans the patterns in the pattern file and puts them into the band[][][] array is fill_band(). get_BB() gets the "Bounding Box" by inspection of the Char_matrix slices of the band[][][] array.
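To make the band[][][] idea concrete, here is a small sketch that fills such an array from text rows held in memory and then finds a bounding box for one character. The index order [char][row][column], the single separator column between characters, the array limits, the struct bbox, and both function names are assumptions for the example. The real fill_band() reads from the pattern file with fscanf().

```c
#include <string.h>

#define CELL_MAX      16   /* assumed limits for this sketch */
#define PER_BAND_MAX   8

static char band[PER_BAND_MAX][CELL_MAX][CELL_MAX]; /* [char][row][col] */

struct bbox { int left, right, top, bottom; };      /* inclusive edges */

/* rows[r] holds one text line of the band, with the characters drawn
 * side by side and one separator column between them.
 * Returns per_band, or -1 if a size exceeds the array limits. */
int fill_band_sketch(const char *rows[], int height, int width, int per_band)
{
    int ch, r, c;

    if (height > CELL_MAX || width > CELL_MAX || per_band > PER_BAND_MAX)
        return -1;
    for (r = 0; r < height; r++) {
        size_t len = strlen(rows[r]);
        for (ch = 0; ch < per_band; ch++) {
            for (c = 0; c < width; c++) {
                size_t pos = (size_t)ch * (size_t)(width + 1) + (size_t)c;
                band[ch][r][c] = (pos < len && rows[r][pos] == '*');
            }
        }
    }
    return per_band;
}

/* Find the smallest box holding every set pixel of one character.
 * Returns 1 if any pixel is set, 0 if the character is empty. */
int get_bb_sketch(int char_no, int height, int width, struct bbox *bb)
{
    int r, c, found = 0;

    bb->left = width;  bb->right = -1;
    bb->top = height;  bb->bottom = -1;
    for (r = 0; r < height; r++) {
        for (c = 0; c < width; c++) {
            if (!band[char_no][r][c])
                continue;
            if (c < bb->left)   bb->left = c;
            if (c > bb->right)  bb->right = c;
            if (r < bb->top)    bb->top = r;
            if (r > bb->bottom) bb->bottom = r;
            found = 1;
        }
    }
    return found;
}
```

Because one text line of the pattern file carries a row from every character in the band, the scan has to distribute each line across all the character matrices, which is why the whole band is filled before any character is encoded.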

There are also several error functions: ..._err. They are called by fill_band() if an error is detected in the format of the pattern file. Errors are present if fscanf() returns 0 or -1, or if the number of pixels gotten in a scan (with fscanf()) is not equal to the character width. Almost every error function calls two other general error functions: 1. err_show_and_tell(), which displays the offending lines on the display and tells something about them; 2. err_ask(), which asks the user if s/he wants to continue looking for more errors, wants help, or wants to quit.
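The two error conditions can be illustrated with sscanf() on a line held in memory. This sketch handles only dot-style blank pixels, and the function name, the status codes and the 63-pixel buffer limit are assumptions for the example rather than anything taken from scan.c.

```c
#include <stdio.h>
#include <string.h>

enum row_status { ROW_OK = 0, ROW_NO_PIXELS = 1, ROW_BAD_WIDTH = 2 };

/* Scan one run of pixels ('*' or '.') from a line and compare its
 * length with the expected character width. */
enum row_status check_row(const char *line, int width)
{
    char pixels[64];
    int got = sscanf(line, " %63[*.]", pixels);

    if (got != 1)                       /* 0: no match, EOF: empty input */
        return ROW_NO_PIXELS;
    if ((int)strlen(pixels) != width)   /* too few or too many pixels */
        return ROW_BAD_WIDTH;
    return ROW_OK;
}
```

A caller would turn a non-zero status into a call to the matching ..._err function, which in turn shows the offending line and asks the user what to do next.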

The usage() function shows how to run the program. It shows how to type the command line in the correct format and how to ask for help on various topics. Each language has its own usage function in encode.c, and this information (for the selected language, obtained via usageP()) is appended to the general usage information in scan.c.

4. ENCODE.C

To add another language (e.g. to add more printers) one only needs to add more code (and modify some of the existing code) in the file encode.c. This file actually does more than just encode soft-font for various languages. It provides language-specific help, scans the command line, and sets up function pointers. For each type of font there are 4 functions (listed here as 5 items, since the encode function plays two roles), all grouped together for ease of maintenance. These are:
1. an encode function, encode(,,ENCODE), which encodes a single character.
2. a get options (getopt) function, encode(,,GETOPT), which scans the command line for parameters specific to this particular language.
3. an add_ending function, endFont(), which adds a trailer to the end of the softfont code.
4. a size check function, SizeChk(), to see if enough storage is available in this program to store the font bit-maps.
5. a usage() message to show what special parameters need to be given on the command line for this language.

Note that functions 1. and 2. are actually the same function called with different flags (ENCODE or GETOPT). This is because the GETOPT part of the code gets options (parameters) from the command line (typed by the user) which will be needed when the ENCODE part of the function is called. These parameters are made function static variables. They are set when the encode(,,GETOPT) call is made and used in later encode(,,ENCODE) calls. Thus these parameters do not need to be passed to the function each time. They are "read" by the function and stored inside the function as static variables, which remain there for future use.
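The static-variable technique might be sketched like this. DemoEncode(), its single "bank" option taken from argv[1], and the output strings are all hypothetical. Only the idea of one function with a GETOPT mode and an ENCODE mode is taken from the text above.

```c
#include <stdio.h>
#include <stdlib.h>

enum { GETOPT, ENCODE };

/* One function, two roles.  The option read in GETOPT mode is kept in
 * a function static variable and used by later ENCODE calls. */
char *DemoEncode(int argc, char **argv, int flag)
{
    static int bank = 0;     /* remembered between calls */
    static char out[32];

    if (flag == GETOPT) {
        if (argc > 1)
            bank = atoi(argv[1]);
        snprintf(out, sizeof out, "bank=%d", bank);
        return out;          /* options echoed back as a string */
    }
    /* ENCODE: no options are passed in; the stored ones are used. */
    snprintf(out, sizeof out, "<char in bank %d>", bank);
    return out;
}
```

The price of this convenience is that the function is not reentrant: it can serve only one set of options at a time, which is acceptable for a program that generates one font per run.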

To enable the encode(argc, argv, GETOPT) function to read (or scan) the command line, it is passed the pointer argv as well as the argument count argc, which together represent the entire command line. Finding where the language-specific options start is easy since optind should point to the first one. The function getopt_long() is used for this scanning. It is freely available (it is part of the GNU C library) but may not be available in older C compilers for MS-DOS (such as Turbo C++). The main() program also scans the command line for user-supplied parameters common to all font types: 1. the name of the pattern file to open; 2. whether null padding should be added to the softfont and, if so, how much; 3. whether the pattern file should be read from standard input.

Since there would be a name conflict if, say, all encode() functions for the various types of font had the same name (in the same file), they are given prefixes. For example, font for Wyse terminals is encoded using WyseEncode(). When this function is called from the main() program for any type of font, the call is always the same: (* encodeChP) (argc, argv, ...). encodeChP stands for "Encode-Ch-Pointer". If we were encoding font for a Wyse terminal, encodeChP would point to WyseEncode. (* encodeChP) is the actual function (dereferenced).

The situation is similar for all of the 5 functions listed above (actually 4 functions, since encode() incorporates two functions in one). Before one may use these pointers to functions in the main() code, they must be assigned to point to the correct functions in encode.c. This is done by calling from main() to a function in encode.c, set_Fptrs (Set Function Pointers), to set up these pointers. In the call, the addresses of pointers such as encodeChP are passed as parameters, and set_Fptrs puts the correct pointers into these addresses. Thus each parameter is actually a pointer to a pointer. For the encode function the declaration of this parameter is: char * (*(*encodeCh))(). This "simply" says that encodeCh is a pointer to a pointer to a function that returns a string (= a pointer to a char). Note that the actual encode() function takes 3 parameters but they are not shown here.

The main() program sends the name of the device language to set_Fptrs so that set_Fptrs knows how to make the function pointer assignments. This language name should be given by the user on the command line. But what if someone doesn't know the name? By typing just "pat2sf" with no parameters, a usage message is displayed which shows how to ask for a list of such names. --help list will show a list of the languages supported. --help wyse will display the usage() message for Wyse font. It may not be a great deal of help but it is better than nothing.

Variables common to the encode.c file are divided into two categories: file variables valid only in this file (static) and global variables which represent data supplied to this file from the main file. The use of function pointers and information hiding using static file variables is sort of an object-oriented approach. The 5 functions for a particular type of font and the file-static variables used by these functions are nearly equivalent to an object in C++.