Command-line arguments are processed in strict order, so that options interspersed between file names affect only those files which follow.
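The strict left-to-right rule can be pictured with a small sketch. This is not the tgrind script itself (which is written in csh); the function name `assign_options` and the file names are hypothetical, and only the -l option is modeled, in both its attached and space-separated forms.

```python
def assign_options(argv):
    """Pair each file name with the language option in force when it is reached."""
    language = "c"  # C is the documented default language
    result = []
    args = iter(argv)
    for arg in args:
        if arg == "-l":
            # Space-separated form: the value is the next argument.
            language = next(args)
        elif arg.startswith("-l"):
            # Attached form, e.g. -lf.
            language = arg[2:]
        elif arg.startswith("-"):
            # Other options are not modeled in this sketch.
            continue
        else:
            result.append((arg, language))
    return result


print(assign_options(["a.c", "-l", "m2", "b.mod", "-lf", "c.f"]))
```

Here a.c is processed with the default language, because the first -l option appears only after it on the command line.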
tgrind is not a prettyprinter: all line breaks and horizontal spacing in the input source files are preserved. Prettyprinters require more sophistication, and language-specific knowledge, than is possessed by tgrind. Consequently, you may find it useful to apply a prettyprinter to your source code before giving it to tgrind. Some of the available ones are: bibclean(1) for BibTeX, cb(1) and indent(1) for C and C++, pretty(1) for Fortran, sf3pretty(1) for Fortran and Sftran3, indent-sexp for GNU Emacs Lisp, mft(1) for Metafont, and pindent(1) for Pascal.
In regular mode tgrind processes its input file(s) and passes them to tex(1) for formatting and output, and then sends the output to a DVI driver for conversion to PostScript. All output files are normally left in the /tmp directory with numeric base names; the -o option (see below) can change this behavior. tgrind will not overwrite existing files in that directory.
In format mode (i.e., when the -f flag is used), tgrind processes its input file(s) and writes the result to standard output. This output can be saved for later editing, inclusion in a larger document, etc.
The options are:
- AvantGarde (pag),
- Bookman (pbk),
- Charter (bch),
- Courier (pcr),
- Helvetica (phv),
- HelveticaNarrow (phn),
- NewCentury or NewCenturySchoolbook (pnc),
- Palatino (ppl),
- Times (ptm), and
- Utopia (put).
All of these fonts, except Charter and Utopia, and sometimes HelveticaNarrow, are resident in standard PostScript laser printers. The exceptional fonts will be included in the output PostScript by the DVI driver program, if the driver's psfonts.map file correctly identifies them as non-resident fonts.
When this option is provided, the PostScript output will also use Courier for a typewriter font, and Symbol for certain special characters; both of these fonts are printer resident.
- a68
  Motorola 68xxx assembly language.
- ada
  Ada.
- asm68
  Another Motorola 68xxx assembly language.
- awk
  awk(1), gawk(1), and nawk(1).
- bash
  GNU Bourne-Again shell (bash(1)).
- bibtex
  BibTeX (bibtex(1)).
- c
  C (the default language).
- caml
  CAML.
- c++
  C++ and Objective C.
- csh
  C shell (csh(1)).
- Elisp
  GNU Emacs Lisp. Keywords are considered to be all of those low-level Lisp functions that are implemented in the Lisp interpreter itself (in the C programming language); higher-level Lisp functions written in Lisp are not keywords.
- f
  Fortran.
- i
  ISP.
- I
  Icon.
- ksh
  Korn shell (ksh(1)).
- latex
  LaTeX 2.09. Keywords are considered to be all of the control sequences named in the index of the first edition of Leslie Lamport, LaTeX User's Guide and Reference Manual, Addison-Wesley (1985), ISBN 0-201-15790-X. As with Emacs Lisp and other extensible languages, it seems reasonable to distinguish built-in `system' commands from `user' commands.
- maple
  Maple V. Keywords include the language keywords, operators, constants, and standard global variables.
- maplex
  Extended Maple V. The keywords also include all of the initially-loaded library functions.
- matlab
  Matlab.
- m
  MODEL.
- m2
  Modula-2.
- Miranda
  Miranda.
- ml
  MLisp and Emacs Lisp.
- objc
  Objective C.
- p
  Pascal.
- prolog
  Prolog.
- ps
  PostScript.
- r
  Ratfor.
- russell
  Russell.
- sh
  Bourne shell (sh(1)).
- sf3
  Sftran3.
- src
  Unknown source code (no keywords, comments, or strings are recognized).
- tcsh
  Extended C shell (tcsh(1)).
- tex
  TeX.
- y
  yacc.
The marginal-function-name mechanism depends on the quality of the language description in vgrindefs. The distributed vgrindefs file fails to recognize many legal C function declarations.
Arbitrary formatting styles for programs mostly look bad. The use of spaces to align source code often fails miserably (because of the variable width output font). If you plan to tgrind your program, try to use tabs.
The -f flag means different things to tgrind and vgrind(1).
tgrind is a UNIX csh(1) script that handles argument parsing, and invocation of the preprocessor, indexer, TeX, and a DVI driver. It should be possible to reimplement this script in other operating systems, if they have a reasonably powerful shell command language.
The indexing program, tgrindex.awk, is written in nawk(1), and can be readily handled by GNU gawk(1) as well. Commercial and freely-distributable implementations of these languages are available for several personal computer operating systems, and for DEC VMS.
Volunteers for ports of tgrind to other operating systems will be most welcome!
Keys are either Boolean flags, in which case they take no =value string (the flag is set true if the key is present, and false if it is absent), or else string variables whose values are specialized patterns, jokingly referred to as irregular expressions, vaguely similar to the regular expressions recognized by the UNIX ex(1) editor and lex(1) lexical-analyzer generator.
In tgrind patterns, the characters `$', `(', `)', `:', `?', `^', `|', and `\' are reserved characters: they must be quoted with a preceding \ if they are to be interpreted as normal characters. Otherwise, they have these meanings:
The extended patterns are:
- ^
  Beginning of line.
- $
  End of line.
- :
  Key-value capability pair delimiter.
- \
  Escape character. Two such characters, \\, represent a single backslash.
- \a
  Matches any number of characters (like `.*' in lex(1)).
- \d
  Matches any number of whitespace delimiters (space, tab, newline, start of line).
- \p
  Matches any number of alphanumeric characters. In a procedure definition (the pb key), the string that matches this symbol is used as the procedure name.
- |
  Alternation.
- ( )
  Grouping, used mostly for alternation and optionality.
- ?
  Last item is optional (i.e., occurs zero times or one time).
- \e
  Preceding any string means that the string will not match an input string if the input string is preceded by an escape character (\). This is typically used for languages (like C) that can include the string delimiter in a string by escaping it.
Unlike other implementations of regular expressions, these patterns match words and not characters. Hence something like (foo|bar)mumble? would match foo, bar, foomumble, or barmumble. In tgrind patterns, alternation binds very tightly, so grouping parentheses are likely to be necessary in expressions involving alternation.
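To make the word-versus-character distinction concrete, the sketch below compares two hand-written Python regular expressions for the example pattern (foo|bar)mumble?. It does not use tgrind's matcher; the Python translations are assumptions chosen only to mimic the behavior described above.

```python
import re

# Hand translation of the tgrind reading: ? applies to the whole word "mumble",
# so in Python the word has to be grouped explicitly.
word_level = re.compile(r"(?:foo|bar)(?:mumble)?")

# The same text read as an ordinary character-level regular expression:
# here ? applies only to the final letter "e".
char_level = re.compile(r"(?:foo|bar)mumble?")

for text in ["foo", "bar", "foomumble", "barmumble", "foomumbl"]:
    print(text,
          bool(word_level.fullmatch(text)),
          bool(char_level.fullmatch(text)))
```

Under the word-level reading, foo and bar match and foomumbl does not; under the character-level reading it is the other way around.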
The string value of id and kw is treated as an ordinary string, rather than a pattern: backslash has significance only at end-of-line, or before a colon.
Here are the capability keys that are currently used in the vgrindefs file, and in the source code file, tfontedpr.c:
- ab
  Alternate comment begin.
- ae
  Alternate comment end.
- bb
  Begin statement block.
- be
  End statement block.
- cb
  Comment begin.
- ce
  Comment end.
- ic
  Define extra characters that may appear as initial characters of procedure names (those that match \p) and keywords, beyond the hard-wired defaults of letters, digits, and underscore. This supports languages that place restrictions on the initial characters of identifiers. If this key is not provided, then initial characters are treated the same as non-initial characters. This key does not exist in vgrind(1) implementations.
- id
  Define extra characters that may appear in procedure names (those that match \p) and keywords, beyond the hard-wired defaults of letters, digits, and underscore. This supports languages, like Lisp and TeX, that have a more extensive character set for identifiers. This key does not exist in older vgrind(1) implementations; it may have been introduced first by Sun Microsystems in the Solaris 2.x operating system release.
- kw
  Language keywords (a space-separated list, usually in alphabetical order for readability, though that is not a requirement).
- lb
  Literal string begin.
- le
  Literal string end.
- nc
  Define characters that may not appear as initial characters of procedure names (those that match \p) and keywords. This provides a way to remove initial identifier characters from the hard-wired defaults of letters, digits, and underscore. Its value is examined after any ic value. This key is not available in vgrind(1).
- ni
  Define characters that may not appear in procedure names (those that match \p) and keywords. This provides a way to remove identifier characters from the hard-wired defaults of letters, digits, and underscore. Its value is examined after any id value. This key is unique to tgrind; it is not available in vgrind(1).
- oc
  (Boolean) one case flag: letter case is not significant.
- pb
  Procedure (function, subroutine) begin.
- sb
  Character string begin.
- se
  Character string end.
- tc
  If this key appears, it must be last. Its value is the name of another vgrindefs entry that is looked up and appended to the end of the current entry, minus the initial entry names. That entry in turn may end with a tc key that refers to yet another entry, and so on, up to a limit of 32 (to catch non-terminating loops). If the same key appears more than once in the constructed entry, only the first value is used. Thus, tc can be used to prepare minor variations on a basic language definition.
- tl
  (Boolean) top lex flag: procedures may be defined only at top level, that is, nested procedures are not permitted.
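The tc chaining rule described above can be sketched as follows. This is an illustration only, not the code in tfontedpr.c: the entry names `base` and `variant` are hypothetical, entries are assumed to be already parsed into ordered (key, value) pairs, and the exact way the limit of 32 is counted is an assumption.

```python
MAX_TC_DEPTH = 32  # limit stated in the text; guards against reference loops


def resolve(entries, name):
    """Return the effective capabilities for `name`, following tc references."""
    merged = {}
    depth = 0
    while name is not None:
        if depth > MAX_TC_DEPTH:
            raise RuntimeError("tc chain too long (possible loop)")
        next_name = None
        for key, value in entries[name]:
            if key == "tc":
                next_name = value  # tc must be the last key of an entry
                break
            merged.setdefault(key, value)  # only the first value of a key is used
        name = next_name
        depth += 1
    return merged


# Hypothetical entries: "variant" overrides the comment delimiters of "base".
entries = {
    "base": [("cb", "{"), ("ce", "}"), ("sb", '"'), ("se", '"')],
    "variant": [("cb", "(*"), ("ce", "*)"), ("tc", "base")],
}
print(resolve(entries, "variant"))
```

Because the first value wins, a variant entry only needs to list the keys that differ and then name the basic definition in its final tc key.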
Keys are always exactly two characters long, and the equals sign that separates them from their values must follow immediately, without intervening whitespace.
If you need a single backslash in a string, represent it like this: :id=\:. vgrind(1), and older versions of tgrind, do not permit this, because their simplistic scan assumes that backslash-colon does not terminate the string. Alternatively, since backslash is significant only before colon and newline in id and ni strings, you could also write :id=\a:, since `a' is already in the identifier character set.
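The way id and ni adjust the identifier character set can be sketched with ordinary set operations. This is a minimal illustration, assuming only what the key descriptions state (id adds to the hard-wired defaults, and ni is examined afterwards and removes characters); the function name `identifier_chars` and the sample values are hypothetical.

```python
import string

# Hard-wired defaults named in the text: letters, digits, and underscore.
DEFAULTS = set(string.ascii_letters + string.digits + "_")


def identifier_chars(id_value="", ni_value=""):
    """Apply an id value first, then remove any characters listed in ni."""
    return (DEFAULTS | set(id_value)) - set(ni_value)


# A Lisp-like language might add punctuation to its identifiers.
lisp_like = identifier_chars(id_value="-*+")
print("-" in DEFAULTS, "-" in lisp_like)

# An id value of backslash followed by `a' adds only the backslash,
# because `a' is already in the default set.
print(identifier_chars(id_value="\\a") - DEFAULTS)
```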
Let's dissect a typical entry to see how this works:
    modula2|mod2|m2:\
        :pb=(^\d?(procedure|function|module)\d\p\d|\(|;|\:):\
        :bb=\d(begin|case|for|if|loop|record|repeat|while|with)\d:\
        :be=\dend|;:\
        :cb={:\
        :ce=}:\
        :ab=\(*:\
        :ae=*\):\
        :sb=":\
        :se=":\
        :oc:\
        :kw=and array begin by case const definition div \
        do else elsif end exit export for from if \
        implementation import in loop mod module not of \
        or pointer procedure qualified record repeat \
        return set then to type until var while with:
Each line after the first conventionally begins with a tab, although this is not required, and if the next character is a colon, a key name follows. Terminal backslashes indicate line continuation.
Multiple key=value pairs can be given on one line, as long as they are separated by colons, so at the loss of readability, we could compact seven lines of this entry into just one, like this:
    :cb={:ce=}:ab=\(*:ae=*\):sb=":se=":oc:\
The first line in our sample entry says that this language may be named modula2, mod2, or m2 in the tgrind -l option.
The pb line says that a procedure definition begins a line with optional whitespace, followed by one of the keywords procedure, function, or module, followed by optional whitespace, followed by an alphanumeric procedure name. That name in turn may be followed by whitespace, an open parenthesis, a semicolon, or a colon, thanks to the tight binding of alternation. It would have been clearer to include grouping parentheses, writing `(\d|\(|;|\:)'.
The bb line says that a statement block starts with optional whitespace, and one of the keywords begin ... with, and the be line says a statement block ends with optional whitespace, followed by either the keyword end, or a semicolon. The cb and ce lines say that comments are delimited by braces, and the ab and ae lines say that comments may also be delimited by (* *). The sb and se lines say that strings are delimited by quotation marks, and the oc flag says that letter case is not significant in names (this seems to be in error: Niklaus Wirth's Programming in Modula-2, Springer-Verlag (1983), ISBN 0-387-12206-0, says that upper and lower case letters are distinct). Finally, the kw lines enumerate all of the Modula-2 language keywords, from and to with.
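The layout rules just dissected (continuation backslashes, colon-separated fields, two-character keys, Boolean flags without a value) can be sketched as a small parser. This is not the parser in tfontedpr.c: the function name `parse_entry` is hypothetical, the entry is an abbreviated copy of the Modula-2 example with a shortened kw list, and a backslash-colon is simply treated as an escaped colon, which ignores the special case of a value ending in a backslash.

```python
import re

# Abbreviated copy of the sample entry (kw list shortened for the sketch).
ENTRY = r"""modula2|mod2|m2:\
    :pb=(^\d?(procedure|function|module)\d\p\d|\(|;|\:):\
    :cb={:\
    :ce=}:\
    :oc:\
    :kw=and array begin \
    do else elsif:
"""


def parse_entry(text):
    """Split one vgrindefs-style entry into its names and a capability dict."""
    # Terminal backslashes continue the line; drop them and the next line's indent.
    joined = re.sub(r"\\\n\s*", "", text)
    # Fields are separated by colons that are not escaped by a backslash.
    fields = [f for f in re.split(r"(?<!\\):", joined) if f.strip()]
    names = fields[0].split("|")
    caps = {}
    for field in fields[1:]:
        key, rest = field[:2], field[2:]
        if rest.startswith("="):
            caps.setdefault(key, rest[1:])  # string-valued key
        elif rest == "":
            caps.setdefault(key, True)      # Boolean flag such as oc
        else:
            raise ValueError("malformed field: " + field)
    return names, caps


names, caps = parse_entry(ENTRY)
print(names)
print(caps["pb"])
print(caps["kw"].split())
```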
    \def \FontName {NewCenturySchoolbook}

followed by a line

    \input tgrindmac

The interpretation of the font name is handled entirely in the TeX file, tgrindmac.tex. For this example, a line in that file says

    \ifstreq{\FontName}{NewCenturySchoolbook} \def \FontName {pnc} \fi

This replaces the definition of \FontName with pnc. A few lines later, we find

    \ifstreq{\FontName}{pnc} \setfonts pnc r ri b. \fi

When \FontName has the value pnc, \setfonts is executed with four arguments: the basename of the virtual font, and the suffixes to be added to it to name upright, italic, and bold fonts.
Thus, TeX will expect to find in its TEXFONTS search path the TeX font metric files pncr.tfm, pncri.tfm, and pncb.tfm, and the DVI driver will expect to find in the same search path the virtual font files pncr.vf, pncri.vf, and pncb.vf.
Besides these files, \setfonts will generate references to fonts pcrro and psyr for typewriter text and special symbols, so TeX will also need pcrro.tfm and psyr.tfm, and the DVI driver will need pcrro.vf and psyr.vf.
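The file names that result from one \setfonts call can be listed mechanically. The sketch below only restates the naming scheme described above; the function name `required_font_files` is hypothetical and nothing here consults a real TEXFONTS path.

```python
def required_font_files(base, suffixes, extra=("pcrro", "psyr")):
    """List the .tfm files TeX needs and the .vf files the DVI driver needs."""
    # Family fonts are the basename plus each suffix; the extra fonts are the
    # typewriter and symbol fonts that \setfonts also references.
    fonts = [base + suffix for suffix in suffixes] + list(extra)
    tfm = [name + ".tfm" for name in fonts]
    vf = [name + ".vf" for name in fonts]
    return tfm, vf


tfm, vf = required_font_files("pnc", ["r", "ri", "b"])
print(tfm)
print(vf)
```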
If all of the referenced fonts exist on the system, TeX and the DVI driver will handle the rest of the job automatically, and if you added two lines similar to the ones above to a private copy of tgrindmac.tex to define a new font family, you can stop reading this section now.
However, here's what goes on behind the scenes. The virtual font files contain references to the so-called `raw' TeX font metric files, prefixed by a letter `r', in this case, rpcrro.tfm and rpsyr.tfm. The correspondence between these raw font metric files and the actual long PostScript font names, such as Courier-Oblique and Symbol, is made in the psfonts.map file, with lines like these:
    rpcrro Courier-Oblique
    rpsyr Symbol
    ...
    rptmro Times-Roman ".167 SlantFont"
    ...
    putb0 Utopia-Bold <putb0.pfb
    putbo0 Utopia-Bold ".167 SlantFont " <putb0.pfb

The first two simply identify the mapping between a file name and a font name. The third additionally specifies that the Times-Roman font is to be slanted to the right by one-sixth, to synthesize an oblique Times-Roman. The fourth tells the DVI driver that the font definition must be downloaded from the putb0.pfb Type 1 PostScript binary font file, and the fifth specifies both a slant and a source file to be downloaded.
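The three kinds of map line shown above can be taken apart with a few lines of Python. This sketch handles only the forms in the example (a name pair, an optional quoted instruction, an optional `<' download file); real psfonts.map files may use additional syntax, and the function name `parse_map_line` and the dictionary keys are hypothetical.

```python
import shlex


def parse_map_line(line):
    """Split one psfonts.map-style line into its parts."""
    tokens = shlex.split(line)  # keeps the quoted instruction as one token
    entry = {"tfm": tokens[0], "psname": tokens[1],
             "special": None, "download": None}
    for token in tokens[2:]:
        if token.startswith("<"):
            entry["download"] = token[1:]    # font file to be downloaded
        else:
            entry["special"] = token.strip() # e.g. a SlantFont instruction
    return entry


for line in ['rpcrro Courier-Oblique',
             'rptmro Times-Roman ".167 SlantFont"',
             'putbo0 Utopia-Bold ".167 SlantFont " <putb0.pfb']:
    print(parse_map_line(line))
```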
- /usr/local/lib/tex/inputs/doublecol.tex
  double-column plain TeX macro package
- /usr/local/lib/tex/inputs/psfonts.map
  DVI driver PostScript font mapping file
- /usr/local/lib/tex/inputs/tgrindex.awk
  indexing program
- /usr/local/lib/tex/inputs/tgrindex.tex
  indexing macro package
- /usr/local/lib/tex/inputs/tgrindmac.tex
  tgrind macro package
- /usr/local/bin/tfontedpr
  tgrind preprocessor program
- /usr/local/lib/tex/inputs/vgrindefs
  language descriptions
Extensions for PostScript fonts, procedure indexing, space after the -l option, the -o option, the ic, id, nc, and ni keywords, language support for Ada, awk, bash, BibTeX, ANSI/ISO Standard C, C++, CAML, Elisp, Fortran, ksh, LaTeX, Maple, Matlab, MLisp, Miranda, Objective C, PostScript, Russell, Sftran3, and tcsh, plus major revisions of documentation and source distribution, by
Nelson H. F. Beebe
Center for Scientific Computing
Department of Mathematics
University of Utah
Salt Lake City, UT 84112
USA
E-mail: <[email protected]>.
finger [email protected]