Pretty printing

SRC Modula-3 includes a pretty-printer for Modula-3 programs called m3pp. It will take the source code and make decisions on where to place line breaks, how far to indent lines, etc. to produce a supposedly more readable version of the program.

The syntax for calling m3pp is

    m3pp [options] [file]
The pretty-printer reads its input from the specified file. If no file is given, then the pretty-printer reads standard input. The formatted version is always sent to standard output.

The pretty-printer can produce text output suitable for replacing your code (the default) or PostScript output suitable for printing. Some additional options are available when producing PostScript output.

Options

The pretty-printer reads options from $HOME/.m3pp.pro.1, then .m3pp.pro.1 in the current directory, then from the command line. Single-letter abbreviations can be used, but no others are recognized.

-caps Recognize keywords in either lower or upper case and map them to upper case in the output. If this option is not specified, keywords must appear in upper case.

-margin n Set the margin to n. The margin is the maximum width of an output line. The pretty-printer occasionally exceeds this maximum, so it's usually wise to set it a little lower than you want. If this option is not specified, a margin of 75 is used.

-offset n Set the offset to n. The offset is the number of spaces by which a construct inside another one is indented. The default is 2 for text output, and 4 for PostScript output.

-xcolumn n Set the comment column to n. This is the column in which comments that are on the same line as code but are not followed on that line by code will start. To disable comment alignment, specify -xcolumn 1. The default is 33 for text output, 80 for PostScript output.

-src Use the SRC style of formatting (default).

-eric Use Eric's style. "END"s are emitted at the end of lines rather than on a line by themselves. This option cannot be abbreviated.

-callspace This inserts blank space around function calls and array subscripts. "f(x,y)" becomes "f (x, y)" and "a[i]" becomes "a [i]". This option cannot be abbreviated.

-noalign The pretty-printer attempts to align const, variable, and type declarations into columns. This option turns off the alignment. It does not disable the alignment of formal parameter lists when they don't fit on one line.

-whenbreak {always | early | late} When the pretty-printer has to break a line that's too long, you have two stylistic choices:

    1:  AlongFunctionCall(
             withLongParameters, likeThisOne, andThisOne);

    2:  AlongFunctionCall(withLongParameters, likeThisOne,
                          andThisOne);
-whenbreak always tells the pretty-printer to use style 1, always. Otherwise, it will consider BOTH styles, which takes somewhat more time. If style 1 produces fewer lines, it will use it. If they produce the same number of lines, then: -whenbreak early says to use style 1, and -whenbreak late says to use style 2. The arguments can be abbreviated as a, e, or l. The default is "-whenbreak early".

-break This is equivalent to "-whenbreak always".

-follow Normally, the formatter assumes that comments precede declarations or code, and formats them at the same indentation level as whatever follows. This style can also work fine with comments that follow a declaration. This option changes the comment indentation to more closely match the older version of the pretty-printer, assuming that comments follow declarations. With -follow, you get

    CONST
       maxElements = 3;
       (* limit for number of elements *)
    PROCEDURE AddOne(e: INTEGER);
instead of
    CONST
       maxElements = 3;
    (* Add an element. *)
    PROCEDURE AddOne(e: INTEGER);
In either case, a blank line will cause subsequent comments to be aligned with the statements or declarations that follow. You get
    CONST
       maxElements = 3;

    (* Add an element. *)
    PROCEDURE AddOne(e: INTEGER);
in either mode. The chief disadvantage of this mode is that comments in blocks of code are attached to the preceding statement and indented one level unless they are preceded by a blank line.

-ZZ Use emacs filter mode. The formatter loops, accepting and formatting declarations, BEGIN-END groups, or modules. Each chunk of code to be formatted should begin with a control-B (^B) and end with a control-A (^A).

-ZZG Use emacs filter mode on a single chunk (for a different Emacs package).

-text Produce text output. This is the default.

The following options are used when producing PostScript output. Except for -portrait, they cannot be abbreviated. For the font options, valid font names are PostScript names followed by an integer point size (e.g. "Helvetica9").

-ps Produce PostScript output suitable for printing.

-portrait Produce one-column portrait mode output. The default is two-column landscape mode output.

-bf font Set the font used for formatting the body text of the program. The default is Times-Roman10.

-kf font Set the font used for formatting keywords. The default is Helvetica7.

-bif font Set the font used for formatting built-in identifiers. The default is Times-Roman8.

-pf font Set the font used for formatting procedure names. The default is Times-Bold10.

-cf font Set the font used for formatting comments. The default is Times-Italic10.

-fcf font Set the font used for formatting comments that aren't refilled (see below). The default is Courier-Oblique9.

-ff font Set the font used for formatting text and char literals. The default is Courier9.

Features

The pretty-printer fills long comments; that is, it chooses line breaks inside long comments so as to fill up each line with words. This is usually convenient, but occasionally disastrous: a carefully formated table, equation, or program in the input will be reduced to a dense paragraph in the output. To spare your formatted comments from this fate, begin the comment with "(*|" or "(**", or begin the comment with "(*" on a line by itself. The old style of marking each line individually is also supported. In filled comments, lines that begin with a vertical bar ("|") in column 1 will not be filled.

Conversely, if you want a comment to be filled, begin the comment with "(*" and place some text on the same line as the comment start. Pragmas are never reformatted. Pragmas and comments that are not reformatted may still be moved left or right to an appropriate indentation level.

The pretty-printer tries hard not to add or remove blank lines from the program. It will add and remove line breaks, however.

The formatting rule used for CASE statements is geared toward programs in which there is no vertical bar after the last case, and no vertical bar before the ELSE clause. The same goes for TYPECASE and TRY statements.

In selecting elements of multi-dimensional arrays, the syntax a[i][j] means the same thing as a[i, j]. The second version will be handled better by the pretty-printer.

The compiler currently ignores the portion of the input file that follows the module-terminating dot. However, the pretty-printer requires that only properly-delimited comments appear in this portion of the file. Therefore, if your file has any comments there, be sure that they are enclosed within comment brackets.

Syntax errors

If the input contains a syntactically valid compilation unit (that is, a definition module, implementation module, or program module) then the output becomes the corresponding pretty-printed version. If the input contains a syntax error, then the output is pretty-printed up to the lexical token that caused the syntax error, then a line is printed consisting of the string "(* SYNTAX ERROR *)" followed by the offending token, an error message is printed to standard error output, and the remainder of the input is copied unchanged into the output. Regardless of whether a syntax error is found, m3pp exits with status code 0. (If the syntax error was caused by an unexpected end of file, then the string "(* SYNTAX ERROR *)" will appear at the end of the output file.)

Exceptions

String constants of more than 500 characters, or input or output lines of more than 500 characters, will overflow internal buffers. In case these or any other run-time errors cause the program to abort, the output file (if specified) will not be modified.

Bugs

The pretty-printer sometimes exceeds the margin limits due to details of the formatting algorithm. The text that exceeds the limit usually consists of closing delimiters, such as right parens and right brackets. Setting the margin a little lower than you want usually does the trick.

The -noalign option still uses the alignment machinery of the formatter, but tells it to reset its idea of the column boundaries after each declaration. The bad side effect of this is that a declaration with a large initializer can be formatted with the initializer snaking down the right side of the page.

If you have alignment on and a record has some fields with initializers, but the last field has no initializer and no trailing semicolon and is followed by a comment on the same line, then the other fields may be aligned improperly. Fix this by adding a semicolon to the last field declaration.

Authors

Bill Kalsow and Eric Muller (using the framework provided by ppmp). David Nichols (Xerox PARC) and Jim Meehan (Adobe) made a large number of improvements and bug fixes. David Nichols added PostScript support, which Bill Schilit (Xerox PARC and Columbia Univ.) ported into the current version.