3 Target formats of typeset

Typeset supports various ``target formats'', i.e., backends. Each of these formats has its own special properties, is supported to a different extent and is partially customizable. This chapter discusses the details.

3.1 PostScript

The PostScript output is generated by a special PostScript formatter. Typeset generates only the input for this formatter and invokes it for the PostScript output.

In principle the backend formatter can be chosen, but currently only Lout and LaTeX are supported. Future versions will at least support LaTeX + dvips.

Why is Lout the default? Simply because it is the one simple enough that I learned a bit of how to program it. It also supports most of the features needed to produce neat documents. Unfortunately it has some drawbacks, which are discussed in the Lout section (3.3). I must also admit that LaTeX output looks a little better, but that is a matter of taste. See the LaTeX section (3.4) for how to use its output.

3.2 ASCII

The same applies to the ASCII target as to PostScript. The default formatter is also Lout. Maybe some day there will be support for roff. See the Lout section (3.3) for discussion.

3.3 Lout

Lout is the default PostScript backend of typeset.

Although it is not listed among the standard values of the -O option, lout is also a valid parameter to it. It is not listed there because it is not considered to be a final target.

The most important features of Lout are its easy-to-learn programming language, its capability to denote equations, pictures, graphs and even ``special effects'' (like rotated text) without much hassle, and the fact that it is relatively small on disk.

All the document types which come with typeset have a representation in the Lout target. For the brief document type it is the only target defined. See (3.3.3) about the manpage document type.

3.3.1 The face attribute

The face attribute is a string of tokens delimited by spaces.

The following values of the face attribute are recognized:

1S 2S
These switch between single-sided and double-sided printing.
1C 2C
These switch between one-column and two-column layout for the main text. Some text chunks are set in one column in any case (e.g., the table of contents, abstract, intro, preface); index text is set in two columns in any case.
nidx
This suppresses the generation of an index even though <index> tags are used in the document.

By default simple documents use face=``1S 1C nidx'', reports use face=``1S 2C'' and books use face=``2S 1C''.
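
How the tokens combine can be sketched in shell. This is only an illustration and not typeset's own parser; the function name describe_face and its starting values (single-sided, one column, index enabled) are assumptions made for the example.

```shell
# Illustration only: interpret a face string token by token.
# describe_face and its starting values are hypothetical, not part of typeset.
describe_face() {
    sides=1
    columns=1
    index=yes
    # The face attribute is a space-delimited list, so word splitting is intended.
    for token in $1; do
        case $token in
            1S) sides=1 ;;
            2S) sides=2 ;;
            1C) columns=1 ;;
            2C) columns=2 ;;
            nidx) index=no ;;
        esac
    done
    echo "sides=$sides columns=$columns index=$index"
}
```

Under these assumptions, describe_face "2S 1C" prints sides=2 columns=1 index=yes.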

3.3.2 Notations with Lout

For all notations suitable to be used in conjunction with the Lout target there must be a way to convert the code either into Lout code or into Encapsulated PostScript.

Typeset as it comes has these conversions defined for a couple of notations (e.g., the notation fig for figures drawn with xfig).

For notations which are converted into Encapsulated PostScript, typeset can't really determine when the .eps file is no longer needed. Therefore typeset does not delete it.

It is always a good idea to move to an empty directory before invoking typeset and to supply the full path name of the source. This way you will end up with your document and can easily see which files are no longer needed.
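
The advice above can be wrapped in a small shell helper. This is a sketch, not something typeset provides: in_scratch_dir is a hypothetical name, and it assumes that mktemp -d is available on your system.

```shell
# Run a command inside a fresh, empty directory and list what it left behind.
# in_scratch_dir is a hypothetical helper, not part of typeset.
in_scratch_dir() {
    dir=$(mktemp -d) || return 1
    # The subshell keeps the caller's working directory unchanged.
    (cd "$dir" && "$@") || return 1
    echo "files left in $dir:"
    ls "$dir"
}
```

It could be called as in_scratch_dir typeset -O html -o doc /home/me/text/doc.sgml, where the source path is only an example. Note that the source is given with its full path because the command runs in a different directory.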

3.3.3 Problems

The old Lout also had its drawbacks. Unfortunately typeset keeps revealing bugs and misfeatures in it. These keep me, for instance, from a better table implementation.

The old versions of Lout were unable to break indented chunks of text over columns and pages. This is solved in the new release, but typeset revealed a bug in that code, so there is still a chance of being bitten by it. If the bug appears, long chunks will not be broken. Moreover, the spacing between paragraphs behaves very oddly in this case. It is likely to happen only when you have nested <desc> elements and more than one paragraph in the inner description list. If this happens to you, please send me a test file.

And finally there is the problem that Lout may sometimes fail to resolve cross references on heavily loaded machines or if disk space is ``tight'' (where tight could mean a couple of megabytes). This should be circumvented now, because most cross referencing is done without Lout (only page references are left to Lout). But I can't test it.

3.3.4 Customization

Lout has its own way of extensive customization. This includes all sorts of things like hyphenation exceptions, definitions of new symbols, page backgrounds and so on. This is not the place to discuss them. See the documentation of Lout for details.

Typeset supports the most commonly needed customizations. To change these options, edit the file layout.scm in the include directory of the installation. The following options are customizable:

page size
Only the named page types of Lout are supported. These are: Letter, Tabloid, Ledger, Legal, Statement, Executive, A3, A4, A5, B4, B5, Folio, Quarto and 10x14. Change the definition of lout-pagetype to the size you want.
page margins
The left and right margins of the document are set by typeset according to whether single-sided or double-sided printing is requested.

For single-sided printing both margins are set to the same value, lout-both-margin. For double-sided printing the left margin of odd pages and the right margin of even pages are set to lout-inner-margin, and the other margins are set to lout-outer-margin.

initial break
The initial break used by typeset is also open to customization, but it is very unlikely that anyone needs to change it.

Future versions will open the formatting of paragraphs to customization.

3.4 LaTeX

You can produce LaTeX files from typeset. To produce a DVI file you need to run LaTeX on the result by hand. For PostScript output through LaTeX you also need to run dvips.

Currently only the DTDs book, report and document are supported for LaTeX.

The file is usually ready to be fed to LaTeX. There is no need for bibtex or makeindex. You may have to adjust the substyles used. See the customization section (3.4.4) for how to avoid this.
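
The manual steps can be collected in a shell function. This is a minimal sketch: run and tex_to_ps are hypothetical helper names, the LaTeX file is assumed to sit in the current directory, and latex and dvips are assumed to be on your PATH.

```shell
# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Turn a LaTeX file produced by typeset into DVI and then PostScript.
# tex_to_ps is a hypothetical helper; it expects the .tex file in the
# current directory.
tex_to_ps() {
    base=${1%.tex}
    # latex writes $base.dvi into the current directory ...
    run latex "$base.tex" || return 1
    # ... and dvips converts that DVI file to PostScript.
    run dvips -o "$base.ps" "$base.dvi"
}
```

Setting DRY_RUN=1 before calling tex_to_ps doc.tex only prints the two commands, which is a cheap way to check them before running LaTeX for real.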

3.4.1 The face attribute

The same values as for Lout are supported; see (3.3.1) for a description.

3.4.2 Notations with LaTeX

The following notations are supported with LaTeX:

eps
Encapsulated PostScript. Note: a temporary .eps file is copied into the current directory and left there.
fig
xfig drawn pictures
latex
code passed unconditionally to the backend
lout
plain lout code
lfig
lout code using the @Fig package of Lout
roff
roff code which will be converted into eps by groff.
tgif
tgif drawn pictures

As with some other target formats, it is a good idea to format the document in an empty directory, so that you can easily delete the files left over.

3.4.3 Problems

LaTeX is huge and most installations differ. The package aims to run on most plain installations. Only the inclusion of EPS figures depends on either the substyle epsfig or the graphics package.

Which one is used has to be defined in the installation process.

Also to be defined in the installation process is whether LaTeX version 2.09 or LaTeX2e code is generated.

Typeset aims to be a ``don't worry application'' for the author. Regarding LaTeX this implies that some characters are set in a safe way regardless of whether this is necessary in a particular case. E.g., because the characters < and > yield strange results in some environments, they are set using {\tt <} and {\tt >} respectively, except within <math> elements.

LaTeX also offers an extensive set of symbols for writing math formulas. Unfortunately it is quite hard to generate them properly from what typeset already offers. So it is better to use sdc's way only for simple formulas and <inline latex> for complicated ones, which are then written in native LaTeX notation.

I've also got a report about a version which discards the leading (or all?) blanks in <verb> elements.

3.4.4 Customization

To customize the LaTeX output you need to change the file include/layout.scm. The following variables are available:

latex-latex-type
Switch between LaTeX 2.09 or LaTeX2e style.
latex-styleoptions
This is a list of strings. By default only ``epsfig'' is included. For LaTeX 2.09 this seems to be a good choice; for LaTeX2e it's probably the empty list.
latex-packages
A list of packages to be included by \usepackage into the LaTeX source. By default only [dvips]{graphics} is in this list.
latex-preamble
This is a list of strings which are put into the preamble of the document just before the \begin{document}. By default latex-a4-preamble is included into this list.

For German documents the substyle german will be included in the list of substyles. For further customization (e.g., other languages) you'll need to go into the code of target/latex/preparse.scm. If you do, please drop me a note so I can include the support.

3.5 HTML

3.5.1 The face attribute

Only one token is used from the face attribute: nidx. If it is present the generation of an index is suppressed.

3.5.2 Splitting

Except for the book document type, one document becomes one HTML file (defined by the -o option to typeset). For the document type book, the document is split at the chapter boundaries into single files. For the names of these files, the base name of the name given to the -o option is extended by -number. E.g., given a command line of:

typeset -O html -o doc doc.sgml

and assuming that the file doc.sgml contains three chapters, you will end up with (at least) 6 files named doc.html, doc-1.html, doc-2.html ... doc-5.html. The top-level material goes into doc.html, each chapter gets a file of its own, and the index and the references each get their own file as well.
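
The naming scheme can be sketched as a shell function. This is an illustration only: html_split_names is a hypothetical helper, and it assumes that the subfiles are numbered consecutively starting at 1.

```shell
# Print the HTML file names expected for a split book.
# html_split_names is a hypothetical helper; $1 is the name given to -o,
# $2 the number of numbered subfiles.
html_split_names() {
    base=$1
    count=$2
    # The top-level file carries the plain base name.
    echo "$base.html"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "$base-$i.html"
        i=$((i + 1))
    done
}
```

For instance, html_split_names doc 2 prints doc.html, doc-1.html and doc-2.html, one name per line.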

If entities of other notations are used (e.g., figures) you will get some more files following the same naming scheme (with different extensions -- most likely ``.gif''). Refer to (3.5.3).

3.5.3 Notations with HTML

For the HTML target, notations are best converted into GIF files. To do this conversion typeset will by default invoke ghostscript and the pbm-tools for Encapsulated PostScript. The files are named after one of two schemes. For external entities which already have a filename, the basename of it is used (with extension ``.gif''). If the notation is used inline, a new ``subfile'' is created with the basename of the target file (given to the -o option) appended with a dash and a running number. The extension is also ``.gif''. See the splitting section (3.5.2) for details.
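
The two naming schemes can be written down as shell functions. Both function names are hypothetical and only illustrate the rules described above; they do not call typeset, ghostscript or the pbm-tools.

```shell
# Name of the GIF for an external entity: its basename with extension ".gif".
# gif_name_for_entity is a hypothetical helper.
gif_name_for_entity() {
    name=$(basename "$1")
    # Strip the last extension, if any, before appending ".gif".
    echo "${name%.*}.gif"
}

# Name of the GIF for an inline notation: the basename given to -o,
# a dash and a running number. gif_name_for_inline is hypothetical too.
gif_name_for_inline() {
    echo "$1-$2.gif"
}
```

With these definitions, gif_name_for_entity /home/me/pics/layout.fig prints layout.gif (the path is just an example), and gif_name_for_inline doc 3 prints doc-3.gif.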

3.5.4 Problem

Because of the possible splitting, the file names of the file(s) produced are compiled into the output. Therefore you can't rename them after compilation anymore. You need to be aware of this and give a file name without any leading directory component to the -o option.
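
A wrapper script can guard against this mistake before typeset is called. The following is a sketch; check_output_name is a hypothetical helper and not part of typeset.

```shell
# Succeed only if the name meant for -o has no directory component.
# check_output_name is a hypothetical helper, not part of typeset.
check_output_name() {
    case $1 in
        */*)
            echo "output name '$1' contains a directory; cd there instead" >&2
            return 1
            ;;
    esac
    return 0
}
```

A typical use would be to change into the directory that will hold the HTML files first and then run check_output_name doc && typeset -O html -o doc /home/me/text/doc.sgml, where the source path is only an example.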

3.5.5 Customization

There is not much to customize for the HTML target, and only for manpages: the file manpage.scm in the include directory has to be changed. The procedure html-make-man-ref receives a string (the id given to a <ref t=m id=string// tag) and has to return a list of strings. The concatenation of these strings must form valid HTML code (either a URL to some server providing man pages or text to be included in the document).

3.6 Info

Only the document types document, report and book have a representation in the info format. Due to the limitation to plain text, some tags, such as those for character formatting, are ignored.

The face attribute is completely ignored.

For each division of text (i.e., the whole document/report/book, chapters, sections and subsections) a node is generated with a menu containing each division of the next level.

Unlike in the preceding version, the Info output is no longer spread over different files. Today's computers are usually powerful enough not to need this, and it would blow up the code too much.

Only notations which have a plain text representation can be used with the info target.

3.7 man

For the manpage target only the document type manpage is really supported. The intention of this target is not to print everything from the man command but to produce pages suitable to be stored in the online manual of Unix systems.

To produce a printed version of manpages you should use the manpage produced by typeset and feed it into the nroff command of your system.

Only the notation roff is supported with this target.

Customization

The file manpage.scm in the include directory holds a translation table from the symbolic short names of the sections to their numbers and long names. Some systems use a different order (numbering). Adapt the table to reflect your system.

3.8 Literate Programming

For literate programming the names of the files written are part of the program and hence determined within the document. Only if no file name is given for a part of a literate program is that part written to the standard output.

There is nothing to be customized for this target.

3.9 RTF

The RTF target is the least supported one. Only the document types document and report are supported so far, and even these not to the full extent.

The RTF target is intended to open a way to use parts of a document with the widespread MS tools. Unfortunately it's pretty hard to support this format. Even page size and font information is stored in the file, and to make it worse the ``reference application'' (guess which) treats it differently from the definition published by Microsoft.

But given the restrictions word processors carry, RTF formatted text is not expected to look as professional as the other formats anyway.

3.10 Slide

The slide target works similarly to the literate target. Only the parts of the source enclosed by the <slide> tag are extracted. These are formatted using the lout backend with its support for overhead transparencies.