If you click on [T], you can typeset the HTML document that you have opened or created, converting it into high-quality Portable Document Format with the help of the TeX engine, software developed by Professor Donald Ervin Knuth and his students at Stanford University around 1980. The kernel of TeX version 3.141592x is frozen and considered to be error-free, in contrast to most other software of that size, but the system is still flexible enough to be extended.
1996 and 1998 support for right-to-left typesetting was implemented
by Peter Breitenlohner, and Hàn Thế Thành integrated a post-processor which
creates PDF output and embeds PostScript and TrueType fonts as well as JPG and
PNG images. The additional code of both extensions is contained in the
Markup Shredder – which can use this wonderful piece of well-tested open source software – only adds 300 kilobytes of TeX macros and twice that amount of shell, batch and PHP scripts in order to realize an HTML parser and a font metric processor together with three simple user interfaces.
Traditionally, writers had to learn hundreds of proprietary commands to run the TeX typesetting engine. Markup Shredder aims to make TeX easy: basic HTML and CSS knowledge should now suffice.
Like HTML-Tidy, which is described in the analyse section, GMS performs a syntax check of the input
document’s markup and creates a log file that is given the same base
name with a
LOG extension. Each tag that has been processed is
listed there, and error messages are inserted wherever the GMS macro layer or
the TeX engine itself detects an irregularity.
HTML-Tidy and GMS do not always agree in their syntax check
results. For instance, Markup Shredder might mention missing end
tags or missing quotation marks around attribute values where
HTML-Tidy gives a strict HTML rating without warnings, because GMS tries
to help authors to write valid XHTML documents. Another reason is that
Markup Shredder does not test whether lower case is used for all
element and attribute names in documents that claim to
be of type XHTML. GMS also tolerates characters
Ÿ, which are not allowed in the specifications, though
important things like dashes, Euro sign, French œ ligatures
and German „gaensefuesschens“ could be placed there by
It is instructive to compare both test results with the HTML and XHTML specifications.
At the end of the log, you will find a link to the PDF output file. If GMS runs on a remote network computer, you can download and save the file by clicking on the link with the right mouse button and selecting "Save target as" in the context menu.
You can typeset a given HTML file by pressing [T]. The result of the syntax check will be displayed in the text viewer; press [Q] (Linux) or [Esc] (DOS, Windows) to quit.
You can select another TeX binary (
pdfetex), if you press
[S] and [P]. The TeX binary
and its associated message
pool file should be found in the
search path or in
[GMS_BINARIES], a sub-directory of
Windows) to typeset a markup document. If the file
was opened or created before, it is sufficient to call
Alternatively, you can execute the command
/default/default.htm, if the
[TEXINPUTS] variable is set to
Thanks to the operating system’s disk buffer, the second typesetting run of an HTML input will be a bit faster than the first. In the text mode interface, the process may be accelerated on Windows NT/XP if the window is hidden. GMS will show in the task bar whether it is still busy or how long the run has taken. In the web browser interface, as well as in the other interfaces on Windows 9x, you do not get any feedback about the typesetting progress until it is finished, though it might take some time.
Sometimes you may discover an error in your input file while TeX is still working. In the text mode interface or on the command line, you can cancel the process by pressing [Ctrl+C]. If you are asked what to do next, answer [X][Enter] to exit at this point, or press [Ctrl+C] again to cancel output production. On DOS and Windows, you are also asked: Cancel batch process (Y/N)? Answer [N][Enter] to return to the GMS menu.
If TeX should break with an exhausted memory message on a large
document, change the value that is assigned to the
texmf.cnf and initialize the TeX format
The pdfTeX engine can embed images, but only those of type JPG (for photographs), PNG (for graphics with a reduced number of colors), or PDF (e.g. single pages created with pdfTeX and LaTeX). The popular GIF image file format is not supported, but you do not have to modify your HTML document: Just provide a JPG or PNG file with the same base name for every inserted GIF image and place it into the same directory.
So you can have low-resolution GIF images which are displayed by browsers, while GMS, when processing GIF requests, will look for matching JPG or PNG files that may have a higher resolution. If such an image file cannot be found, however, the replacement function may return a different file with the same name from another directory within the document search path.
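The same-directory lookup described above can be sketched roughly as follows. This is only an illustration in Python, not the GMS implementation (GMS itself is built from TeX macros and scripts). The function name `find_gif_replacement` is hypothetical, and trying JPG before PNG is an assumption, since the text does not say which format takes priority. The fallback to other directories in the document search path is not modelled.

```python
from pathlib import Path
from typing import Optional


def find_gif_replacement(gif_path: str) -> Optional[Path]:
    """Return a JPG or PNG file with the same base name, or None.

    Hypothetical helper: only the directory of the GIF itself is checked.
    """
    gif = Path(gif_path)
    if gif.suffix.lower() != ".gif":
        return None  # not a GIF request, nothing to replace
    # Assumption: JPG is tried before PNG; the manual does not state an order.
    for extension in (".jpg", ".png"):
        candidate = gif.with_suffix(extension)
        if candidate.is_file():
            return candidate
    return None
```

For a request such as `images/logo.gif`, the sketch would return `images/logo.jpg` if that file exists, otherwise `images/logo.png`, otherwise `None`.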
You may change your document to use only PNG images, which avoids shipping them in two different formats, but Internet Explorer 3x and early Netscape Navigator 4x do not render PNG images. Using a JPG replacement for a GIF image will usually lead to a larger file size or a loss of quality. For ease of page design, GMS treats 1in = 25.4mm = 72pt = 72px, that is, 1px = 1pt, though this is not recommended by the CSS2 specification.
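The unit convention stated above (1in = 25.4mm = 72pt = 72px) can be written out as a small conversion table. The following Python sketch only illustrates the arithmetic; the names `PT_PER_UNIT` and `to_points` are made up for this example and do not come from GMS.

```python
# Points per unit under the convention 1in = 25.4mm = 72pt = 72px.
PT_PER_UNIT = {
    "pt": 1.0,
    "px": 1.0,          # 1px = 1pt in this convention
    "in": 72.0,
    "mm": 72.0 / 25.4,
}


def to_points(value: float, unit: str) -> float:
    """Convert a length to points (illustrative helper, not part of GMS)."""
    try:
        return value * PT_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unsupported unit: {unit!r}") from None
```

Under this convention an image that is 360px wide occupies 360pt, or 5in, on the printed page.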
Here’s the main trick to fine-tune an HTML document for print via GMS and Acrobat Reader without changing its appearance in a browser on screen:
print.css and link them to the document’s
class attribute to its definition tag,
"td1">, and an entry like
print.css (then you can still rely on your browser’s auto-width function for screen rendering), or add
print.css. Now, if you start your document body with
"noscreen">evening</span>, the rendering reads: Good evening! – The
media attribute applies to the
<style> element as well, so you do not need external
media value and thus forcing authors to optimize their pages for Microsoft products. So it may be necessary to load an empty dummy file as last style sheet.
You can select the language of your document and
the corresponding hyphenation rules by saying
"en-UK">, for example. The
codes for the representation of names of languages are defined in ISO 639.2. If you
discover wrong or missing hyphenation in the output PDF produced by GMS, use
soft hyphens, like
ap&shy;pen&shy;dix. Old browsers like Internet Explorer 3x
and Netscape Navigator 4x, however, display
&shy; as a dash that is
always visible.
Between 1996 and 2006, reformed spelling and hyphenation rules for German
were established. In GMS, you can enable reformed German hyphenation
rules by saying
"de-rf"> in your document. For traditional texts, the
declaration, however, is insufficient because there are extra
rules to modify the spelling of some words when they are split between
lines. Thus you have to write
Be<span>tt</span>uch to tell GMS to take a
sonderweg leading to Bäcker/Bäk-ker and
In any language, if a word appears to be split in a way that you do not
like, you can restrict hyphenation to the desired places by inserting soft
­. A word will not be split if it is
enclosed by an inline level element like
unless a trailing space or punctuation mark is enclosed too.
Since version 0.06a, Gerolf Markup Shredder supports
genealogical data markup according to the Gedcom XML 6.0 Specification. While you can open the example file
gedcom60.xml directly with the text mode interface, the web browser
interface will refuse to do so. Therefore a copy of this file named
gedcom60.htm is still required.
For languages other than English, you have to modify the generated content
which is defined at the end of
"}’ for German. Internet Explorer
does not generate this content, so use Mozilla Firefox or Opera for
gedcom60.pdf output file produced by GMS includes
images as generated content for
if JPG or PNG files can be found locally.