The LaTeX2HTML Translator |
|
Drakos/Moore |
Alternate Font Encodings
LaTeX2HTML can interpret input using 8-bit fonts,
provided it is told which font-encoding is being used.
This can be done by appending an "extension" option
to the -html_version command-line switch; e.g.
latex2html -html_version 3.2,latin2 .... myfile.doc
declares that any 8-bit characters in the LaTeX source
within the file myfile.doc are to be interpreted
according to the ISO-8859-2 (ISO-Latin2) font encoding,
rather than the default of ISO-8859-1 (ISO-Latin1).
Furthermore, ISO-10646 (Unicode) entities can be embedded within
the output produced by LaTeX2HTML.
For this, a further "extension" option is appended; viz.
latex2html -html_version 3.2,latin2,unicode .... myfile.doc
declares that the input is ISO-Latin2, but that 8-bit characters
are to be output as the corresponding Unicode numeric entities.
For example, the Polish Ł would become &#321;.
Otherwise the browser might render the character as £, which is the character in the corresponding position in ISO-Latin1.
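The Ł example can be checked with a short Python sketch. This is only an illustration of the encoding arithmetic, not of how LaTeX2HTML itself is implemented; the helper name numeric_entity is hypothetical, and the codec names are Python's, not LaTeX2HTML's.

```python
# Byte 0xA3 is the position in question: Ł in ISO-8859-2, £ in ISO-8859-1.
raw = bytes([0xA3])

as_latin2 = raw.decode("iso8859_2")  # what the author of the source meant
as_latin1 = raw.decode("iso8859_1")  # what a Latin-1 reader would see


def numeric_entity(ch):
    """Return the decimal HTML character reference for one character.

    Hypothetical helper for this sketch only.
    """
    return "&#{};".format(ord(ch))


print(as_latin2, numeric_entity(as_latin2))
print(as_latin1, numeric_entity(as_latin1))
```

Because the numeric entity names the Unicode code point directly, it no longer depends on which 8-bit encoding the browser assumes.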
The input encodings that are recognised are listed in
the following table.
Table 1: Supported Font-encodings
extension | notes     | encoding
unicode   | (partial) | ISO-10646 (Unicode)
latin1    | (default) | ISO-8859-1 (ISO-Latin-1)
latin2    |           | ISO-8859-2 (ISO-Latin-2)
latin3    |           | ISO-8859-3 (ISO-Latin-3)
latin4    |           | ISO-8859-4 (ISO-Latin-4)
latin5    |           | ISO-8859-9 (ISO-Latin-5)
latin6    |           | ISO-8859-10 (ISO-Latin-6)
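Note that the Latin number and the ISO-8859 part number diverge from latin5 onwards. As a rough cross-check, the sketch below pairs each extension name from Table 1 with the Python codec for the same standard; the dictionary and the function codec_for are hypothetical names for this illustration and are not part of LaTeX2HTML.

```python
import codecs

# Extension names from Table 1 mapped to the Python codec for the same
# ISO standard. The right-hand names are Python's, assumed for this sketch.
EXTENSION_TO_CODEC = {
    "latin1": "iso8859_1",
    "latin2": "iso8859_2",
    "latin3": "iso8859_3",
    "latin4": "iso8859_4",
    "latin5": "iso8859_9",   # Latin-5 is part 9, not part 5
    "latin6": "iso8859_10",  # Latin-6 is part 10
}


def codec_for(extension):
    """Return the canonical Python codec name for a Table 1 extension."""
    return codecs.lookup(EXTENSION_TO_CODEC[extension]).name


for ext in EXTENSION_TO_CODEC:
    print(ext, "->", codec_for(ext))
```

Python's own latinN aliases resolve to the same codecs, which gives an independent confirmation of the numbering in the table.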
If multiple extension options are requested, then later ones
override earlier ones.
Only in rare circumstances should it be necessary to do this;
for example, when the later encoding does not define characters
in certain positions but an earlier encoding does, and these
characters occur within the source.
In this case the unicode extension ought to be loaded as well,
otherwise browsers may get quite confused about what to render.
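The override rule can be modelled as a layered lookup table. The following is a minimal sketch under the assumption that "override" applies position by position, so that a later encoding replaces an earlier one only where it actually defines a character; it uses Python's codecs as a stand-in and does not reproduce LaTeX2HTML's own tables. The function name layered_table is hypothetical.

```python
def layered_table(*encodings):
    """Build a byte -> character map from encodings applied in order.

    Later encodings override earlier ones, but only at positions they
    define; undefined positions keep the earlier assignment.
    """
    table = {}
    for enc in encodings:
        for byte in range(0x80, 0x100):
            try:
                table[byte] = bytes([byte]).decode(enc)
            except UnicodeDecodeError:
                # This encoding leaves the position undefined.
                continue
    return table


# ISO-8859-3 leaves position 0xA5 undefined, while ISO-8859-2 defines it.
table = layered_table("iso8859_2", "iso8859_3")
for byte in (0xA1, 0xA5):
    ch = table[byte]
    print(hex(byte), ch, "&#{};".format(ord(ch)))
```

Here position 0xA1 takes its character from the later encoding, while 0xA5 falls back to the earlier one. A page mixing characters this way matches no single 8-bit charset, which is why emitting numeric Unicode entities is the safer choice in this situation.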