Saturday, July 18, 2020

Typesetting Beautiful Mathematics in LaTeX, Lyx and MathJax

In this session we introduce LaTeX, which is used pervasively in science, mathematics and computing both to typeset documents and to render equations for use on the web.


The video for this talk is on [youtube].

The accompanying slides are online [here].


Why LaTeX?

It is worth setting out the value of learning, or at least knowing about, LaTeX. LaTeX is pervasive in the mathematics, science and computing community, where it is used to typeset documents such as papers, journal articles and books, as well as for rendering equations on the web.

Although LaTeX itself is quite old, it is still widely used, and a working knowledge of its syntax is assumed by many other platforms and tools.

This is aside from the fact that LaTeX renders mathematics to a very high standard, and is the tool used almost exclusively for publication-quality rendering of mathematics.


The following are sample pages from Ian Goodfellow's seminal 2014 paper on GANs showing extensive typesetting of mathematics that would challenge ordinary word processors.


Typesetting Equations

The following shows the same equation rendered during the PDF export from Google Docs, MS Word and LaTeX.


The top right shows Google Docs' rendering of the equation, and it is clearly the worst, with elements that are not just ugly but illegible. The top left from MS Word is acceptable. The bottom rendering by LaTeX is the most sophisticated, not just clear but also well balanced in terms of spacing and stroke weights.


This set of images shows a deliberately challenging equation. The LaTeX rendering at the bottom succeeds in laying out glyphs for a multi-level expression. Furthermore, it shows that LaTeX can draw on a much wider range of symbols, some of which are difficult or impossible to produce with the other systems, in this case the "maps to" right arrow.
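To give a feel for the kind of source involved, the following is a small sketch of a multi-level expression using the "maps to" arrow. It is an illustrative formula chosen for this note, not the equation shown in the images above.

```latex
$$
f \colon x \mapsto \frac{1}{1 + \frac{1}{1 + x^{2}}}
$$
```

The nested `\frac` commands produce the multiple levels, and `\mapsto` produces the arrow.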


History and Concepts

It is very helpful to understand how LaTeX works - and doesn't work. This avoids the common confusion that arises when it is assumed to work like an ordinary word processor.

Donald Knuth is one of the leading computer scientists and mathematicians of the last 100 years. He is mostly known for his encyclopaedic and authoritative The Art of Computer Programming series.


During the 1970s many organisations automated and mechanised, and his publisher replaced the extremely skilled human craftspeople who manually typeset sophisticated mathematics for publication. Dissatisfied with the quality of the new system, he embarked on a journey talking to typesetters and learning as much as he could about their art. In 1978 he released TeX, his software for typesetting text and mathematics, which encapsulated as much of the expertise of typesetters as he could.


Today we consider TeX to be fairly low-level, and it is not often used directly. Instead, we use other software which provides higher-level abstractions, such as title, chapter, paragraph heading, and so on. LaTeX is the most popular such software, and is, in effect, a set of macros around TeX.

TeX and LaTeX are open source, and part of a very rich and active ecosystem, with a vibrant global community.

An important feature of this ecosystem is its packages, which provide macros for typesetting things much more easily than with plain TeX. Packages exist for a wide range of tasks, as small as fractions and tables, and as big as entire document classes. Document classes define a set of standards for particular types of documents, such as journal articles and academic books. Some publishers, such as the IEEE, provide a document class which defines precisely the design for their own journal articles.
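As a rough sketch of how classes and packages appear in practice, the preamble below selects the standard article class and loads the amsmath package. The commented-out line shows how the class might be swapped for IEEEtran, the class distributed for IEEE journals. The body text is a placeholder.

```latex
% A general-purpose article, with amsmath loaded for extra maths macros
\documentclass{article}
\usepackage{amsmath}

% To target an IEEE journal instead, the class line might be swapped for:
% \documentclass[journal]{IEEEtran}

\begin{document}
The document body stays the same whichever class is chosen.
\end{document}
```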

Although we don't typically use plain TeX directly, the accompanying video provides a demonstration, to show the nature of the code, and the typographic refinement of the output. The following shows well-judged spacing between the letters F in 'efficiency' and the ligature that replaces FI in 'fiasco'.
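For readers who want to try this themselves, here is a minimal plain TeX sketch. It is not the source used in the video, and the sample sentence is made up to include words that trigger ligatures.

```tex
% Plain TeX source; compile with the tex or pdftex command
The efficiency of the office was no fiasco.

$$ e^x = \lim_{n \to \infty} \left( 1 + {x \over n} \right)^n $$

\bye
```

Note the plain TeX idioms `{x \over n}` for a fraction and `\bye` to end the document, which LaTeX replaces with `\frac{x}{n}` and `\end{document}`.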


The video also demonstrates a minimal LaTeX document (source). The similarities to HTML and other markup languages are significant. We can see how the document class is defined, how additional packages are imported, and how elements like the title and author are tagged.


The output PDF shows a pleasantly typeset page with headings and some mathematics aligned according to the equals sign.
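A minimal document along these lines might look like the following. This is a sketch rather than the exact source from the video. The title, author and equation are placeholders.

```latex
\documentclass{article}   % the document class sets the overall design
\usepackage{amsmath}      % an additional package, here for the align* environment

\title{A Minimal Document}
\author{A. N. Author}

\begin{document}
\maketitle

\section{A Heading}
Some body text, followed by mathematics aligned at the equals signs.
\begin{align*}
(x + 1)^2 &= (x + 1)(x + 1) \\
          &= x^2 + 2x + 1
\end{align*}

\end{document}
```

The `&` characters mark the alignment points, which is how the equals signs end up stacked vertically.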


Most practitioners write LaTeX directly, treating it as just another markup language like HTML.

However, not everyone enjoys writing code to write their documents. For them, tools exist which hide the underlying LaTeX and TeX. A leading example is Lyx.


Lyx

Lyx is an open source tool, available for Linux, macOS and Windows. It looks like a normal word processor, and is intended to be used without ever editing LaTeX source code.

The following shows Lyx being used to create a document, and we can see the main body text, section and chapter headers, and an included graphic, as well as a display equation.


The video walks through the steps of creating a new blank document, entering text, and creating both inline and display (paragraph) equations. We also look at a more complex document, a book, which includes child documents for each chapter, and makes use of automated references to chapters and figures.
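Behind the scenes these features correspond to ordinary LaTeX constructs. The following sketch shows roughly what they look like when written by hand. It is not what Lyx itself exports, and the chapter titles, label name and file name are hypothetical.

```latex
\documentclass{book}

\begin{document}

\chapter{Introduction}
\label{ch:intro}
An inline equation such as $y = ax^2 + bx + c$ sits within the text,
while a display equation is set on its own line:
\[ e^{i\pi} + 1 = 0 \]

\chapter{Methods}
As discussed in Chapter~\ref{ch:intro}, references are numbered automatically.

% In a larger book each chapter would usually live in its own file,
% pulled in with e.g. \include{intro} (hypothetical file name).

\end{document}
```

LaTeX typically needs to be run twice for the `\ref` to resolve to the correct chapter number.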

The video also demonstrates an important principle of TeX, LaTeX and Lyx - separation of content from layout. The editors encourage us to write the content, and not manually fiddle with layout and style. The idea is that the TeX engine works out the correct page layout, using the content we provide. In Lyx, this is enforced by removing repeated spaces and line breaks as soon as we type them. Lyx is not a WYSIWYG editor, by design. It is possible to override the standards set by Lyx and LaTeX's document classes, but this is not encouraged.
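The same principle can be seen in LaTeX source itself. In the following minimal sketch, with made-up sample text, runs of spaces and single line breaks have no effect on the output, and only a blank line starts a new paragraph.

```latex
\documentclass{article}
\begin{document}

These     words    are    separated   by   many   spaces,
and this line break is treated as a single space.

A blank line, however many are typed,



starts a new paragraph.

\end{document}
```

The output contains two normally spaced paragraphs, with the layout decided by TeX rather than by the spacing in the source.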

The following shows sample pages from a work-in-progress book, developed with Lyx. The document is a book class, and the defaults are mostly retained. The chapter heading typeface has been changed using Lyx's options.


The quality of layout and typography, thanks to Donald Knuth's TeX, gives a distinct professional impression.


Ease Of Use vs Capability

Word processors like Google Docs and MS Word are easy to use. So we have to be clear when we should invest in learning to use LaTeX or Lyx.

The following is a chart often seen in discussions about LaTeX.


The chart shows that for low-complexity documents, word processors like Google Docs and MS Word are adequate and easy to use. As the complexity increases, both in terms of size and richness of content, especially mathematical content, those word processors reach their limits pretty soon, and just before they do, they become more difficult to use.

LaTeX, on the other hand, has a higher initial barrier because even trivial documents are not trivial to create. However, as complexity rises, LaTeX shows the capabilities it was designed for - large and rich documents.

Lyx sits in the middle ground. It is easy to use for simple documents, and its user interface covers the majority of uses that LaTeX is otherwise put to. That is, many kinds of documents can be created with Lyx, including journal articles, books, letters and reports, without ever needing to write code. However, because it is a simplification, there will be things it can't do. Even then, Lyx lets us embed selected LaTeX within a Lyx document when only minor amounts of code are required.


Maths On The Web

A second, and growing, use case for rendering mathematics is on the web. We are increasingly writing mathematical content in blogs and online articles.

Modern web browsers implement many web technologies, from WebGL to crypto, from WebRTC to WebXR. One of these is MathML for rendering mathematics. Sadly this standard is not very human friendly, and is considered something other software should generate.


For this reason, libraries such as MathJax have emerged. They convert LaTeX-style expressions into MathML or, where that isn't available, other rendering mechanisms like SVG or HTML with CSS.

This is just one example of a tool which assumes a basic knowledge of LaTeX, emphasising the benefits of becoming familiar with it.

MathJax can be used with many web platforms and custom developments too. The video demonstrates how MathJax is integrated into Blogger. The tutorial shows how simple inline equations and display equations can be created, along with a sampling of more complex structures like arrays.
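As an idea of what an array looks like in source form, here is a small sketch of a two-by-two matrix. It is an illustrative example, not necessarily the one shown in the video.

```latex
$$
A = \left( \begin{array}{cc}
a & b \\
c & d
\end{array} \right)
$$
```

The `{cc}` argument declares two centred columns, `&` separates the columns, and `\\` ends a row.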

This blog has MathJax included by adding the JavaScript library to the template. The following is an inline equation, created by typing LaTeX between dollar signs, $y=ax^2 + bx + c$.

The following is an example of display mathematics, which uses double dollar signs to indicate a display formula.

$$ y = ax^2 + bx + c $$

The following shows a similar example which uses LaTeX-style code, here MathJax's \bbox command, to create a coloured bounding box. This is particularly effective on the web, more so than in print.

$$ \bbox[yellow,5px]
{
e^x=\lim_{n\to\infty} \left( 1+\frac{x}{n} \right)^n
}
$$

The video shows this and other equations being coded and rendered. A really useful online LaTeX equation renderer is at CodeCogs, and is a good way to experiment or practice. The following shows a more interesting formula being written in LaTeX and rendered, ready for download to be used elsewhere.



Common Examples of Maths LaTeX

The following presents some LaTeX commands for creating commonly used symbols. Remember to use enclosing dollar signs for inline maths, and double dollar signs for display (paragraph) maths.

LaTeX Code    Result
x^2    $$x^2$$
\frac{1}{x}    $$\frac{1}{x}$$
\sum_{1}^{n} x^2    $$\sum_{1}^{n} x^2$$
\int_{1}^{\infty} x \, dx    $$\int_{1}^{\infty} x \, dx$$
\ln(x) + \pi \alpha \beta + \gamma \rho \sigma + \delta \epsilon    $$\ln(x) + \pi \alpha \beta + \gamma \rho \sigma + \delta \epsilon$$
\prod_{p \, \text{prime}} \left( \frac{1}{p} \right)    $$\prod_{p \, \text{prime}} \left( \frac{1}{p} \right)$$
< \ldots > \leq \sim \rightarrow \subset \supset \subseteq \supseteq \times \cdot    $$< \ldots > \leq \sim \rightarrow \subset \supset \subseteq \supseteq \times \cdot$$
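These building blocks combine naturally. As a short worked sketch, the following two display formulas, chosen for this note as well-known identities, use the sum, fraction and integral commands together.

```latex
$$ \sum_{k=1}^{n} k = \frac{n(n+1)}{2} $$

$$ \int_{1}^{\infty} \frac{1}{x^2} \, dx = 1 $$
```

The `\,` before `dx` adds the thin space conventionally placed between an integrand and its differential.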

Fuller references are linked below.


References

This tutorial is intended as an introduction to the concepts behind LaTeX, and not an exhaustive reference to the (very many) capabilities of TeX, LaTeX and the most popular extension packages. There are many good references online and in print.

A very useful collection of MathJax commands, a subset of LaTeX, is collated in a Stack Exchange post.

Overleaf's explanation of maths specific LaTeX is helpful:

Documentation for Lyx isn't as well organised, but some have produced their own comprehensive guides, including this one (pdf).


Recommended LaTeX distributions: