HTML as TeX replacement

Stuart notes Lee Phillips’s critique of HTML compared to TeX.

However, Phillips’s idea of HTML is not quite up to date; he ignores how CSS and SVG combine with HTML to add richer typography. First he complains about hyphenation and ligatures. Hyphenation is in CSS Text Level 3 and is implemented in many browsers, though not yet Chrome. Ligatures are in CSS Fonts Level 3 and supported in many browsers too; Apple has done it for years. Here we have the TeX example, rendered live by your browser, and what Safari Mac [and Firefox Mac] made of it. Note the hyphenation and the ligatures. Also, I took out the spaces around the em-dashes that Lee Phillips oddly put in.

Call me Ishmael. Some years ago​—​never mind how long precisely​—​having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people’s hats off​—​then, I account it high time to get to sea as soon as I can.
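For reference, here is a minimal sketch of the kind of CSS that asks a browser for the hyphenation and ligatures seen in the passage above. The class name `moby` and the column width are assumptions for illustration, and this is not necessarily the stylesheet this page uses; support still varies between browsers.

```html
<!DOCTYPE html>
<html lang="en"> <!-- automatic hyphenation needs a declared language -->
<head>
<meta charset="utf-8">
<title>Hyphenation and ligatures sketch</title>
<style>
  /* "moby" is a hypothetical class name used only in this sketch */
  .moby {
    width: 20em;              /* a narrow column makes hyphenation visible */
    text-align: justify;
    -webkit-hyphens: auto;    /* prefixed forms for older browsers */
    -moz-hyphens: auto;
    hyphens: auto;            /* CSS Text Level 3 */
    font-variant-ligatures: common-ligatures;  /* CSS Fonts Level 3 */
  }
</style>
</head>
<body>
<p class="moby">It is a way I have of driving off the spleen and
regulating the circulation.</p>
</body>
</html>
```

Whether the "ff" in "off" is drawn as a ligature also depends on the font that ends up being used, since the glyph has to exist in it.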

Next Phillips takes on mathematical equations. His first example is e<sup>iπ</sup> = −1. Note how that displays fine inline, just by using <sup>, which has been in HTML for years, along with <sub>, which I used for the lowered E in the TeX logo. Writing in UTF-8 means I don’t need a special sequence like \pi for π.
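A minimal sketch of that inline markup, assuming the equation in question is Euler's identity; the exact markup on this page may differ.

```html
<!-- Superscript exponent, with π typed directly as a UTF-8 character -->
<p><i>e</i><sup><i>i</i>π</sup> = −1</p>

<!-- The TeX logo, approximated by lowering the E with <sub> -->
<p>T<sub>E</sub>X</p>
```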

Phillips is right that doing more complex equation layout in pure HTML is difficult. Fortunately, we do have SVG for arbitrarily precise positioning of text and graphics. I took his example of Stokes' theorem and put it through Troy Henderson's LaTeX Previewer (which I found by googling 'tex to svg'). Here we are:


Here is the elementary version of Stokes' Theorem:


Now, the SVG there, though scalable, is not ideal: it renders as paths, not characters. If I use SVG text, I can make it selectable:

∫𝝨 ∇×𝐅∙𝑑𝝨 = ∮∂𝝨 𝐅∙𝑑𝐫

Here's the SVG code for that. You can see the tighter control.

<svg xmlns="http://www.w3.org/2000/svg" width="200" height="44" >
<text x="0" y="30" style="line-height:125%; font-size:18px; font-family:Serif;">
 <tspan style="font-size:36px;">∫</tspan>
 <tspan style="font-size:12px;" baseline-shift="sub">𝝨</tspan>
 ∇×𝐅∙𝑑𝝨 = <tspan style="font-size:36px;">∮</tspan>
 <tspan style="font-size:12px;" baseline-shift="sub" dx="-7px">∂𝝨</tspan> 𝐅∙𝑑𝐫
</text>
</svg>

Here is how Chrome Mac and Safari Mac render this:

However, you may not see all the glyphs, as I am using the special Unicode characters for Mathematical letters, and your browser or device may not have those. (Update: I included the STIX font, so you should see them now.) Here's a version with ordinary Latin and Greek letters:

∫Σ ∇×F∙dΣ = ∮∂Σ F∙dr
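Including a font such as STIX for the Mathematical letters, as mentioned in the update above, can be sketched roughly like this. The family name and file path are placeholders, not the ones this site actually uses, and the rule only reaches SVG that is embedded inline in the HTML page.

```html
<style>
  /* The family name and file path are placeholders for this sketch */
  @font-face {
    font-family: "STIX Sketch";
    src: url("fonts/stix-regular.woff") format("woff");
  }
  /* Applies to text in SVG embedded inline in the HTML page */
  svg text {
    font-family: "STIX Sketch", serif;
  }
</style>
```

An inline style attribute on the text element, like the one in the SVG code above, takes precedence over a stylesheet rule, so it would need to name the same family for the web font to be used.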

Phillips may be superficially right that HTML alone doesn't give as much typographic control as TeX, but when you compare TeX with the full web suite, including CSS and SVG, that conclusion can't be sustained. Indeed, even his point about macros could be addressed by using JavaScript as well, though I prefer my web pages to be declarative.

That said, many of the CSS specs I have linked to are still being edited, so this is a good time to try out authoring your mathematical papers that way and possibly proposing changes.

Update 2015-10-09

Lee Phillips has posted a reply.

There is a clash of worldviews going on here that reminds me of this discussion. TeX and HTML have a lot in common, in that they are both plaintext authoring environments for documents, but they have philosophical differences too: TeX is meant to be compiled into a specific pagination, and HTML is meant to flow dynamically. I'm sorry that my initial article came off as rhetorical point-scoring; I was genuinely trying to work out which parts of the HTML+SVG+CSS+JS toolkit were in a state to represent maths well.

What I was clumsily trying to show was that HTML+SVG output from TeX could be better than the current defaults, which are HTML+bitmaps or PDF. The way we get better support for ligatures, hyphenation and justification in browsers is by trying them out, seeing where they go wrong, and sharing those cases with the spec editors and browser writers.

Now to address Phillips's specific points there. Yes, I am updating markup in this post. My initial pass at representing equations with SVG text looked good in the Mac browsers I was using, and on my Chromebook, but that was relying on font substitution for the Mathematical letters, which I later realised are rare in actual Unicode fonts. The edit history is in my GitHub repository. Apologies for not making that clear. Do send me a pull request if you have further edits.

My choice of "Hoefler Text" as the default typeface for this site has also apparently been causing layout problems: John Lenton got all caps on Ubuntu, and Firefox seems to use really wide spaces for it. I've put in 'Computer Modern Serif', which Firefox seems to prefer. HTML font handling is fiddly and annoying; I now appreciate TypeKit a lot more.

Our em-dash differences are clearly one of those typographic holy wars, like Oxford commas, that generate more heat than light; the Gutenberg text I took the Moby Dick from was without spaces.

I have added the DOCTYPE and lang declarations to encourage Firefox to hyphenate more; that and the typeface change seem to have helped somewhat, but it is still setting the spacing loosely. Before and after in Firefox on my Mac:

That said, one of the advantages of the HTML worldview is that invalid markup is still rendered readably; again, this is a culture clash of compiled versus dynamic worldviews. SVG, being pure XML, is draconian like TeX.

Encouraging browsers to adopt the Knuth and Plass justification algorithm, hyphenation and hanging indents is definitely worth doing. I added Microsoft's text-justify: newspaper property, which apparently turns it on in Microsoft's browsers. In practice, avoiding text-align: justify on the web (as both Phillips and I do on our sites) still produces more reliably legible text for now.
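As a rough sketch of the two options discussed here (the class names are hypothetical, and how much the first rule helps depends entirely on the browser):

```html
<style>
  /* "justified" and "ragged" are hypothetical class names for this sketch */
  p.justified {
    text-align: justify;
    text-justify: newspaper;  /* legacy Microsoft value; browsers that do not know it drop the declaration */
    -webkit-hyphens: auto;
    hyphens: auto;            /* hyphenation usually tightens justified lines */
  }
  /* The more conservative choice: leave the right edge ragged */
  p.ragged {
    text-align: left;
  }
</style>
```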

My point with <sup> and <sub> was that they suffice for simpler equations. I think we agree that SVG rather than bitmaps is a good idea for complex ones.

My attempt at using SVG Text rather than paths was an experiment, and clearly an unsuccessful one; I ended up with equations that looked good in my browser and looked like crap in many others, and the selectability was not worth the aggravation. I think there is potential for a new TeX to SVG converter that uses SVG Text rather than paths, but I am not the one to write it; SVG as a PDF replacement would be a good thing.

The macros point referred to Phillips's discussion of TeX as Turing complete and thus hard to translate into HTML. What I was trying to say was that if you want Turing-complete macros in HTML, you use JavaScript for them.
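To make that concrete, here is a minimal sketch of script-driven macros in a page. The `data-macro` and `data-args` attribute names and the macro table are inventions for this example, not an existing convention or library.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Macro sketch</title>
</head>
<body>
<p><span data-macro="pow" data-args="e,iπ"></span> = −1</p>
<script>
  // Hypothetical macro table: each macro maps its arguments to an HTML string.
  const macros = {
    pow: (base, exponent) => `<i>${base}</i><sup>${exponent}</sup>`,
    sub: (base, index) => `<i>${base}</i><sub>${index}</sub>`
  };
  // Expand every element that carries a data-macro attribute.
  for (const el of document.querySelectorAll('[data-macro]')) {
    const macro = macros[el.dataset.macro];
    const args = (el.dataset.args || '').split(',');
    if (macro) el.innerHTML = macro(...args);
  }
</script>
</body>
</html>
```

The cost is that the page shows nothing for those spans until the script has run, which is part of why declarative markup remains preferable where it suffices.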

The point about the CSS specs still being edited was to encourage those who are generating and publishing mathematical papers to join in the discussions there, with use cases and examples, to try to get the browser engineers to converge on these typographic issues.

I'm not hostile to TeX and those who find it a productive tool; I do think it could be better translated for the Web than by rendering to PDF. Mostly I'd just like TeX fans to stop making only 2-column fixed-size PDF papers that I can't read on my phone or tablet without a lot of zooming, panning and squinting. I'm sure it is a powerful enough toolbox to do better at that.