In depth: Understanding font resolution and glyph substitution in browsers

With Firefox 3.5 it has become possible to use downloadable fonts in the two major browsers of the moment: Firefox and Internet Explorer. But how does font resolution actually work, and why does text still look legible when the requested font is not installed on the system? What is the role of Unicode, and why do Wingdings and the like work in Internet Explorer but not in Firefox? This article explains these subjects and more, to help you understand font resolution and glyph substitution better. In a future article I hope to explain downloadable fonts in this light.

What is a font

There’s a whole lot of misunderstanding when it comes to discussing fonts. Terminology is often confused or used in the wrong context. I won’t go into deep detail about the terminology (you can use Wikipedia in your language for help); for understanding this article it is sufficient to know the following distinctions:

font family: what we usually mean when we loosely say “font”: the whole set of italic, oblique and normal versions, as well as the different weights a font comes with. A font family has a name like Arial, Times New Roman or Scriptura; Arial Italic, Arial Bold and Arial Extra Bold are all part of the family Arial.
font face: often used as a synonym for font family; strictly speaking (and in CSS) it is a single face within a family, i.e. a font.
font: a specific member of a font family. Arial Italic is a font of the font family Arial.
typeface: same as font family.
glyph: the visual shape of one character; it corresponds logically to a codepoint in a given encoding (nowadays usually a Unicode codepoint).
codepoint: the number assigned to a character in a predefined set; normally we refer to Unicode codepoints.
weight: the “boldness” of a font. In CSS, a number between 100 (very light) and 900 (very bold); 400 is normal weight and 700 is bold.
logotype: two or more characters combined to form a single unit; here we use the more common term ligature.
ligature: same as logotype.

Why worry about fonts

In any simple English text you often wouldn’t worry about fonts at all. You use some standard font in your CSS, use a fallback font and a generic font as a last resort (like serif / sans-serif) and you’re done. At least, that’s what you think.

The problem arises when we use more advanced technologies for writing our article. More advanced than a common desktop browser installation, I mean. Suppose you use Word for writing text and then copy it to your CMS; it is then quite possible that Word has automatically converted characters to look nicer. It does so while you’re typing. If you use PageMaker or InDesign, it is likely that certain consecutive characters are replaced with their logotypes. This is a common mistake at many news agencies.

The following table shows some of the advanced typography that can be used while editing, but that’s not available on a user’s basic desktop computer:

glyph in action | possible renderings
“quoted” | “quoted” or quoted or □quoted□
filanthrope | lanthrope, lanthrope, lanthrope or filanthrope
prӕstor | prӕstor, pr?stor or praestor
onomatopœia | onomatopœia, onomatopœia or onomatopoeia
中國 | ??, □□ or 中國

to be updated: currently the list contains text, I’d like to have several pictures from real-world renderings

Problems in real life

Google guessing oddly

The little list above may seem contrived, but in practice it happens all the time. The image on the right is what someone saw when he copied an error message from Firefox on his Mac into Google. The original unabridged image comes from this Reddit thread.

Ligatures in browsers

The second image is a rendering from several posts I found throughout the internet, emphasizing that the problems with auto-replacement in editing software are very real.

The first rendering is the text “lucrative job offers” as originally shown by Firefox on the Mac on this site (the ligature ff has since been replaced with two letters f). The second rendering uses Safari on the Mac and is from the same site; it is considerably better, but it is still clear that there’s a font switch.

The third is from this white paper on Usenet, rendered with Opera on Windows Vista; you can see that the whole word is affected by incorrect font switching. The fourth rendering is from the same page, but now in Internet Explorer 6, which doesn’t find a font with the ligature in it.

Searching on Google News for the word “efficience” (with the ligature ffi) renders oddly in the caption bar (all browsers share the same problem on Windows). Finally, a correct rendering of the OE ligature in this tale: all browsers do this correctly, even IE6.

Ligatures fl and ff

The next picture, on the right again, shows an example from a very current and active page of The Guardian, where the ligatures fi and fl are used, but you cannot possibly see the difference with the surrounding type. That is because the computer on which I viewed this page has these ligatures available in the same font that is used for the surrounding text. If you view the same text on a Mac or on Linux, it will not look so nice (if you happen to have Linux or a Mac, I’d love to add a screenshot of this text for comparison; please add a comment if you have one).
If you want to know how your browser renders certain ligatures, check this short list on Wikipedia and see how it shows in your browser and on your system.

Ligatures are not the only problem that browser developers face when finding a suitable font for a certain character. Any ligature can be replaced by the corresponding sequence of separate characters. In fact, the Unicode Consortium strongly discourages the use of ligature codepoints and has stated that it will not add any more of them to Unicode; the current ones are there only for compatibility with existing code pages. It is up to font designers to offer correct kerning for ligatures.
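If you control the text before it is published, one way to sidestep the problem is to expand the ligature codepoints back into plain letters. The following is a minimal sketch in Python, assuming the text is available as a Unicode string. The function name expand_ligatures is my own, and the sketch deliberately only touches the Latin ligatures in the Alphabetic Presentation Forms block (U+FB00 to U+FB06), using the compatibility decompositions that Unicode itself defines:

```python
import unicodedata

# Latin ligatures in the Alphabetic Presentation Forms block: ﬀ ﬁ ﬂ ﬃ ﬄ ﬅ ﬆ
LATIN_LIGATURES = range(0xFB00, 0xFB07)

def expand_ligatures(text: str) -> str:
    """Replace Latin presentation-form ligatures with their plain letters."""
    out = []
    for ch in text:
        if ord(ch) in LATIN_LIGATURES:
            # NFKD applies the compatibility decomposition, e.g. U+FB01 -> "fi"
            out.append(unicodedata.normalize("NFKD", ch))
        else:
            out.append(ch)
    return "".join(out)

print(expand_ligatures("e\ufb03cient o\ufb00ers"))  # efficient offers
```

Characters such as œ and æ are letters in their own right in Unicode and have no such decomposition, so this sketch leaves them alone.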

This never happens to me!

That’s what they always say. The reasoning in someone’s head goes something like “this is too difficult, I never did anything with typography, I just use fonts and HTML, not my problem”. Though this may be true for your situation, I’ve seen an awful lot of situations pass by where Word, WordPerfect, OpenOffice Writer, Ami Pro, PageMaker, QuarkXPress, InDesign or some other graphical editing software package was used for creating editorial texts. Nothing wrong with that per se, but in all but a few cases these texts contain auto-generated symbols, specifically inserted symbols and local fonts.

When you view your web page on your local computer, nothing odd happens. Why is that? Because you have the correct fonts already installed. You use Word and the large Unicode fonts that come with it. Or you installed extra fonts that contain your characters. But most people don’t have these fonts installed and browsers will try to render your page as intended, but can only do so with the fonts on the user’s machine. Result: broken layout or even missing text.

If you want to understand why this happens to you too, even when you aren’t looking, read the rest of this article, which explains how browsers select fonts to render your page as closely as possible to what you expect.

Fonts you can trust

Ok, so you found out that it can happen, even to you. The first step towards success is making sure you are aware of the fonts available on most systems. Whatever fancy font you select through Word or QuarkXPress, make sure the text renders at least legibly with the closest available font you can find. These fonts are never exactly equal on Mac, Linux and Windows, and may even render differently between different browsers on the same system.

There’s a list of fonts you can trust. But don’t trust them too deeply! The default set of glyphs supported on a newly installed Windows system is very limited. Many fancy characters, digraphs, ligatures or foreign symbols do not have their equivalent in these fonts. The only real way to find out whether your post will render readably is to test on different freshly installed systems.
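Before testing on real systems, it helps to know which characters in a text are candidates for trouble in the first place. Here is a small sketch in Python. The function name unusual_characters is my own, and treating everything outside ASCII as suspicious is a simplifying assumption, not a statement about what any particular font covers:

```python
import unicodedata

def unusual_characters(text: str) -> list[tuple[str, str, str]]:
    """List each distinct non-ASCII character as (char, codepoint, Unicode name)."""
    found = []
    for ch in sorted(set(text)):
        if ord(ch) > 0x7F:
            name = unicodedata.name(ch, "<unnamed>")
            found.append((ch, f"U+{ord(ch):04X}", name))
    return found

# Curly quotes and an fi ligature, as Word or InDesign might insert them
sample = "\u201cquoted\u201d \ufb01lanthrope"
for ch, codepoint, name in unusual_characters(sample):
    print(codepoint, name)
```

Running this over a text pasted from a word processor gives a short list of characters worth checking on a freshly installed system.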

Font selection strategy

It may come as a surprise, but rendering a page with text is much harder than rendering a page with images. An image is easy: each pixel just needs to be translated to a device-independent bitmap (DIB), which is drawn on the surface of the browser’s viewport. Text is much harder. Each character that must be rendered poses a list of challenges and questions that the browser needs to answer. The problem is that the browser has to try its best to display the character, and it can only know what to display if it finds the character in one of the installed fonts. Roughly, this selection procedure goes as follows, if you consider the rules of CSS and HTML and the glyph selection process of most browsers:

  1. What is the current font (not font-family!)?
  2. Is the character available as glyph in that font?
  3. If not, are fallback fonts specified in CSS for this element?
  4. If not, what are the fonts of the parent(s)?
  5. If another (fallback or parent) font is found, does it have a glyph for the character?
  6. If not, is a generic font specified?
  7. What is the generic font-setting of the browser and does it have our glyph?
  8. If not, is there another font on the system that has the glyph?
  9. If not, is a downloadable font (not font-family!) available by that name with that glyph?
  10. If not, is it possible to normalize the character in any way and render it differently?
  11. If not, display the default “cannot render” character (usually a rectangular shape)
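The procedure above can be sketched as a toy model. This is not how any browser is implemented; it is a minimal Python sketch in which each font is simply the set of characters it covers. The font names and their coverage are made up for illustration, and steps 9 and 10 are left out:

```python
from typing import Optional

# Hypothetical coverage tables: font name -> characters it has glyphs for.
FONTS = {
    "FancySerif": set("abcdefghijklmnopqrstuvwxyz \ufb01"),
    "PlainSans": set("abcdefghijklmnopqrstuvwxyz "),
    "BigUnicode": set("abcdefghijklmnopqrstuvwxyz \u4e2d\u570b"),
}

def pick_font(ch: str, css_stack: list[str], browser_default: str) -> Optional[str]:
    """Return the first font that has a glyph for ch, or None (the "square")."""
    # Steps 1-6: fonts named in CSS (own, fallback and inherited), in order.
    # Step 7: the browser's default font for the generic family.
    for name in css_stack + [browser_default]:
        if ch in FONTS.get(name, set()):
            return name
    # Step 8: any other installed font that happens to have the glyph.
    for name, coverage in FONTS.items():
        if ch in coverage:
            return name
    # Step 11: nothing found, the browser shows its "cannot render" shape.
    return None

# An fi ligature, a plain letter, a Chinese character and the oe ligature
for ch in "\ufb01a\u4e2d\u0153":
    print(f"U+{ord(ch):04X}", pick_font(ch, ["Missing Font", "PlainSans"], "PlainSans"))
```

In this model every character of a single run of text can end up in a different font, which is exactly the kind of font switch visible in the screenshots earlier in this article.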

Step nine is specified in CSS2, but there is no browser that actually executes it (this is unrelated to @font-face for downloading webfonts).

Step 10 is unfortunately not implemented in any browser I know of. Unfortunate, because splitting é into e and a combining acute accent (´) can make it easier to render. And nobody would complain about the ligature ﬄ being rendered as the separate letters ffl.
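To give an idea of what step 10 could look like: Unicode defines standard normalization forms, and Python’s unicodedata module exposes them. The sketch below is only an illustration; the function name alternative_spellings and the idea of trying the forms in this order are my own assumptions, not something any browser does:

```python
import unicodedata

def alternative_spellings(ch: str) -> list[str]:
    """Return distinct alternative codepoint sequences for one character."""
    alternatives = []
    # NFD: canonical decomposition; NFKD: also applies compatibility mappings
    for form in ("NFD", "NFKD"):
        candidate = unicodedata.normalize(form, ch)
        if candidate != ch and candidate not in alternatives:
            alternatives.append(candidate)
    return alternatives

for ch in ("\u00e9", "\ufb04"):
    spellings = [" ".join(f"U+{ord(c):04X}" for c in s) for s in alternative_spellings(ch)]
    print(f"U+{ord(ch):04X} ->", spellings)
```

For é (U+00E9) this yields the letter e followed by the combining acute accent U+0301; for the ligature U+FB04 it yields the three letters f, f and l. A renderer could try each alternative against the available fonts before giving up.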

Once a particular glyph is found, the browser must load the glyph’s vectors and render it accordingly, trying to make it as legible as possible, possibly using the hints in the font definition and extra rendering beautifiers like ClearType or anti-aliasing. Those are usually optional and depend on settings in your browser or your operating system.

Selection between italic, oblique, bold and normal fonts

Whether or not a browser can switch to a different font of the current font-family depends on the requested type. The basic rule is simple: a more specific font may select from a less specific font, but a less specific font will not (try to) select from a more specific font. In other words, if a text is bold it will try to find the glyphs in bold and normal fonts, but not in bold italics. If the selected font is “normal” (i.e., not bold or italic) it will only search normal fonts, never an italic or bold font. But if the text is actually bold italic the font selection algorithm will search bold italic, italic, bold, normal fonts (in that order).
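The rule can be written down compactly. A minimal sketch in Python, following the order described above; the function name search_order is my own, and the italic-only case is inferred from the same rule rather than stated explicitly:

```python
def search_order(bold: bool, italic: bool) -> list[str]:
    """Return the font styles to search, most specific first."""
    order = []
    if bold and italic:
        order.append("bold italic")
    if italic:
        order.append("italic")
    if bold:
        order.append("bold")
    # Every style may fall back to the normal font; normal text stops here.
    order.append("normal")
    return order

print(search_order(bold=True, italic=True))    # ['bold italic', 'italic', 'bold', 'normal']
print(search_order(bold=True, italic=False))   # ['bold', 'normal']
print(search_order(bold=False, italic=False))  # ['normal']
```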

And what about oblique? In computing interfaces, oblique is rarely found, which is odd, because when we specify italic the rendering usually takes a slanted version of the font, which is the same as oblique, but which is not actual italic. In practice, with online typefaces, we hardly see any different rendering between oblique (slanted font) and italic (stylized form based on calligraphic handwriting). As it turns out, when an oblique version of the font is missing, the rendering engine will select italic, and vice versa.

Garamond Roman, Italic and Oblique

The picture on the left shows the correct rendering for Garamond Roman, Garamond Italic and Garamond Oblique (which doesn’t actually exist; here it is slanted 10°), assuming all three versions exist on your system. However, if one doesn’t exist, the nearest neighbor will be chosen. The picture is an adjusted version of this picture by Laug.

Browsers can decide to render a font slanted when the actual italic or oblique type is not available. I’m not aware of any browsers actually doing this, but it is common in DTP software like QuarkXPress, PageMaker or the newer InDesign.

Performance related strategies

If the above list of eleven steps were carried out for each and every character individually, rendering a textual page would become painfully slow. The actual implementation differs per browser and browser version, but goes something like this:

  1. Load the CSS and the fonts specified for a given page, check if they’re Unicode compliant, skip if not
  2. Create a lookup table of the characters you encounter in the text, add the glyph specifics per character
  3. When rendering an actual glyph, keep a little bitmap in memory for quick retrieval when you encounter it again
  4. When a certain character cannot be found, postpone rendering, or let it be rendered by a separate execution thread which can do the more complex lookup in other (fallback) fonts.
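Steps 2 and 3 amount to caching: do the expensive lookup once per distinct character and reuse the result. A minimal sketch in Python of that idea, with made-up coverage data standing in for the fonts loaded in step 1 (real engines cache glyph bitmaps and much more; this only illustrates the saving):

```python
from functools import lru_cache

# Hypothetical coverage data, standing in for the fonts loaded in step 1.
COVERAGE = {
    "PlainSans": set("abcdefghijklmnopqrstuvwxyz "),
    "BigUnicode": set("\u4e2d\u570b"),
}
lookups = 0  # counts how often the slow path runs

@lru_cache(maxsize=None)
def font_for(ch: str) -> str:
    """Slow path: scan the fonts once per distinct character, then cache."""
    global lookups
    lookups += 1
    for name, chars in COVERAGE.items():
        if ch in chars:
            return name
    return "<missing glyph>"

text = "banana \u4e2d\u570b banana"
fonts = [font_for(ch) for ch in text]
print(len(text), "characters,", lookups, "font lookups")
```

The sample text has 16 characters but only 6 distinct ones, so the font tables are scanned 6 times instead of 16. On a full page of prose the ratio is far more favourable.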

This list is partially a guess and partially experience. I hope to browse a bit through the render engine source code of several browsers to find out what actually goes on under the hood. For now, for this discussion, it is sufficient to know that the actual rendering is a rather involved process which is highly optimized to make the browser experience as smooth as possible.

Font substitution issues

The following list contains a couple of frequently asked questions and raised issues that I encountered during my research for writing this article. The answers are not cast in stone, but they reflect the current status of browser technology and behavior.

  1. What is the default font of my browser?
    A: you can find this in your Options or Settings, often on the General tab, but it differs per browser. If you want to know how the default font renders, simply create a little page without any CSS. It will be rendered with the default font. In general, a browser has three default fonts: serif, sans-serif and monospace.
  2. I see the page rendered in the default font, but I selected another, why is that?
    A: if this happens, most likely the font you selected does not exist on the machine you are browsing with. It is also possible that you made a typo. If you use many uncommon characters in your text, the browser may have chosen to replace the whole text with a font that contains all the glyphs; it does so to improve readability (Opera is known to try to do this).
  3. How do I know when the browser falls back to the default font?
    A: you don’t know precisely, but you can predict it. If the selected font is not available and the fallback fonts are not available either, the browser will fall back to the default font. If the glyph is not available in the current or any of the fallback fonts, an individual glyph can be taken from the default font.
  4. The monospace font renders smaller than the surrounding font, why is that?
    A: this is actually a bug that occurs in Chrome 1-4, Safari 1-4 and Firefox 1.0-3.6 (incl. betas). I wrote a little report about this in case you like to know more: CSS font-family monospace renders inconsistently in Firefox and Chrome. If a certain character falls back to monospace, the diminished sizing does not occur.
  5. How do I know what font is selected for a particular glyph?
    A: this is the tricky part: you don’t know. The only way to find out so far is by using eye-comparison. This is not very easy and it is not very handy either. It means that you have to know what fonts the browser loads and in what order to find the right one and in short: we don’t know that.
  6. What about this Firebug, the Web Developer tools and the Font Finder, isn’t that all we need?
    A: no, it is not all you need. The problem with all these tools is that they display the CSS style rule that applies to a particular element. It is impossible to deduce from that information what font or fonts were actually applied to that element. As of this moment, there is no plugin for Firefox that reveals this information.
  7. Is such a plugin or tool available for Internet Explorer, Opera, Chrome, Konqueror or Safari?
    A: no, unfortunately, to the best of my knowledge such a plugin is not available. Trial and error are our only friends so far.
  8. Can it happen that a font other than the default font is selected as a last resort?
    Khmer and Chinese in Firefox 3.0

    A: yes, this can happen, but as many things related to font selection strategy, it is hard to predict when and if it happens.

    You can see this for yourself by doing a little test. First check your current default font. If you are on Windows, it is normally Times New Roman, and the Unicode subset it covers is pretty narrow.

    Khmer and Chinese in IE6 (w/o font)

    Then go to a page that contains characters that are normally not covered by standard fonts; the Khmer Wikipedia page on China, for instance, is a nice start. It contains Latin, Chinese and Khmer characters, so it is quite possible that you will see a whole bunch of squares.

    I installed the Code2000/2001/2002 fonts and behold, the page becomes legible. In the Firefox screenshot, the Latin text is Arial (the default for that page in CSS), the Chinese text is in the Microsoft YaHei type (which contains Simplified Chinese), and the small unreadable symbols are Code2000, which contains the little-used Khmer symbols.

    Khmer and Chinese in IE7

    Other browsers have more trouble with this. The IE6 screenshot is with and without the font installed, and the IE7 screenshot shows what happens if you have a disobedient rendering engine (Trident, the rendering engine in IE, has a hard time rendering this) even though the font is available. For some strange reason, Internet Explorer 8 (compatibility view or not) renders this the same way as Internet Explorer 6: you see blocks, not text, even if the font is installed. Opera, Chrome, Safari and Firefox are the clear winners here in going that extra mile to make the page readable. I’ll dive deeper into this particular subject in a future post.

  9. I would like to tell my users to install the font, but only when they don’t have it, just like with plugins, how do I do it?
    A: there is no straightforward answer to this question, but in general, you cannot simply find out whether a certain font is available or not. On Lalit.org, author Lalit Patel came up with a nice solution based on measuring the fonts: if they are not available, the sizes are the same as those of the parent element’s font. The code does not work in all situations (for instance, Times New Roman, which is available on my machine, is not found), which I will explain in a future post, but it is a good starting point for finding out whether a particular font is installed and then acting accordingly.
  10. I would like to give a specific message when the browser cannot render my page appropriately, how do I do it?
    A: Unfortunately, this is far from trivial. Checking for the existence of a font is one thing, warning a user when one or more characters fail to be found in the user’s font collection or in your specified fonts is another. It might be technically possible to find this out for one particular glyph. But that would mean that you’d have to render each and every character in your document to find out whether one is missing.
  11. Is it possible to find out whether fallback to another font has happened?
    A: no, this is not possible, see question 5 above.

Open issues and questions

At the moment of this writing, not everything was 100% clear to me and I still have some open ends in the story. However, I decided to “go live” with the story regardless, perhaps someone can shed some more light on the darker corners. In short, the following questions and issues are still unclear or uncovered:

  1. The difference in the font selection process between the browsers and possible rendering differences as a result (yes, differences exist, see above, but can we make that process more insightful and predictable?)
  2. Ways to find out what the actual fonts are that have been selected in any given text. A plugin would be nice, but some smart JavaScript tool would be even nicer.
  3. getComputedStyle currently gives the CSS rule that’s applicable for an element. It is my reading of the specification that this should be the actually chosen style. That is, the applied font, not the CSS rule with a list of fonts. For Opera this works as expected, which is excellent. But no other browsers support it that way; they simply return the CSS rule.
  4. It would be great to find some references on how the browsers really improve the rendering speed of text and how they really do font selection.

Call for help and information

If you come across my article and you happen to know anything more specific on the subjects here, or if you happen to be part of a rendering engine team, it would be great to hear your comments on the article. With time I hope to present a more thorough story on the font selection process. I’d expected to find this somewhere online but until now have failed to do so.

History

A little history of this story. As it will be updated regularly, you can check here to find out what has changed since your last visit:

2009-09-29 initial version, including questions and open ends, please help me expand this!
2009-09-27 published a tiny draft, which I quickly took offline to prevent confusion

– Abel –

  • Sinepecunia

    Abel, my solution is even simpler: pdf.
    I think many more companies could simply send PDFs around (any fonts are embedded by InDesign), and content-wise you can fit just as much in a pdf as in pdf (even interactive elements such as fill-in forms, embedded videos… etc… By the way, I now notice that letters keep disappearing while I type this piece, is that because I use Opera as my browser?
    Your old friend Michel (also have a look at http://www.breidablick.nl and then ‘over breidablick’ and then Gefjon, made/designed by the undersigned… in the evening hours of course, because such ‘craftsmanship’ is priceless…
    You are in my heart.
    Michel
