Tuesday, February 17, 2009

Font metrics and baseline alignment

I implemented some algorithms for positioning text and graphics within horizontal lines in an aesthetically pleasing manner. Until now, the Factor UI did all font rendering with FreeType and only really supported a small, fixed set of fonts, all from the Bitstream Vera family, which I bundle with Factor. These fonts have similar metrics and baselines, so the UI layout issues I describe below were not as severe, but they were still noticeable. Now that Factor is doing native font rendering on Mac OS X with Core Text, and soon with Pango on X11 and Windows, I decided to address these issues properly instead of hacking around them. Doing so involved implementing some new layout algorithms and abstractions in the Factor UI.

This blog post is a continuation of my previous entry on texture caching; now that the technical details of rendering are done, it's time to make the result look good.

I tried to make reasonable diagrams to illustrate some of the concepts here, but I didn't try too hard, and this is by no means an authoritative account of the topic. It is just a brain dump of my incomplete understanding of the subject.

Identifying the problem


Suppose you want to position some gadgets next to each other, but they use different fonts. If you use a shelf parent with default settings, you get something like this, which looks terrible, because the text does not line up:

In this particular case you can change the alignment of the shelf so that gadgets are aligned relative to the bottom edge instead of the top, by setting the align slot to 1, instead of the default of 0:

This happens to fix the above case, but now this will look stupid if one of the fonts has a bigger point size than the other:

Here is another example. We have two gadgets side by side, where the second one has a border around it. Notice that the two pieces of text don't look right next to each other; the second one looks a few pixels "off".

Clearly, a simplistic alignment strategy, where a single constant determines where each child is positioned in between the top and bottom edge, is insufficient for text. Instead, we have to look at the fonts being used and different measurements used by those fonts to make the above examples look right.
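
To make the failure concrete, here is a small Python sketch of constant alignment. It assumes the constant simply interpolates between the top and bottom edges; the function name `align_y` and the label sizes are made up for illustration and are not taken from the Factor UI.

```python
def align_y(align, shelf_height, child_height):
    """Top edge of a child placed with a constant alignment factor in [0, 1]."""
    return align * (shelf_height - child_height)

# Two labels in different fonts, as (height, baseline measured from the top edge).
# These numbers are invented for the example.
labels = [(20, 15), (30, 24)]
shelf_height = max(height for height, _ in labels)

for align in (0, 1):
    # Where each label's baseline ends up inside the shelf.
    baselines = [align_y(align, shelf_height, h) + b for h, b in labels]
    print(align, baselines)
```

With these numbers the baselines land at 15 and 24 for align = 0, and at 25 and 24 for align = 1, so neither setting lines the text up.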

Font metrics


When text is rendered, the bottom of each glyph is aligned along a baseline. This is like the lines on note paper, but of course it's invisible. Some letters, like lowercase "j" in most fonts, will descend below the baseline.

In addition to the font's point size and its height, there are a number of other metrics which are important:
  • The ascent, which is the distance from the baseline to the top edge of the tallest glyph.
  • The descent, which is the distance from the baseline to the bottom edge of the lowest-hanging glyph. Sometimes, the descent is taken to be a negative value, but I don't use this convention in Factor's code since Core Text doesn't either.
  • The x height, which is the height of a lower-case x.
  • The cap height, which is the height of an upper-case Y.
  • The font height is just the sum of the ascent and descent.
  • The leading, which is the gap between lines of text.
  • The line height is ascent + descent + leading.
  • The em, which traditionally is the width of an upper-case M; in digital fonts it is usually taken to be equal to the point size.
  • The en, which is half an em.
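
The derived metrics in this list can be written down as a few lines of Python. This is only a sketch: the `Metrics` class here is a hypothetical stand-in, not Factor's tuple, and the sample numbers are invented.

```python
from dataclasses import dataclass

@dataclass
class Metrics:
    ascent: float
    descent: float   # kept positive, following the convention described above
    leading: float

    @property
    def height(self):
        # Font height: the sum of the ascent and descent.
        return self.ascent + self.descent

    @property
    def line_height(self):
        # Line height: ascent + descent + leading.
        return self.ascent + self.descent + self.leading

m = Metrics(ascent=11.0, descent=3.0, leading=1.0)
print(m.height, m.line_height)  # 14.0 15.0
```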

Here is a diagram showing some of the above:

To understand why the previous examples look bad, let's make the baselines and borders visible:

Notice that the baselines of the two gadgets don't line up. If they lined up, the result would look nice.

Baseline alignment


I added support for baseline alignment to shelf gadgets; you set the align slot to the special symbol +baseline+. Setting it to a number between 0 and 1 is still supported, and setting it to +baseline+ activates totally different logic in the implementation.

Two things change when a shelf has baseline alignment enabled: the preferred size calculation and the layout algorithm itself.

If a shelf does not have baseline alignment, then its preferred height is just the maximum of the preferred heights of all of its children. This no longer works with baseline alignment. For example, consider the case where two gadgets with different baselines are nested inside a shelf:

Clearly, the preferred height of the shelf exceeds the maximum of the heights of the two gadgets. Here is a pseudo-code algorithm for computing the height:
For every gadget g in the shelf,
    set ascent[g] = g.baseline
    set descent[g] = g.height - g.baseline

max_ascent = maximum(ascent)
max_descent = maximum(descent)

height = max_ascent + max_descent
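
The same computation as a runnable Python sketch. Representing each gadget as a (height, baseline) pair, with the baseline measured from the gadget's top edge, is an assumption made for the example, as are the numbers.

```python
def baseline_height(gadgets):
    """Preferred height of a baseline-aligned shelf.

    gadgets: list of (height, baseline) pairs, baseline measured from the top edge.
    """
    max_ascent = max(baseline for _, baseline in gadgets)
    max_descent = max(height - baseline for height, baseline in gadgets)
    return max_ascent + max_descent

gadgets = [(20, 18), (30, 12)]
print(baseline_height(gadgets))       # 36
print(max(h for h, _ in gadgets))     # 30, the non-baseline preferred height
```

The first gadget contributes the largest ascent (18) and the second the largest descent (18), so the shelf needs 36 pixels even though its tallest child is only 30.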

The algorithm for layout is similar:
For every gadget g in the shelf,
    set ascent[g] = g.baseline
    set descent[g] = g.height - g.baseline

max_ascent = maximum(ascent)
max_descent = maximum(descent)

For every gadget g in the shelf,
    set g.y = max_ascent - g.baseline

Note that if every gadget in the shelf has a baseline equal to its height, the descent will always be zero and the above algorithms degenerate to the standard shelf layout algorithm with align = 1.
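
Here is a Python sketch of the layout step, together with a check of the degenerate case. As before, gadgets are assumed to be (height, baseline) pairs; `bottom_layout` is my own stand-in for the standard algorithm with align = 1, not code from the Factor UI.

```python
def baseline_layout(gadgets):
    """Return the y position of each gadget's top edge.

    gadgets: list of (height, baseline) pairs, baseline measured from the top edge.
    """
    max_ascent = max(baseline for _, baseline in gadgets)
    return [max_ascent - baseline for _, baseline in gadgets]

def bottom_layout(gadgets):
    """Standard layout with align = 1: every gadget sits on the bottom edge."""
    shelf_height = max(height for height, _ in gadgets)
    return [shelf_height - height for height, _ in gadgets]

print(baseline_layout([(20, 18), (30, 12)]))   # [0, 6]

# Degenerate case: the baseline equals the height for every gadget.
flat = [(20, 20), (30, 30), (12, 12)]
print(baseline_layout(flat) == bottom_layout(flat))  # True
```

In the first example both baselines end up at y = 18 (0 + 18 and 6 + 12), which is the whole point of the exercise.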

Graphics baseline alignment


For lines containing a mix of text and graphics, where the graphics are icons with similar height to the text, I came up with a nice trick for positioning them in a pleasing manner. Trying to use the standard alignment for icons and text doesn't work; here are three different layouts, with alignment of 0, 1/2, and 1, respectively:

Note that I increased the font size to make the effect more noticeable, but even at smaller font sizes that are closer to the icon size, the results look less than ideal. Using baseline alignment where the baseline of the image is equal to its height doesn't produce the right results either.

The trick with images, then, is to define a "graphics baseline" that runs horizontally at the y co-ordinate equal to half of the image's height. For text, the graphics baseline runs half-way between the cap height and baseline. Here is what it looks like:

Generalizing the algorithm in the previous section to support a graphics baseline is easy.

First, we allow the baseline and cap-height of a gadget to be set to some unspecified value, distinct from any number. This indicates that graphics baseline alignment should be used for this gadget.

Then, we compute layout for the text children first, followed by the graphics children, and position the graphics children so that their graphics baseline lines up with half of the cap height from the text children.

I can't be bothered writing out pseudocode; you can look at the Factor implementation after it's checked in if you're interested.
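
For a rough idea of the shape of it, here is a Python sketch of the steps described above. It is not the Factor implementation. It assumes gadgets are (height, baseline, cap_height) triples with None standing in for the unspecified value, that the tallest cap height among the text children is the one used, and that everything is shifted down if an image would poke above the top of the shelf. All three choices are guesses made for illustration.

```python
def mixed_layout(gadgets):
    """Return top-edge y positions for a mix of text and graphics gadgets.

    gadgets: list of (height, baseline, cap_height); baseline and cap_height
    are None for graphics. Assumes at least one text gadget is present.
    """
    text = [(h, b, c) for h, b, c in gadgets if b is not None]
    max_ascent = max(b for _, b, _ in text)
    cap_height = max(c for _, _, c in text)   # assumption: use the tallest caps

    # y coordinate of the shared graphics baseline, half-way up the capitals.
    graphics_baseline = max_ascent - cap_height / 2

    ys = []
    for h, b, _ in gadgets:
        if b is None:
            ys.append(graphics_baseline - h / 2)   # centre the image on it
        else:
            ys.append(max_ascent - b)              # ordinary baseline alignment

    # Assumption: shift everything down if a tall image starts above y = 0.
    shift = max(0, -min(ys))
    return [y + shift for y in ys]

# A label (height 20, baseline 16, cap height 12) next to a 10-pixel icon.
label_y, icon_y = mixed_layout([(20, 16, 12), (10, None, None)])
print(label_y, icon_y)
```

Here the capitals span y = 4 to y = 16, and the icon is placed from y = 5 to y = 15, so its centre sits at y = 10, half-way up the capitals.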

Font metrics implementation


The fonts vocabulary now defines a metrics tuple:
TUPLE: metrics width ascent descent height leading cap-height x-height ;

Various words in the UI output metrics tuples:
  • font-metrics ( font -- metrics )
  • line-metrics ( font string -- metrics )

The Core Text implementation of these is completely trivial, since the required information can be obtained by a series of API calls, and I expect the eventual Pango implementation to be similar.

Baseline alignment implementation


The ui.baseline-alignment vocabulary contains all the code. To start with, it needs to be able to get the baseline and cap height for a gadget:
GENERIC: baseline ( gadget -- y )

M: gadget baseline drop f ;

GENERIC: cap-height ( gadget -- y )

M: gadget cap-height drop f ;

The default implementations return f for both, which means the graphics baseline will be used.

I implemented a generalized version of this algorithm as the measure-metrics word in the ui.baseline-alignment vocabulary. It outputs the new ascent and descent, and adding them together gives the height of the shelf.

Labels implement these generic words in the obvious way, by looking up font metrics. Borders, packs, paragraphs and other compound gadgets implement these operations by computing the baseline and cap height of their children and combining them in various ways.
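
As an illustration of how a compound gadget might combine its child's metrics, here is a Python sketch covering the baseline only. The classes are hypothetical; in particular, offsetting the child's baseline by the border width is my guess at the simplest sensible rule, not a transcription of the Factor methods.

```python
class Gadget:
    def baseline(self):
        return None          # like `f`: fall back to the graphics baseline

class Label(Gadget):
    def __init__(self, ascent):
        self.ascent = ascent

    def baseline(self):
        return self.ascent   # the baseline sits one ascent below the top edge

class Border(Gadget):
    def __init__(self, child, width):
        self.child = child
        self.width = width

    def baseline(self):
        # The child is pushed down by the border, and so is its baseline.
        inner = self.child.baseline()
        return None if inner is None else inner + self.width

print(Label(12).baseline())              # 12
print(Border(Label(12), 3).baseline())   # 15
print(Border(Gadget(), 3).baseline())    # None
```

This is what fixes the earlier example of a label next to a bordered label: the shelf sees baselines of 12 and 15 and moves the plain label down by 3 pixels.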

The UI tries to use baseline alignment by default in as many places as it makes sense, so in most cases you do not need to worry about the details of this.

Here is some help text rendering demonstrating the text baseline alignment in the values table and the description paragraph:

The astute observer will notice the table lines are missing a pixel in the bottom-right corner; this is a recent regression that I need to look into.

Here is an example demonstrating baseline alignment of a text label with a text field next to it, as well as graphics baseline alignment, with checkboxes and radio buttons:

More about font metrics


I haven't talked about line spacing yet, because this part is tricky. There's more to it than just adding the leading between each line. I'll figure it out in the near future and do a write-up most likely.

3 comments:

Anonymous said...

Slava,

if you really want to get your head around this subject get a hold of and read Knuth's 5 volume set on typesetting. There is much in there that will give a greater understanding of the subject matter. Just leave the TeX part alone.

regards

Bruce Rennie
(God's Own Country Downunder)

Nikolai Weibull said...

Actually, at least for typesetting on computers, the width of an em is either specified in the font or is taken to be the point size of the same.

dh said...

There is a document formatting system called Lout (designed by Jeff Kingston), which is much more elegant in design than TeX is. It's a language consisting of some ten primitives, which might help you with your design efforts.