Friday, February 20, 2009

Metric bounds versus image bounds

With some fonts, the font glyphs exceed the font metric bounds. I wasn't aware of this fact until recently, and because of this some text would be chopped off at the edges.

The right way to do things is to compute the image bounds of the text and use that for creating the OpenGL texture, only using metric bounds to actually position the text.

To compute image bounds with Core Text, you're meant to use CTLineGetImageBounds(), but this function takes a CTLine as well as a CGContext, which creates a bit of a chicken and egg problem, because I need the image bounds to create the context. The solution I came up with is to keep a dummy 1x1 pixel bitmap context around globally and use that for text measurement.

Getting the text position right was a bit tricky; I must thank Joe Groff for helping me get this right.

Now, CTLineGetImageBounds() returns a rectangle. The origin of the rectangle is the point at the intersection of the baseline and the caret, and the dimension is the size of the glyphs. So the texture size should be the size of the rectangle (with appropriate rounding to ensure your texture has integral dimension).

Inside the texture, the text position should be set to the negation of the image bounds origin, by calling CGContextSetTextPosition. The text position is relative to the CGContext's text co-ordinate system, which has the origin at the bottom-left corner by default.

Now, once you've rendered the texture, you need to compute the texture position. Factor's UI uses a co-ordinate system where the origin is the top-left corner. Suppose you want to diplay the text such that the baseline is at (0,ascent) where ascent is the font ascent, and the image bounds are im.{x,y,w,h}. Then the texture quad should be positioned at (im.x,ascent - im.h - im.y).

I glossed over the details of rounding the texture dimensions and texture position to integral co-ordinates; this is important and a bit tricky to get right, but you can look at the code after I merge it in.

After some pixel tweaking, everything works now. In the below screenshot, I've created an editor gadget that displays text with the Zapfino font, and set a one-pixel black border around it. Also note that the text selection and caret positioning doesn't mess up ligatures.

Thursday, February 19, 2009

A note about libdl functions on NetBSD and the "Service unavailable" error

On NetBSD, dlopen(), dlsym(), dlerror() and other such functions are implemented as stubs in libc, and the stubs just return an error, "Service unavailable". The ELF loader patches references to these symbols inside binaries to point at the real definitions. Trouble is, if you look up one of these symbols at runtime using dlsym() itself, it will return the address of the stub function and not the correct definition.

So amusingly enough, this won't work in Factor on NetBSD:
FUNCTION: void* dlopen ( char* path, int flags ) ;
"" dlopen .

Of course you'd never do the above, but it can come up legitimately if you try calling dlerror() via FFI and expect to get a useful error message for the most recent failed dynamic linker operation. Instead, you'll always get the undescriptive "Service unavailable".

Tuesday, February 17, 2009

Font metrics and baseline alignment

I implemented some algorithms for horizontally positioning text and graphics within lines in an aesthetically-pleasing manner. Until now, the Factor UI did all font rendering with FreeType and only really supports a small, fixed set of fonts, all from the Bitstream Vera family, which I bundle with Factor. These fonts have similar metrics and baselines and so the UI layout issues I describe below were not as severe, but were still noticable. Now that Factor is doing native font rendering on Mac OS X with Core Text, and soon Pango on X11 and Windows, I decided to address these issues properly instead of hacking around them. Doing so involved implementing some new layout algorithms and abstractions in the Factor UI.

This blog post is a continuation of my previous entry on texture caching; now that the technical details of rendering are done, its time to make the result look good.

I tried to make reasonable diagrams to illustrate some of the concepts here, but I didn't try too hard, and this is by no means an authoritative account of the topic. It is just a brain dump of my incomplete understanding of the subject.

Identifying the problem

Suppose you want to position some gadgets next to each other, but they use different fonts. If you use a shelf parent with default settings, you get something like this, which looks terrible, because the text does not line up:

In this particular case you can change the alignment of the shelf so that gadgets are aligned relative to the bottom edge instead of the top, by setting the align slot to 1, instead of the default of 0:

This happens to fix the above case, but now this will look stupid if one of the fonts has a bigger point size than the other:

Here is another example. We have two gadgets side by side, where the second one has a border around it. Notice that the two pieces of text don't look right next to each other; the second one looks a few pixels "off".

Clearly, a simplistic alignment strategy, where a single constant determines where each child is positioned in between the top and bottom edge, is insufficient for text. Instead, we have to look at the fonts being used and different measurements used by those fonts to make the above examples look right.

Font metrics

When text is rendered, the bottom of each glyph is aligned along a baseline. This is like the lines on note paper, but of course it's invisible. Some letters, like lowercase "j" in most fonts, will descend below the baseline.

In addition to the font's point size, and its height, there are a number of additional metrics which are important. They are,
  • The ascent, which is the distance from the baseline to the top edge of the tallest glyph.
  • The descent, which is the distance from the baseline to the bottom edge of the tallest glyph. Sometimes, the descent is taken to be a negative value, but I don't use this convention in Factor's code since Core Text doesn't either.
  • The x height, which is the height of a lower-case x.
  • The cap height, which is the cap height of an upper case Y.
  • The font height is just the sum of the ascent and descent.
  • The leading, which is the gap between lines of text.
  • The line height is ascent + descent + leading.
  • The em, which is the width of a lower-case m.
  • The en, which is half an em.

Here is a diagram showing some of the above:

To understand why the previous examples look bad, let's make the baselines and borders visible:

Notice that the baselines of the two gadgets don't line up. If they lined up, the result would look nice.

Baseline alignment

I added support for baseline alignment to shelf gadgets; you set the align slot to the special symbol +baseline+. Setting it to a number between 0 and 1 is still supported, and setting it to +baseline+ activates totally different logic in the implementation.

Two things change when a shelf has baseline alignment enabled; the preferred size calculation, the layout algorithm itself.

If a shelf does not have baseline alignment, then its preferred height is just the maximum of the preferred heights of all of its children. This no longer works with baseline alignment. For example, consider the case where two gadgets with different baselines are nested inside a shelf:

Clearly, the preferred height of the shelf exceeds the maximum of the heights of the two gadgets. Here is a pseudo-code algorithm for computing the height:
For every gadget g in the shelf,
set ascent[g] = b.baseline
set descent[g] = g.height - g.baseline

max_ascent = maximum(ascent)
max_descent = maximum(descent)

height = max_ascent + max_descent

The algorithm for layout is similar:
For every gadget g in the shelf,
set ascent[g] = b.baseline
set descent[g] = g.height - g.baseline

max_ascent = maximum(ascent)
max_descent = maximum(descent)

For every gadget g in the shelf,
set g.y = max_ascent - g.baseline

Note that if every gadget in the shelf has a baseline equal to its height, the descent will always be zero and the above algorithms degenerate to the standard shelf layout algorithm with align = 1.

Graphics baseline alignment

For lines containing a mix of text and graphics, where the graphics are icons with similar height to the text, I came up with a nice trick for positioning them in a pleasing manner. Trying to use the standard alignment for icons and text doesn't work; here are three different layouts, with alignment of 0, 1/2, and 1, respectively:

Note that I increased the font size to make the effect more noticable, but even at smaller font sizes that are closer to the icon size, the results look less than ideal. Using baseline alignment where the baseline of the image is equal to its height doesn't produce the right results either.

The trick with images, then, is to define a "graphics baseline" that runs horizontally at the y co-ordinate equal to half of the image's height. For text, the graphics baseline runs half-way between the cap height and baseline. Here is what it looks like:

Generalizing the algorithm in the previous section to support a graphics baseline is easy.

First, we allow the baseline and cap-height of a gadget to be set to some unspecified value, distinct from any number. This indicates that graphics baseline alignment should be used for this gadget.

Then, we compute layout for the text children first, followed by the graphics children, and position the graphics children so that their graphics baseline lines up with half of the cap height from the text children.

I can't be bothered writing out pseudocode; you can look at the Factor implementation after its checked in if you're interested.

Font metrics implementation

The fonts vocabulary now defines a metrics tuple:
TUPLE: metrics width ascent descent height leading cap-height x-height ;

Various words in the UI output metrics tuples:
  • font-metrics ( font -- metrics )
  • line-metrics ( font string -- metrics )

The Core Text implementation of these is completely trivial, since the required information can be obtained by a series of API calls, and I expect the eventual Pango implementation to be similar.

Baseline alignment implementation

The ui.baseline-alignment vocabulary contains all the code. To start with, it needs to be able to get the baseline and cap height for a gadget:
GENERIC: baseline ( gadget -- y )

M: gadget baseline drop f ;

GENERIC: cap-height ( gadget -- y )

M: gadget cap-height drop f ;

The default implementations return f for both, which means the graphics baseline will be used.

I implemented a generalized version of this algorithm as the measure-metrics word in the ui.baseline-alignment vocabulary. It outputs the new ascent and descent, and adding them together gives the height of the shelf.

Labels implement these generic words in the obvious way, by looking up font metrics. Borders, packs, paragraphs and other compound gadgets implement these operations by computing the baseline and cap height of their children and combining them in various ways.

The UI tries to use baseline alignment by default in as many places as it makes sense, so in most cases you do not need to worry about the details of this.

Here is some help text rendering demonstrating the text baseline alignment in the values table and the description paragraph:

The astute observer will notice the table lines are missing a pixel in the bottom-right corner; this is a recent regression that I need to look into.

Here is an example demonstrating baseline alignment of a text label with a text field next to it, as well as graphics baseline alignment, with checkboxes and radio buttons:

More about font metrics

I haven't talked about line spacing yet, because this part is tricky. There's more to it than just adding the leading between each line. I'll figure it out in the near future and do a write-up most likely.

Sunday, February 15, 2009

Implementing OpenGL texture caching and bitmap image support in the Factor UI

Last time, I talked about how I implemented Unicode text rendering using Apple's Core Text API. In this blog entry I'll discuss some implementation details I omitted last time, and I'll also talk about a new feature I added a few days ago to the new_ui branch: support for bitmap images.

Note that Core Text is OS X-specific, and Factor will only use it on that platform; on other platforms, it will use Pango to render text (and maybe Uniscribe on Windows, leaving Pango for X11 only). The image support is cross-platform and indeed is entirely written in Factor.

A general cache abstraction

With Core Text, you never operate on strings of text directly, instead you construct a CTLine object and interrogate it for metrics, or ask it to render itself to a Core Graphics context. To avoid constructing CTLines over and over again, and to expose a straightforward API that only involves strings, the UI caches the CTLine objects. Using MEMO: or the cache combinator with a hashtable is insufficient here, since I don't want to retain every CTLine forever. Instead, I want each CTLine to be disposed and removed from the cache if it is not used for a fixed period of time.

To handle this use case, along with another one that I describe later on in this post, I implemented a simple cache abstraction.

The new cache vocabulary defines a new assoc type which wraps an existing assoc, and has a single configuration parameter, max-age:
TUPLE: cache-assoc assoc max-age disposed ;

: <cache-assoc> ( -- cache )
H{ } clone 10 f cache-assoc boa ;

INSTANCE: cache-assoc assoc

The values in the underlying assoc are all instances of cache-value:
TUPLE: cache-entry value age ;

: <cache-entry> ( value -- entry ) 0 cache-entry boa ; inline

The age of a value starts at zero, and is incremented at fixed intervals. Whenever a key is looked up in the cache, the age is reset to zero. Because I don't want to expose the user of a cache to these cache-entry tuples (they're just implementation detail), the cache-assoc type implements at* to unwrap the cache entry before returning the value to the user. Also note that I reset the age here:
M: cache-assoc at* assoc>> at* [ dup [ 0 >>age value>> ] when ] dip ;

Storing an entry into the cache first checks to see if the cache has been disposed of yet or not, and then wraps the value in a cache entry before passing it down to the underlying assoc:
M: cache-assoc set-at
[ check-disposed ] keep
[ <cache-entry> ] 2dip
assoc>> set-at ;

Converting a cache to an alist unwraps each value:
M: cache-assoc >alist assoc>> [ value>> ] { } assoc-map-as ;

The remaining important assoc methods simply delegate to the underlying assoc:
M: cache-assoc assoc-size assoc>> assoc-size ;

M: cache-assoc clear-assoc assoc>> clear-assoc ;

Caches also support one additional operation: disposal. When you call dispose on a cache, all values in the cache are disposed and the cache cannot be used again:
M: cache-entry dispose value>> dispose ;

M: cache-assoc dispose*
[ values dispose-each ] [ clear-assoc ] bi ;

This is the important part, that makes caches useful for managing external resources such as OpenGL textures.

Now, here is a word which ages each entry, and deletes those entries whose age exceeds the cache's max-age slot value. It uses the assoc-partition combinator, which splits an assoc into two, sorting key/value pairs based on whether they satisfy a predicate or not.
: purge-cache ( cache -- )
dup max-age>> '[
[ nip [ 1+ ] change-age age>> _ >= ] assoc-partition
[ values dispose-each ] dip
] change-assoc drop ;

I cheat a little here; because assoc-partition iterates over all key/value pairs exactly once, I can perform side effects in the predicate quotation; I increment the age and then immediately check if it exceeds the maximum. Note the use of '[ syntax with the _ directive, which places the max-age exactly where its needed without having to pass it around on the stack explicitly. The values which exceed the maximum age -- these make up the first return value of assoc-partition -- are all disposed of.

This cache vocabulary, which is currently in the new_ui branch and will be merged into the trunk soon, demonstrates the Factor equivalent of the "decorator" OO design pattern.


Doug Coleman has been working on a Factor library for working with bitmap images for quite some time; you can find it in the repository, in the images vocabulary. Recently it received a major facelift, as well as support for TIFF images (including LZW compression) in addition to Windows BMP. All of this is written entirely in Factor, and Doug also plans on supporting PNG, GIF and JPEG images, also with pure Factor code. When complete, this library will be quite a showcase for implementing complex algorithms in a concatenative language.

Also, Joe Groff designed some nice icons for the Factor UI. To make use of both of these great contributions, I wrote some code which caches images in OpenGL textures and renders these textures.

The load-image word in the images.loader vocabulary defines a data type for images:
TUPLE: image dim component-order bitmap ;

The component-order slot is one of a series of singletons,
R16G16B16 R32G32B32 R16G16B16A16 R32G32B32A32 ;

the dim slot is a pair of integers (the width and the height) and the bitmap slot is a byte array with the image data, where each pixel's size and format is determined by component-order.

OpenGL textures

In my last blog post, I showed some ad-hoc code for rendering a Core Graphics bitmap to a texture. I refactored this code to be independent of Core Graphics and Core Text, and put it in the opengl.textures vocabulary. There is a new texture data type:
TUPLE: texture texture display-list disposed ;

To create a texture from an image, I have to pass a format and a type to OpenGL. A generic word maps component order singletons into OpenGL constants; the implementation is incomplete, but it suffices for now:
GENERIC: component-order>format ( component-order -- format type )

M: RGBA component-order>format drop GL_RGBA GL_UNSIGNED_BYTE ;
M: ARGB component-order>format drop GL_BGRA_EXT GL_UNSIGNED_INT_8_8_8_8_REV ;
M: BGRA component-order>format drop GL_BGRA_EXT GL_UNSIGNED_INT_8_8_8_8 ;

Using this word, I implemented a <texture> constructor word, which creates an OpenGL texture from a bitmap image, and wraps the relevant data into a tuple:
: <texture> ( image -- texture )
[ dim>> ]
[ bitmap>> ]
[ component-order>> component-order>format ]
tri make-texture
] [ dim>> ] bi
over make-texture-display-list f texture boa ;

The code that wraps the glTexImage2D call, as well as creating a display list that renders a textured quad, is very similar to what I showed in my previous post, just slightly more general, so I won't reproduce it here.

To be useful cache keys, textures must be disposable:
M: texture dispose*
[ texture>> delete-texture ]
[ display-list>> delete-dlist ] bi ;

High-level abstraction for images

The ui.images vocabulary builds on top of the lower-level libraries: images, images.loader, opengl.textures and to provide an easy-to-use, simple, high-level interface for displaying icons in the UI.

The relevant data type is an image path, which is just a string wrapped in a tuple,
TUPLE: image-name path ;

C: <image-name> image-name

A fundamental word takes an image path, and loads the image that it names, if it has not been loaded already:
MEMO: cached-image ( image-name -- image ) path>> load-image ;

Note that I'm not using a cache-assoc to cache the bitmaps themselves. This is because bitmaps are objects in the Factor heap, not external resources, and so they don't have a dispose method, hence a simple MEMO: word suffices.

The next step is implementing the image texture cache. I added a new images slot to world gadgets. A world is the top-level gadget inside a native OS window, and since each world has its own OpenGL context, it is natural to associate the image texture cache with a world.

: image-texture-cache ( world -- texture-cache )
[ [ <cache-assoc> ] unless* ] change-images images>> ;


The image-texture-cache word assumes that there is a dynamically-scoped variable named world holding a world gadget. This is the case inside the dynamic extent of the draw-gadget word, and so textures can only be cached and looked up while a gadget is being rendered; a reasonable restriction.

The rendered-image word looks up a bitmap image texture in the cache, and adds it if its not already present:
: rendered-image ( path -- texture )
world get image-texture-cache [ cached-image <texture> ] cache ;

Note the two levels of caching here. First, it looks for an available texture associated with an image path. If there is no texture, it looks for a cached bitmap image and makes a texture from that. If there is no cached bitmap image, then cached-image loads it by calling load-image inside images.loader.

Now, drawing an image named by an image path is easy; this word draws the image at the origin, and code which wants to draw it at another position can simply translate the model view matrix first:
: draw-image ( image-name -- )
rendered-image display-list>> glCallList ;

Getting cached image dimensions is easy too, and does not involve the texture cache, only the bitmap cache:
: image-dim ( image-name -- dim )
cached-image dim>> ;

Caching rendered Core Text lines

In the last post, I presented the with-bitmap-context combinator in the core-graphics vocabulary which created a Core Graphics bitmap context, rendered to it by calling a quotation, and output a byte array when finished. I refactored this combinator and renamed it to make-bitmap-image. Instead of outputting a raw byte array, it creates an image tuple, which wraps the byte array together with dimensions and a component order. This means that anyone who doesn't care about portability and wishes to use the core-graphics vocabulary directly can do so very easily, and render graphics to an image object which works with a number of other Factor libraries.

Indeed, by changing the core-text code to call make-bitmap-image, I was able to very easily hook up texture caching for rendered lines of text.
: rendered-line ( font string -- texture )
world get text-handle>> [ cached-line image>> <texture> ] 2cache ;

M: core-text-renderer draw-string ( font string -- )
rendered-line display-list>> glCallList ;

Again, the user benefits because they don't have to concern themselves with "lines" (strings with layout) or GL textures; they just draw strings whenever and everything works out behind the scenes.

On code re-use

I'm a big fan of avoiding redundant data types in the Factor library. Over the years, parts of the Factor UI which were UI-specific have been split off, cleaned up and generalized. We now have some very nice vocabularies, such as colors, fonts and images which do not depend on OpenGL or the UI, and are hopefully general enough that no Factor programmer will have to re-invent the concepts of fonts, colors and bitmap images again.

Also, as much as possible of the Factor UI's rendering code is now abstracted off into sub-vocabularies of the opengl vocabulary; these define many utility words and types, and for instance something like opengl.textures can be used independently of the Factor UI, if you're doing OpenGL rendering with some other toolkit.

The introduction of the cache-assoc abstraction and generalized texture caching and bitmaps has simplified the Core Text text rendering backend considerably. It is now only a couple of hundred lines of code. This will make implementing a Pango backend easier.

Icons for definitions

To spruce up Factor's vocabulary browser and code completion, I cooked up a little vocabulary which, given a word or vocabulary, outputs an appropriate icon. For words, there are many types of icons corresponding to different types of words, and for vocabularies the icons distinguish loaded, unloaded and runnable vocabs.

The definitions.icons vocabulary defines a generic word:
GENERIC: definition-icon ( definition -- path )

Since all the icons are in a single directory, and in TIFF format, I define a utility word which takes an icon file name without the extension and outputs a full pathname:
: definition-icon-path ( string -- string' )
"resource:basis/definitions/icons/" prepend-path ".tiff" append ;

Now, I'd want to define a bunch of methods,
M: predicate-class definition-icon drop "class-predicate-word" definition-icon-path ;
M: generic definition-icon drop "generic-word" definition-icon-path ;
M: macro definition-icon drop "macro-word" definition-icon-path ;

However this is too verbose. The only really important part of each method line is the class name, and the icon name, without quotes. Everything else is boilerplate. Well, this is Factor, and we can factor it out. There are many different ways to do this. You can write a parsing word, or you can use my "functor" hack, which is just a syntax sugar for a particularly simple class of parsing words. Here is a solution using parsing words:

scan-word \ definition-icon create-method
scan '[ drop _ definition-icon-path ]
define ; parsing


And here is a solution using functors:

FUNCTOR: define-icon ( class icon -- ) WHERE
M: class definition-icon drop icon definition-icon-path ;

: ICON: scan-word scan define-icon ; parsing


The functor actually expands into code that is very similar to the parsing word. Both are straightforward. The main difference is that the parsing word has to explicitly call the run-time equivalents of M:; create-method and define.

Note that I wrap the parsing word in a compilation unit << ... >> so that it is compiled before the other words in the file. This allows the parsing word to be used in the same file where it is defined; otherwise usages of ICON: would attempt to call a parsing word that wasn't compiled yet, which throws an error.

Now new icons are very easy to define,
ICON: predicate-class class-predicate-word
ICON: generic generic-word
ICON: macro macro-word
ICON: parsing-word parsing-word
ICON: primitive primitive-word
ICON: symbol symbol-word
ICON: constant constant-word
ICON: word normal-word
ICON: vocab-link unopen-vocab

For vocabularies, the situation is simpler,
M: vocab definition-icon
vocab-main "runnable-vocab" "open-vocab" ? definition-icon-path ;