Thursday, January 29, 2009

Rendering Unicode text to an OpenGL texture using Mac OS X CoreText

This post is really a follow-up to my previous post where I bragged about Factor supporting Unicode font rendering on Mac OS X. After writing that post I realized that there is not much information online about the particular use-case I implemented: rendering text to an offscreen bitmap context, and then making it into an OpenGL texture. So I decided to write up a blog post collecting the gory details in one place. This information should be useful to programmers who use any language on Mac OS X, but all the code examples are in Factor.

The code examples below are not self-contained; they use the FFI bindings I defined in the core-graphics and core-text vocabularies in the new_ui branch of the Factor Git repository. Factor's FFI is very easy to use; for example, here are the bindings for two Core Graphics functions:
FUNCTION: CGContextRef CGBitmapContextCreate (
void* data,
size_t width,
size_t height,
size_t bitsPerComponent,
size_t bytesPerRow,
CGColorSpaceRef colorspace,
CGBitmapInfo bitmapInfo
) ;

FUNCTION: void CGContextSetRGBFillColor (
CGContextRef c,
CGFloat red,
CGFloat green,
CGFloat blue,
CGFloat alpha
) ;

Once the FFI bindings are in place, a lot of higher-level wrapper code has to be written, because these are low-level, C APIs. The first step is creating a bitmap context; after that, I'll discuss creating the various Core Text data types and rendering strings to a bitmap context. Finally, I'll discuss creating an OpenGL texture from a bitmap context.

Creating a bitmap context


The first step is to create a CoreGraphics bitmap context using CGBitmapContextCreate. When creating a bitmap context, you supply a byte array to render to. There are two main caveats here.

First, the memory backing the context should not move while the context is active. This means that in GC languages with copying collectors, such as Factor, you have to pin your byte array (Factor doesn't support this) or allocate it in unmanaged memory (using malloc or similar). This is of course obvious (if the GC moves the object, it has no way of notifying Core Graphics that it moved), but it's something to watch out for when using an FFI from a high-level language.

The second caveat, which took me a long time to figure out, is that you have to use a specific bitmap format if you want sub-pixel font smoothing to be enabled on LCD screens. The bitmapInfo parameter to CGBitmapContextCreate specifies the format of the bitmap in memory. The code path for sub-pixel font smoothing is only used if the following two flags are set in the bitmap info:
  • kCGImageAlphaPremultipliedFirst - put the alpha component first in each quadruple
  • kCGBitmapByteOrder32Host - use native endian to pack the ARGB quads, instead of the default which seems to be big-endian.

If these two values are not passed, then sub-pixel smoothing will not be enabled even if the user is using an LCD monitor and has requested that it be performed.
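To make the memory layout concrete, here is a small language-neutral sketch in Python (not Factor, and not calling Core Graphics). It packs one premultiplied ARGB pixel into a 32-bit word and shows the resulting bytes for both byte orders. The helper names and the rounding used for premultiplication are assumptions made for illustration; Core Graphics may round differently.

```python
import struct

def premultiply(r, g, b, a):
    """Scale 8-bit color components by an 8-bit alpha (rounding is an assumption)."""
    return tuple((c * a + 127) // 255 for c in (r, g, b))

def pack_argb(r, g, b, a, byte_order):
    """Pack a premultiplied ARGB pixel as one 32-bit word.

    byte_order is a struct prefix: "<" for little-endian hosts (Intel),
    ">" for big-endian hosts (PowerPC).
    """
    pr, pg, pb = premultiply(r, g, b, a)
    word = (a << 24) | (pr << 16) | (pg << 8) | pb  # alpha first, then R, G, B
    return struct.pack(byte_order + "I", word)

# An opaque orange pixel on each kind of host.
little = pack_argb(255, 128, 0, 255, "<")  # bytes in memory: B, G, R, A
big = pack_argb(255, 128, 0, 255, ">")     # bytes in memory: A, R, G, B
```

On a little-endian host the bytes land in memory in B, G, R, A order, which is why the texture upload later in this post needs a BGRA pixel format.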

Below is the Factor code for creating a bitmap context, with some comments.

First, we make a word which constructs the flags mentioned above:
<PRIVATE

: bitmap-flags ( -- flags )
{ kCGImageAlphaPremultipliedFirst kCGBitmapByteOrder32Host } flags ;

Now, a word to compute the size of the bitmap in memory. This is used twice, when malloc'ing the bitmap, and after rendering, when copying it back to managed memory:
: bitmap-size ( dim -- n )
product "uint" heap-size * ;
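The arithmetic here is simple but easy to get wrong when calling CGBitmapContextCreate by hand, so here is a sketch of it in Python. It assumes 8 bits per component, four components per pixel, and tightly packed rows with no padding, which is what the Factor code in this post uses; the function names are mine.

```python
BITS_PER_COMPONENT = 8
COMPONENTS_PER_PIXEL = 4  # alpha, red, green, blue

def bytes_per_row(width):
    """Bytes in one row of pixels, assuming rows are not padded."""
    return width * COMPONENTS_PER_PIXEL * BITS_PER_COMPONENT // 8

def bitmap_size(dim):
    """Total bytes needed for a bitmap of the given (width, height)."""
    width, height = dim
    return bytes_per_row(width) * height
```

For a 100x20 bitmap this gives 400 bytes per row and 8000 bytes in total, the same as multiplying the pixel count by the size of a 32-bit unsigned integer.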

Now, a word to malloc the bitmap. We add it to the innermost destructor scope using &free; this ensures it's deallocated when we're done with it, either after successfully rendering our text, or if an error is thrown:
: malloc-bitmap-data ( dim -- alien )
bitmap-size malloc &free ;

We also have to pass a color space; the default RGB space is sufficient for my purposes, and once again we define a destructor:
: bitmap-color-space ( -- color-space )
CGColorSpaceCreateDeviceRGB &CGColorSpaceRelease ;

Now, a word which uses the above to create a context backed by unmanaged memory:
: <CGBitmapContext> ( dim -- context )
[ malloc-bitmap-data ] [ first2 8 ] [ first 4 * ] tri
bitmap-color-space bitmap-flags CGBitmapContextCreate
[ "CGBitmapContextCreate failed" throw ] unless* ;

Once we're done rendering, we have to get the bitmap data from the context and copy it to managed memory, so that we can return a byte array object to the user. This is done with the word below:
: bitmap-data ( bitmap dim -- data )
[ CGBitmapContextGetData ] [ bitmap-size ] bi*
memory>byte-array ;

PRIVATE>

This ends the implementation detail. The last word here is a high-level word that is called by other code. This word completely hides the resource allocation and deallocation by establishing a destructor scope, creating a bitmap context, calling a quotation with the bitmap context, then copying the bitmap data back to managed memory and returning a byte array. All the external resources allocated during this operation -- the memory buffer, the color space, the context -- are deallocated for us as soon as with-destructors returns:
: with-bitmap-context ( dim quot -- data )
[
[ [ <CGBitmapContext> &CGContextRelease ] keep ] dip
[ nip call ] [ drop bitmap-data ] 3bi
] with-destructors ; inline

Now, higher-level code can simply wrap some Core Graphics rendering calls in a with-bitmap-context, and receive a Factor byte array as the result.
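For readers who don't know Factor, the shape of with-bitmap-context can be sketched in Python with contextlib.ExitStack playing the role of the destructor scope. This is only an analogy: the buffer, color space and context below are placeholders rather than real Core Graphics objects, and the log list exists purely so the cleanup order can be observed.

```python
from contextlib import ExitStack

def with_bitmap_context(dim, draw, log):
    """Allocate resources, call draw(context), copy the pixels out, clean up."""
    width, height = dim
    with ExitStack() as scope:
        buffer = bytearray(width * height * 4)       # stands in for malloc
        scope.callback(log.append, "free buffer")
        color_space = "device-rgb"                   # placeholder object
        scope.callback(log.append, "release color space")
        context = {"data": buffer, "dim": dim, "color_space": color_space}
        scope.callback(log.append, "release context")
        draw(context)
        # The copy is made before the scope exits, so it outlives the buffer.
        return bytes(buffer)
```

The cleanups run in reverse order of registration, and they run whether draw returns normally or raises.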

Creating a CTFont


Core Text defines the CTFont opaque type to represent a specific instance of a font with a style and size applied. I create CTFont instances with CTFontCreateWithName, and then apply bold and italic styles by calling CTFontCreateCopyWithSymbolicTraits. There is another API to do this in one shot, CTFontCreateWithFontDescriptor, but for some reason it ignores any set symbolic traits. Perhaps it's a bug, or operator error, but it took me a lot of futzing around to get Core Text to respect my choice of font style. Another caveat to watch out for is that CTFontCreateCopyWithSymbolicTraits returns null if the traits could not be applied; for example, Apple's "Monaco" font does not have a bold variant, so if the user requests bold Monaco, you just have to return the original Monaco without the traits. There might be a way to synthetically increase the font weight, but I haven't discovered it yet.
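The fallback logic is worth spelling out, since forgetting the null check hands a null font to later calls. Below is a minimal Python sketch of it. The copy_with_traits argument stands in for CTFontCreateCopyWithSymbolicTraits, and fake_copy_with_traits with its table of available styles is invented for the example (only the Monaco entry reflects what is said above).

```python
def styled_font(base_font, traits, copy_with_traits):
    """Return base_font with traits applied, or base_font itself if that fails."""
    styled = copy_with_traits(base_font, traits)
    return base_font if styled is None else styled

def fake_copy_with_traits(font, traits):
    """Hypothetical stand-in: returns None when a requested trait is missing."""
    available = {"Helvetica": {"bold", "italic"}, "Monaco": set()}
    if traits <= available[font]:
        return (font, frozenset(traits))
    return None

bold_monaco = styled_font("Monaco", {"bold"}, fake_copy_with_traits)
bold_helvetica = styled_font("Helvetica", {"bold"}, fake_copy_with_traits)
```

Here bold_monaco is just the original "Monaco" value, while bold_helvetica is the styled copy.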

Creating a CTLine from a string and a font


One of the central abstractions in Core Text is a CTLine. You create a CTLine from a CFAttributedString with CTLineCreateWithAttributedString, and render it to a CGContext with CTLineDraw.

Creating the CFAttributedString is an adventure in itself. For now, I'm only interested in rendering a string with a single font style and color, so I can use the CFAttributedStringCreate function to create the attributed string. It takes a CFString and a CFDictionary. The dictionary can contain a number of keys; the two I'm using are kCTFontAttributeName and kCTForegroundColorAttributeName. These are global variables holding CFString references. Factor's FFI doesn't have a way to access global variables right now, so I made a quick hack specific to the core-text vocabulary. It lets you access global vars that hold void* values. Soon I'll generalize it and put it in alien.syntax, but for now it can live as a private utility in core-text:
: C-GLOBAL:
CREATE-WORD
dup name>> '[ _ f dlsym *void* ]
(( -- value )) define-declared ; parsing

Using this utility is easy:
C-GLOBAL: kCTFontAttributeName
C-GLOBAL: kCTForegroundColorAttributeName
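The important detail in C-GLOBAL: is the extra dereference: dlsym returns the address of the global variable, and the pointer you actually want is the value stored at that address. The Python sketch below simulates this with ctypes, without loading any real library, so every name in it is a stand-in. With a real shared library, ctypes.c_void_p.in_dll(library, name) performs both steps for you.

```python
import ctypes

# A C string standing in for the object the global variable points to.
target = ctypes.create_string_buffer(b"example")

# The "global variable": a pointer-sized slot holding the address of target.
global_slot = ctypes.c_void_p(ctypes.addressof(target))

# What dlsym would hand back: the address of the variable, not its value.
symbol_address = ctypes.addressof(global_slot)

def read_pointer_global(address):
    """Read the void* stored at address (the second half of the C-GLOBAL: idea)."""
    return ctypes.c_void_p.from_address(address).value

value = read_pointer_global(symbol_address)
```

Using symbol_address directly as if it were the string reference is the classic mistake; value is the pointer that should be passed on.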

Because I found myself constructing lots and lots of CF types (strings, numbers, dictionaries, colors, and so on), I made a utility word, >cf, which converts a Factor object into a Core Foundation object. Using this word, I can define a <CFAttributedString> word that constructs a CFAttributedString from a Factor string and assoc:
: <CFAttributedString> ( string assoc -- alien )
[
[ >cf &CFRelease ] bi@
[ kCFAllocatorDefault ] 2dip CFAttributedStringCreate
] with-destructors ;

Again, notice the destructor usage: we want to release the string and dictionary after we return, since the attributed string retains them for us.

Now, we can create the CTLine, finally. Notice how we take a Factor string, a CTFont, and a color, and make the attributes as a Factor assoc; the <CFAttributedString> word takes care of converting it to Core Foundation types. And again, we use destructors for deterministic resource cleanup:
: <CTLine> ( string font color -- line )
[
[
kCTForegroundColorAttributeName set
kCTFontAttributeName set
] H{ } make-assoc <CFAttributedString> &CFRelease
CTLineCreateWithAttributedString
] with-destructors ;

Rendering a CTLine to a bitmap context


Now we can put some things together and write a word which renders a Unicode string to a Factor byte array.

First, we make a data type to hold the CTLine, the rendered bitmap, the typographic bounds, and some other things:
TUPLE: line font line bounds dim bitmap age refs disposed ;

The refs and age slots are used for caching lines so that the UI doesn't have to render them over and over again; I'll discuss this in another blog post, maybe.

Now, we have some code which computes the typographic bounds of a CTLine, and packages the values up into a Factor tuple; there's also a word to compute the size of the bitmap to create. This doesn't take leading into account; I haven't decided how to handle that yet:
TUPLE: typographic-bounds width ascent descent leading ;

: line-typographic-bounds ( line -- typographic-bounds )
0 <CGFloat> 0 <CGFloat> 0 <CGFloat>
[ CTLineGetTypographicBounds ] 3keep [ *CGFloat ] tri@
typographic-bounds boa ;

: bounds>dim ( bounds -- dim )
[ width>> ] [ [ ascent>> ] [ descent>> ] bi + ] bi
[ ceiling >fixnum ]
bi@ 2array ;
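The same computation, written out in Python for readers who find the stack shuffling hard to follow. The tuple and function names mirror the Factor ones, but this is a sketch and not part of the Factor code base. The width is rounded up, and the height is ascent plus descent, rounded up; leading is ignored, as noted above.

```python
import math
from collections import namedtuple

TypographicBounds = namedtuple(
    "TypographicBounds", ["width", "ascent", "descent", "leading"]
)

def bounds_to_dim(bounds):
    """Smallest whole-pixel (width, height) that fits the line."""
    width = math.ceil(bounds.width)
    height = math.ceil(bounds.ascent + bounds.descent)
    return (width, height)

dim = bounds_to_dim(TypographicBounds(101.3, 11.25, 3.5, 0.0))
```

With these sample numbers, dim is (102, 15).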

Now that we have the above utilities, as well as all the code defined earlier, we can write a <line> constructor word which constructs a line tuple. Part of the construction involves rendering the text to a bitmap, and storing the resulting byte array in the line's bitmap slot. We're going to use the with-bitmap-context word written above here. There are a few steps to rendering a CTLine to a bitmap context:
  • First, you must fill the context with a background color, using CGContextSetRGBFillColor and CGContextFillRect. Leaving it transparent will disable sub-pixel rendering, so the line constructor takes a background color as a parameter.
  • You must set the text position with CGContextSetTextPosition.
  • Finally, you can render the text with CTLineDraw.

This word manipulates a lot of intermediate values, makes a fair number of Core Graphics calls, and has complex data flow, so I chose to implement it using locals. Notice the [let* "serial-binding" form; successive definitions can reference previously-defined values:
:: <line> ( string font foreground background -- line )
[
[let* | font [ font CFRetain |CFRelease ]
line [ string font foreground <CTLine> |CFRelease ]
bounds [ line line-typographic-bounds ]
dim [ bounds bounds>dim ] |
dim [
{
[ background >rgba-components CGContextSetRGBFillColor ]
[ 0 0 dim first2 <CGRect> CGContextFillRect ]
[ 0 bounds descent>> CGContextSetTextPosition ]
[ line swap CTLineDraw ]
} cleave
] with-bitmap-context
[ font line bounds dim ] dip 0 0 f
]
line boa
] with-destructors ;

Notice how it uses "on-error" destructors, named |foo. This means that if the construction fails for whatever reason, the CTLine and CTFont objects are released, but if construction succeeds, they are retained. This is done so that we can hold on to these objects and store them in the new tuple's slots. The tuple itself has a dispose* method which releases the font and line objects:
M: line dispose* [ font>> CFRelease ] [ line>> CFRelease ] bi ;
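The difference between "always" and "on-error" destructors can be sketched in Python as well. ExitStack runs its callbacks whenever the scope exits, so to get on-error behavior the sketch cancels them with pop_all() once construction has succeeded. Handle and make_line are hypothetical names, and the created list is only there so the example can inspect the handles afterwards.

```python
from contextlib import ExitStack

class Handle:
    """Stand-in for a retained Core Foundation object."""
    def __init__(self, name):
        self.name = name
        self.released = False

    def release(self):
        self.released = True

def make_line(created, fail=False):
    """Build a line record; release its handles only if construction fails."""
    with ExitStack() as on_error:
        font = Handle("font")
        created.append(font)
        on_error.callback(font.release)
        line = Handle("line")
        created.append(line)
        on_error.callback(line.release)
        if fail:
            raise RuntimeError("rendering failed")
        on_error.pop_all()  # success: cancel the cleanups, the record owns them now
        return {"font": font, "line": line}
```

After a successful call the handles are still live and must be released later by whoever disposes of the record, which is the job of the dispose* method above.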

The Factor UI uses all the slots in the line object to perform operations such as measuring text, caching textures, and so on. For the purpose of this blog post, the only interesting slots are dim and bitmap. The bitmap slot holds a byte array with the rendered text, in the format we specified when creating the bitmap context: ARGB with native byte order.

Creating an OpenGL texture from a Core Graphics bitmap


Standard OpenGL does not support ARGB bitmaps; however, Apple's implementation supports the GL_EXT_bgra extension. To use this extension, I had to add a couple of constants to our opengl.gl vocabulary:
! GL_EXT_bgra: http://www.opengl.org/registry/specs/EXT/bgra.txt
CONSTANT: GL_BGR_EXT HEX: 80E0
CONSTANT: GL_BGRA_EXT HEX: 80E1

Now, we can pass GL_BGRA_EXT to glTexImage2D. Here is a utility word to ease creation of textures. Since glTexImage2D takes 9 parameters (!), I use locals so that I can name them:
:: make-texture ( dim pixmap format type -- id )
gen-texture [
GL_TEXTURE_BIT [
GL_TEXTURE_2D swap glBindTexture
GL_TEXTURE_2D
0
GL_RGBA
dim first2
0
format
type
pixmap
glTexImage2D
] do-attribs
] keep ;

Once the texture has been created, we can display the text on the screen by binding the texture and rendering a quad. It is also important to call glEnable with GL_TEXTURE_2D, otherwise the quad won't be textured. I forgot this the first time and spent some 20 minutes looking at my code -- OpenGL is very picky like that.

Keen readers will observe that I make no attempt to ensure the texture dimensions are powers of 2. This is because all Macs capable of running OS X 10.5 support the GL_ARB_texture_non_power_of_two extension, so at least in this case, you don't have to worry about padding the bitmap out anymore.
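For comparison, this is roughly the padding computation that hardware without that extension would need. It is a sketch of the general technique in Python, not something the Factor UI does, and the function names are made up.

```python
def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n (for n >= 1)."""
    power = 1
    while power < n:
        power *= 2
    return power

def padded_dim(dim):
    """Round each texture dimension up to a power of two."""
    return tuple(next_power_of_two(n) for n in dim)

padded = padded_dim((102, 15))
```

A 102x15 line of text would need a 128x16 texture, and the texture coordinates would have to be scaled so that only the used part of it is drawn.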

I wrap all of this in a display list so that a previously-rendered piece of text can be displayed anywhere by performing a translation and calling the display list. Here is the code that does this:
: make-line-display-list ( rendered-line texture -- dlist )
GL_COMPILE [
GL_TEXTURE_2D [
GL_TEXTURE_BIT [
GL_TEXTURE_COORD_ARRAY [
white gl-color
GL_TEXTURE_2D swap glBindTexture
init-texture rect-texture-coords
dim>> fill-rect-vertices (gl-fill-rect)
GL_TEXTURE_2D 0 glBindTexture
] do-enabled-client-state
] do-attribs
] do-enabled
] make-dlist ;

All the rest


I won't discuss the details of the remaining parts of the font rendering implementation here, but briefly, they are:
  • A texture cache where frequently-used textures are held to avoid rendering the same text over and over again; text expires from the cache if it's not used for five consecutive rendering runs
  • Code for measuring text, converting logical positions to screen co-ordinates, and vice versa
  • The high-level platform-independent text API in Factor's UI, and the mapping between this API and the abstractions I described here
  • The FreeType implementation of the high-level layer, for use on other platforms
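To give a flavor of the first item, here is one way such an expiry policy could look, sketched in Python. This is an assumption about the general approach and not the actual Factor implementation; the class and method names are hypothetical, and a real cache would also dispose of the texture when evicting an entry.

```python
MAX_AGE = 5  # unused rendering runs before an entry expires

class TextureCache:
    def __init__(self):
        self.entries = {}  # key -> [value, age, used_this_run]

    def lookup(self, key, render):
        """Return the cached value for key, rendering it on a miss."""
        entry = self.entries.get(key)
        if entry is None:
            entry = [render(key), 0, True]
            self.entries[key] = entry
        entry[2] = True
        return entry[0]

    def end_of_run(self):
        """Call once per rendering run: age unused entries and evict old ones."""
        for key in list(self.entries):
            entry = self.entries[key]
            if entry[2]:
                entry[1] = 0       # used during this run: reset its age
                entry[2] = False
            else:
                entry[1] += 1
                if entry[1] >= MAX_AGE:
                    del self.entries[key]
```

Any entry touched during a run starts over at age zero, so text that is drawn every frame is rendered only once.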

If you want to dig deeper, feel free to peruse the core-text, core-graphics, ui.text and ui.text.core-text vocabularies in the new_ui branch in the GIT repository.

Appendix: alien destructors


For background about Factor's destructors, see the following:

With destructors, you can wrap some code in a with-destructors scope, and call &dispose on external resources; when the with-destructors form returns, dispose is called on all the resources marked for disposal with &dispose. This is very handy for working with such things as Factor streams, database connections, and so on, but it's a problem for bare C references: they're all instances of alien, and there's no meaningful dispose method that works on all of them. Our idiom until recently has been to wrap the alien in a one-slot tuple which defines a custom dispose method, but this is just boilerplate.

Factor supports meta-programming, and this could be taken care of with parsing words and macros; but the functors abstraction I implemented a while ago, which basically gives you a very easy way to implement a certain type of parsing word, works really well here, and it becomes completely trivial to abstract out the creation of the one-slot class with a custom dispose method. The alien.destructors vocabulary defines a single parsing word, DESTRUCTOR:. Here is an example of its use from the core-foundation vocabulary; first we define an FFI binding to the Core Foundation CFRelease function, then we define it to be a destructor:
FUNCTION: void CFRelease ( CFTypeRef cf ) ;

DESTRUCTOR: CFRelease

The DESTRUCTOR: word defines two new words for us; &CFRelease and |CFRelease. You can see examples of their usage previously in this post. Here is the implementation of DESTRUCTOR: -- I won't go into the details, but you can see it looks like a template of what you'd write out every time if you didn't have the abstraction:
FUNCTOR: define-destructor ( F -- )

F-destructor DEFINES ${F}-destructor
<F-destructor> DEFINES <${F}-destructor>
&F DEFINES &${F}
|F DEFINES |${F}

WHERE

TUPLE: F-destructor alien disposed ;

: <F-destructor> ( alien -- destructor ) f F-destructor boa ; inline

M: F-destructor dispose* alien>> F ;

: &F ( alien -- alien ) dup <F-destructor> &dispose drop ; inline

: |F ( alien -- alien ) dup <F-destructor> |dispose drop ; inline

;FUNCTOR

: DESTRUCTOR: scan-word define-destructor ; parsing

Functors are just a hack and everything they do can be achieved using parsing words, but the code with functors, in the cases that they can handle, is simpler and more declarative.
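In a language with first-class classes, the same "template" idea can be expressed as a function that builds the wrapper class from a release function. The Python sketch below is an analogy only: define_destructor and the CFRelease stand-in are illustrative names, and the stand-in just records its argument so the effect is visible.

```python
def define_destructor(release):
    """Build a one-slot wrapper class whose dispose() calls release once."""
    class Destructor:
        def __init__(self, alien):
            self.alien = alien
            self.disposed = False

        def dispose(self):
            if not self.disposed:
                self.disposed = True
                release(self.alien)

    Destructor.__name__ = release.__name__ + "_destructor"
    return Destructor

released = []

def CFRelease(ref):
    """Stand-in for the real C function; it only records what was released."""
    released.append(ref)

CFReleaseDestructor = define_destructor(CFRelease)
wrapper = CFReleaseDestructor(42)
wrapper.dispose()
wrapper.dispose()  # a second call is a no-op
```

The disposed flag plays the same role as the disposed slot in the Factor tuple: it guards against releasing the same reference twice.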
