I'm proud to announce a major new version of
extra/webapps/pastebin
. I've deployed
the new pastebin on an experimental HTTP server. I hope it doesn't crash by the time you read this. The major new feature is syntax highlighting support.
For the last two weeks or so I've been working on a Factor port of
jEdit's syntax highlighting engine. For those who have never used jEdit, syntax highlighting rules are specified in XML files. Rules either involve literal or regular expression matching. Factor already had an XML parser which works pretty well, but regexps were missing.
I've been working with Doug to implement a new regexp library; he will blog about it in the next few days. It suffices to say that it is coming along very well, and the code is very concise and readable, as is the norm with Factor. We're using Chris Double's parser-combinators library to parse the regexp itself, then construct parser combinators from the regexp.
There is quite a bit of history behind the jEdit syntax highlighting engine. jEdit 1.2, released in late 1998, was the first release to support syntax highlighting. It featured a small number of hand-coded "token markers" -- simple incremental parers -- all based on the original JavaTokenMarker contributed by
Tal Davidson.
Around the time of jEdit 1.5 in 1999,
Mike Dillon began developing a jEdit plugin named "XMode". This plugin implemented a generic, rule-driven token marker which read mode descriptions from XML files. XMode eventually matured to the point where it could replace the formerly hand-coded token markers.
With the release of jEdit 2.4, I merged XMode into the core and eliminated the old hand-coded token markers.
As you can guess from the age of the code, many design decisions date back to Java 1.1, a long-forgotten time when Java VMs with JIT compilers were relatively uncommon, object allocation was expensive, and heap space tight. As a result the parser is basically procedural spaghetti code, with lots of mutable state, the sort that gives Haskell programmers nightmares. So while the code is ugly and the parser design is archaic, the huge advantage is that there is a large suite of contributed modes ready to go.
I expected it would be a pain to port to Factor, but this hasn't been the case. Even though I kept the original design, warts and all, the resulting Factor code is pretty decent. I intend to refactor it to use a more modern design and take advantage of Factor's abstraction features instead of just representing parser state with an bunch of integer and boolean-valued variables. I also need to optimize it more.
Because XMode is an incremental parser, it would be easy to cook up an editor gadget with support for syntax highlighting in the UI. I'll probably do that at some point. Then it will be easy for a contributor to build a text editor in Factor, and we will have come full circle. :-)
Being able to implement a syntax highlighting pastebin in Factor -- one that manages to reuse jEdit mode files no less -- is a milestone. It means we can do XML, regular expressions, web applications, and not to mention the various minor bits and pieces which support these components, such as date/time support, various parsers, and so on. We have different people working on different libraries, and everyone is able to understand and reuse other people's work. It only took a handful of contributors to build all this infrastructure.