Thursday, February 14, 2008

Invoking the gdb disassembler to disassemble words

Until now, I've had to go through a rather laborious process to look at the machine code generated by the Factor compiler. I'd ask Factor for the address of a word with word-xt, then attach a gdb instance to Factor, then run disassemble and look at the output. This wastes time, especially when I'm debugging major changes (like I am now). So I've cooked up a quick hack to avoid having to do this in the future.

The new code is in tools.disassembler. Note that for now, it only works on Unix. If I figure out how to get the current cygwin process ID from Factor (Factor doesn't link against cygwin.dll) then I can make it work with cygwin gdb too.

The source begins with the usual boilerplate:
USING: io.files io words alien kernel math.parser alien.syntax
io.launcher system assocs arrays sequences namespaces qualified
regexp ;
QUALIFIED: unix
IN: tools.disassembler

We qualify the unix vocab since it has a write word which clashes with io:write, and we want to call the latter.

We communicate with gdb using files:
: in-file "gdb-in.txt" resource-path ;

: out-file "gdb-out.txt" resource-path ;

We cannot use pipes because they can deadlock: gdb suspends the Factor process while it is attached, so if the pipe fills up, gdb blocks on its write while Factor, being suspended, can never read from the pipe to drain it.

We have a generic word which takes either a word or a pair of addresses, builds the gdb commands for disassembling that object in the current process, and writes them to the input file:
GENERIC: make-disassemble-cmd ( obj -- )

M: word make-disassemble-cmd
word-xt 2array make-disassemble-cmd ;

M: pair make-disassemble-cmd
in-file [
"attach " write
unix:getpid number>string print

"disassemble " write
[ number>string write bl ] each
] with-file-out ;
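To see what gdb is actually handed, one can call make-disassemble-cmd directly on a pair and print the resulting command file. This is only a sketch: the two addresses are made-up values for illustration, not real code addresses, and it assumes the words above are loaded from tools.disassembler.

```factor
USING: tools.disassembler io io.files sequences ;

! Hypothetical start and end addresses, chosen only for illustration.
{ 134520832 134520896 } make-disassemble-cmd

! Print the generated command file line by line.
in-file file-lines [ print ] each
```

The file should hold an attach line carrying Factor's own process ID, followed by a disassemble line with the two addresses written in decimal and separated by spaces.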

Then we write a word to invoke gdb:
: run-gdb ( -- lines )
[
+closed+ +stdin+ set
out-file +stdout+ set
[ "gdb" , "-x" , in-file , "-batch" , ] { } make +arguments+ set
] { } make-assoc run-process drop
out-file file-lines ;

We pass gdb the path name to the file we just saved, together with some switches.

Note that we close stdin so that if gdb attempts to read commands, it gets an EOF instead of hanging. We also redirect the output to our output file. Then we read the output file and return the results.

Next, a couple of words to clean up the output; we filter out everything that's not a line of disassembly (gdb loading messages, etc), and we convert tabs to spaces since the Factor UI doesn't display tabs:
: relevant? ( line -- ? )
R/ 0x.*:.*/ matches? ;

: tabs>spaces ( str -- str' )
[ dup CHAR: \t = [ drop CHAR: \s ] when ] map ;
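The two cleanup words can be tried on their own in the listener. The sample lines below are invented to resemble gdb output; they are not taken from a real session.

```factor
USING: tools.disassembler io prettyprint ;

! A line shaped like gdb's disassembly output (made-up address and
! instruction); the filter is meant to keep lines like this.
"0x0806a1c0 <foo+0>:\tmov    %eax,%ebx" relevant? .

! A made-up gdb banner line; it does not start with 0x, so the
! filter is meant to drop it.
"GNU gdb 6.6" relevant? .

! Each tab character is replaced with a single space.
"0x0806a1c0 <foo+0>:\tmov    %eax,%ebx" tabs>spaces print
```

Note that tabs>spaces substitutes one space per tab rather than expanding to a tab stop, so columns in the output will not necessarily line up.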

Finally, we have the actual word that calls the above:
: disassemble ( word -- )
make-disassemble-cmd run-gdb
[ relevant? ] subset [ tabs>spaces ] map [ print ] each ;
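A typical use from the listener might look like the following sketch. It assumes a Unix system with gdb on the path; the word being disassembled is an arbitrary choice.

```factor
USING: tools.disassembler sequences ;

! Disassemble the compiled code of a library word;
! any compiled word would do here.
\ reverse disassemble
```

Since make-disassemble-cmd also has a method on pairs, passing a two-element array of start and end addresses instead of a word should work as well.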

1 comment:

Andy Hefner said...

ndisasm would've been handy here, with the caveat that it's only useful on x86.