I added efficient arrays of integers and floats to Factor. We've had

byte-arrays and

float-arrays for a while; this code is essentially a generalization of these to every primitive C type: chars, shorts, ints, longs, long longs, unsigned version of all of these, as well as floats and doubles.

### Specialized arrays

For each primitive C type, there is a vocabulary named

`specialized-arrays.`*type*

, which defines a class named

*type*-array

, and several related words: a constructor, and a conversion word for converting existing sequences to specialized arrays of this type. As with all sequence types, the class implements methods for

`length`

,

`nth`

and

`set-nth`

; there is also a literal syntax for parsing and printing specialized arrays. An example:

**USE:**** ****specialized-arrays.int**

int-array{ 584 -13 11 } [ 10 + ] map

### Specialized vectors

There are also resizeable versions of all of these, in

`specialized-vectors.`*type*

. These work like ordinary vectors; you can use words such as

`push`

and

`pop`

, accumulating a stream of values in a vector.

### Direct arrays

A third variant is also provided. These are the "direct specialized arrays". Sometimes when working with a C library, the library hands you a pointer to some data which is to be viewed as an array of primitive values. Direct arrays accomplish this. For every primitive C type, the

`specialized-arrays.direct.`*type*

vocabulary provides a word named

`<direct-`*type*-array>

which takes an alien pointer and a length. It wraps them in a tuple which responds to the sequence protocol by reading from memory.

### Memory-mapped files

Direct arrays are useful in another context, and that is when working with memory-mapped files. Again, for every primitive C type, there is a vocabulary,

`io.mmap.`*type*

. It defines two words,

`<mapped-`*type*-array>

and

`with-mapped-`*type*-file

. Here is an example usage of the latter: we're mapping in a file of floating point numbers and adding them up:

[ 0.0 [ + ] reduce ] with-mapped-float-file

We've had

`io.mmap`

for a while, but specialized arrays make this library a lot more useful.

This is all really powerful stuff, because on one hand, we have these high-level data types that behave like any other Factor sequence and can be used interchangeably with sequence utility words, but on the other hand, not only do they offer a space savings over heterogeneous

arrays by avoiding boxing, they also have the potential to improve performance -- the low-level primitives they're built on are already optimized by the compiler, and it is possible to write code which operates on specialized float and integer arrays without any boxing at all.

### Mersenne Twister

A good example of an algorithm which benefits from specialized arrays is the

Mersenne Twister random number generator. The

Factor implementation is very straightforward, and shorter than the

C version. I changed it to use a

`uint-array`

instead of a normal

`array`

to store the random generator state. After making a few more tweaks, the time taken to generate a 100,000,000-element byte array of random bytes on my Mac Book Pro went down to 5 seconds, from 37 seconds!

Here is the main method of the Mersenne Twister,

`random-32*`

, and its surrounding utility words:

**:**** ****mt-n** 624 **;** inline

**:**** ****mt-m** 397 **;** inline

**:**** ****mt-a** HEX: 9908b0df **;** inline

**:**** ****mersenne-wrap** ( **n**** **--** ****n****'**** **)

dup mt-n > [ mt-n - ] when **;** inline

**:**** ****wrap-nth** ( **n**** ****seq**** **--** ****obj**** **)

[ mersenne-wrap ] dip nth-unsafe **;** inline

**:**** ****set-wrap-nth** ( **obj**** ****n**** ****seq**** **--** **)

[ mersenne-wrap ] dip set-nth-unsafe **;** inline

**:**** ****calculate-y** ( **n**** ****seq**** **--** ****y**** **)

[ wrap-nth 31 mask-bit ]

[ [ 1+ ] [ wrap-nth ] bi* 31 bits ] 2bi bitor **;** inline

**:**** ****(mt-generate)** ( **n**** ****seq**** **--** ****next****-****mt**** **)

[

calculate-y

[ 2/ ] [ odd? mt-a 0 ? ] bi bitxor

] [

[ mt-m + ] [ wrap-nth ] bi*

] 2bi bitxor **;** inline

**:**** ****mt-generate** ( **mt**** **--** **)

[

mt-n swap seq>> '[

_ [ (mt-generate) ] [ set-wrap-nth ] 2bi

] each

] [ 0 >>i drop ] bi **;** inline

**:**** ****mt-temper** ( **y**** **--** ****yt**** **)

dup -11 shift bitxor

dup 7 shift HEX: 9d2c5680 bitand bitxor

dup 15 shift HEX: efc60000 bitand bitxor

dup -18 shift bitxor **;** inline

**:**** ****next-index** ( **mt**** **--** ****i**** **)

dup i>> dup mt-n < [ nip ] [ drop mt-generate 0 ] if **;** inline

M: mersenne-twister random-32* ( **mt**** **--** ****r**** **)

[ next-index ]

[ seq>> wrap-nth mt-temper ]

[ [ 1+ ] change-i drop ] tri **;**

A naive implementation of Factor would struggle with this code because of method dispatch overhead (all arithmetic and sequence access is generic), and because of overflow checks on integer arithmetic (integers have arbitrary precision in Factor). Thankfully, the Factor implementation is not naive. The high-level optimizer, applied to the above code, produces the following output -- note that the translation from high-level IR back to Factor code is imprecise, and some IR nodes cannot be expressed, hence the

`"COMPLEX SHUFFLE"`

strings that appear here:

[

dup >r dup 3 slot dup 624 fixnum<

[ nip ] [

drop dup >r 624 swap 2 slot

>r r> >r r> 0 -rot **\**** ****(** gensym ) [

pick pick fixnum< [

swap >r 2dup "COMPLEX SHUFFLE" >r r> 2dup

"COMPLEX SHUFFLE" 2dup "COMPLEX SHUFFLE" 2dup

"COMPLEX SHUFFLE" >r r> 3 slot

swap 4 fixnum*fast

alien-unsigned-4 2147483648 fixnum-bitand

"COMPLEX SHUFFLE" >r 1 fixnum+fast

r> >r r> 3 slot swap 4 fixnum*fast

alien-unsigned-4 2147483647 fixnum-bitand

fixnum-bitor dup >r -1 fixnum-shift-fast

r> 1 fixnum-bitand 1 eq? 2567483615 0 rot

[ drop ] [ nip ] if fixnum-bitxor

"COMPLEX SHUFFLE" >r 397 fixnum+fast

r> >r dup 624 fixnum>

[ 624 fixnum-fast ] [ ] if r> 3 slot

swap 4 fixnum*fast

alien-unsigned-4 fixnum-bitxor

"COMPLEX SHUFFLE" >r r> 3 slot

swap 4 fixnum*fast

set-alien-unsigned-4 "COMPLEX SHUFFLE" r> swap

"COMPLEX SHUFFLE" 1 fixnum+fast

"COMPLEX SHUFFLE" ( **gensym**** **)

] [ 3drop ] if

] label r> >r 0 r> 3 set-slot 0

] if r> dup >r 2 slot >r r> 3 slot swap 4 fixnum*fast

alien-unsigned-4 dup -11 fixnum-shift-fast fixnum-bitxor

dup 7 fixnum-shift-fast 2636928640 fixnum-bitand

fixnum-bitxor dup 15 fixnum-shift-fast

4022730752 fixnum-bitand fixnum-bitxor

dup -18 fixnum-shift-fast fixnum-bitxor r> dup >r 3 slot

1 fixnum+fast r> swap swap >r r> 3 set-slot

]

Only primitive calls remain; everything else has evaporated after 11 distinct optimization passes. A lot of redundant stack shuffling still remains; the low-level optimizer eliminates that and produces a sequence of low-level IR instructions:

=== word: mersenne-twister=>random-32*, label: mersenne-twister=>random-32*

_label 0

##prologue

_label 1

##inc-r 1

##inc-d 1

##peek V int-regs 1368412 D 1

##replace V int-regs 1368412 R 0

##slot-imm V int-regs 1368417 V int-regs 1368412 3 2

##replace V int-regs 1368417 D 0

_compare-imm-branch 3 V int-regs 1368417 4992 cc>=

_label 2

##inc-d -1

##peek V int-regs 1368425 D -1

##replace V int-regs 1368425 D 0

_branch 13

_label 3

##inc-r 1

##inc-d 1

##peek V int-regs 1368427 D 2

##replace V int-regs 1368427 R 0

##load-immediate V int-regs 1368429 4992

##slot-imm V int-regs 1368434 V int-regs 1368427 2 2

##load-immediate V int-regs 1368439 0

##replace V int-regs 1368434 D 0

##replace V int-regs 1368429 D 1

##replace V int-regs 1368439 D 2

##loop-entry

_label 4

##peek V int-regs 1368443 D 2

_compare-imm-branch 11 V int-regs 1368443 4992 cc>=

_label 5

##inc-r 7

##peek V int-regs 1368453 D 1

##peek V int-regs 1368454 D 0

##replace V int-regs 1368453 R 6

##peek V int-regs 1368456 D 2

##replace V int-regs 1368454 R 4

##replace V int-regs 1368456 R 5

##replace V int-regs 1368454 R 2

##replace V int-regs 1368456 R 3

##replace V int-regs 1368454 R 0

##replace V int-regs 1368456 R 1

##slot-imm V int-regs 1368478 V int-regs 1368454 3 2

##copy V int-regs 1368483 V int-regs 1368456

##shl-imm V int-regs 1368483 V int-regs 1368483 2

##copy V int-regs 1368486 V int-regs 1368483

##sar-imm V int-regs 1368486 V int-regs 1368486 3

##add-imm V int-regs 1368487 V int-regs 1368478 13

##add V int-regs 1368489 V int-regs 1368486 V int-regs 1368487

##alien-unsigned-4 V int-regs 1368490 V int-regs 1368489

##copy V int-regs 1368491 V int-regs 1368490

##shl-imm V int-regs 1368491 V int-regs 1368491 3

##load-immediate V int-regs 1368492 17179869184

##copy V int-regs 1368495 V int-regs 1368491

##and V int-regs 1368495 V int-regs 1368495 V int-regs 1368492

##add-imm V int-regs 1368501 V int-regs 1368456 8

##copy V int-regs 1368512 V int-regs 1368501

##shl-imm V int-regs 1368512 V int-regs 1368512 2

##copy V int-regs 1368515 V int-regs 1368512

##sar-imm V int-regs 1368515 V int-regs 1368515 3

##add V int-regs 1368518 V int-regs 1368515 V int-regs 1368487

##alien-unsigned-4 V int-regs 1368519 V int-regs 1368518

##copy V int-regs 1368520 V int-regs 1368519

##shl-imm V int-regs 1368520 V int-regs 1368520 3

##load-immediate V int-regs 1368521 17179869176

##copy V int-regs 1368524 V int-regs 1368520

##and V int-regs 1368524 V int-regs 1368524 V int-regs 1368521

##copy V int-regs 1368527 V int-regs 1368495

##or V int-regs 1368527 V int-regs 1368527 V int-regs 1368524

##copy V int-regs 1368532 V int-regs 1368527

##sar-imm V int-regs 1368532 V int-regs 1368532 4

##copy V int-regs 1368533 V int-regs 1368532

##shl-imm V int-regs 1368533 V int-regs 1368533 3

##replace V int-regs 1368533 D 2

##copy V int-regs 1368537 V int-regs 1368527

##and-imm V int-regs 1368537 V int-regs 1368537 8

##load-immediate V int-regs 1368542 20539868920

##load-immediate V int-regs 1368543 0

##replace V int-regs 1368543 D 0

##replace V int-regs 1368542 D 1

_compare-imm-branch 7 V int-regs 1368537 8 cc/=

_label 6

##inc-d -1

_branch 8

_label 7

##inc-d -1

##peek V int-regs 1368550 D -1

##replace V int-regs 1368550 D 0

_label 8

##inc-r -1

##peek V int-regs 1368551 D 1

##peek V int-regs 1368552 D 0

##copy V int-regs 1368553 V int-regs 1368551

##xor V int-regs 1368553 V int-regs 1368553 V int-regs 1368552

##replace V int-regs 1368553 D 1

##peek V int-regs 1368554 R 0

##peek V int-regs 1368555 R -1

##add-imm V int-regs 1368559 V int-regs 1368554 3176

##replace V int-regs 1368555 R 0

##replace V int-regs 1368559 D 0

_compare-imm-branch 10 V int-regs 1368559 4992 cc<=

_label 9

##peek V int-regs 1368569 D 0

##sub-imm V int-regs 1368570 V int-regs 1368569 4992

##replace V int-regs 1368570 D 0

_label 10

##inc-r -6

##inc-d 1

##peek V int-regs 1368571 R -6

##slot-imm V int-regs 1368574 V int-regs 1368571 3 2

##peek V int-regs 1368575 D 1

##copy V int-regs 1368579 V int-regs 1368575

##shl-imm V int-regs 1368579 V int-regs 1368579 2

##copy V int-regs 1368582 V int-regs 1368579

##sar-imm V int-regs 1368582 V int-regs 1368582 3

##add-imm V int-regs 1368583 V int-regs 1368574 13

##add V int-regs 1368585 V int-regs 1368582 V int-regs 1368583

##alien-unsigned-4 V int-regs 1368586 V int-regs 1368585

##copy V int-regs 1368587 V int-regs 1368586

##shl-imm V int-regs 1368587 V int-regs 1368587 3

##peek V int-regs 1368588 D 2

##copy V int-regs 1368590 V int-regs 1368588

##xor V int-regs 1368590 V int-regs 1368590 V int-regs 1368587

##peek V int-regs 1368591 R -4

##peek V int-regs 1368592 R -5

##slot-imm V int-regs 1368597 V int-regs 1368592 3 2

##copy V int-regs 1368602 V int-regs 1368591

##shl-imm V int-regs 1368602 V int-regs 1368602 2

##copy V int-regs 1368605 V int-regs 1368602

##sar-imm V int-regs 1368605 V int-regs 1368605 3

##add-imm V int-regs 1368606 V int-regs 1368597 13

##add V int-regs 1368608 V int-regs 1368605 V int-regs 1368606

##copy V int-regs 1368610 V int-regs 1368590

##sar-imm V int-regs 1368610 V int-regs 1368610 3

##set-alien-integer-4 V int-regs 1368608 V int-regs 1368610

##peek V int-regs 1368611 R -2

##peek V int-regs 1368612 R -3

##peek V int-regs 1368613 R -1

##add-imm V int-regs 1368620 V int-regs 1368611 8

##replace V int-regs 1368620 D 2

##replace V int-regs 1368612 D 0

##replace V int-regs 1368613 D 1

_branch 4

_label 11

##inc-d -3

_label 12

##inc-r -1

##inc-d 1

##peek V int-regs 1368626 R -1

##load-immediate V int-regs 1368628 0

##set-slot-imm V int-regs 1368628 V int-regs 1368626 3 2

##replace V int-regs 1368628 D 0

_label 13

##inc-r -1

##peek V int-regs 1368634 R -1

##slot-imm V int-regs 1368639 V int-regs 1368634 2 2

##slot-imm V int-regs 1368644 V int-regs 1368639 3 2

##peek V int-regs 1368645 D 0

##copy V int-regs 1368649 V int-regs 1368645

##shl-imm V int-regs 1368649 V int-regs 1368649 2

##copy V int-regs 1368652 V int-regs 1368649

##sar-imm V int-regs 1368652 V int-regs 1368652 3

##add-imm V int-regs 1368653 V int-regs 1368644 13

##add V int-regs 1368655 V int-regs 1368652 V int-regs 1368653

##alien-unsigned-4 V int-regs 1368656 V int-regs 1368655

##copy V int-regs 1368657 V int-regs 1368656

##shl-imm V int-regs 1368657 V int-regs 1368657 3

##copy V int-regs 1368661 V int-regs 1368657

##sar-imm V int-regs 1368661 V int-regs 1368661 14

##copy V int-regs 1368662 V int-regs 1368661

##shl-imm V int-regs 1368662 V int-regs 1368662 3

##copy V int-regs 1368665 V int-regs 1368657

##xor V int-regs 1368665 V int-regs 1368665 V int-regs 1368662

##copy V int-regs 1368669 V int-regs 1368665

##shl-imm V int-regs 1368669 V int-regs 1368669 7

##load-immediate V int-regs 1368670 21095429120

##copy V int-regs 1368673 V int-regs 1368669

##and V int-regs 1368673 V int-regs 1368673 V int-regs 1368670

##copy V int-regs 1368676 V int-regs 1368665

##xor V int-regs 1368676 V int-regs 1368676 V int-regs 1368673

##copy V int-regs 1368680 V int-regs 1368676

##shl-imm V int-regs 1368680 V int-regs 1368680 15

##load-immediate V int-regs 1368681 32181846016

##copy V int-regs 1368684 V int-regs 1368680

##and V int-regs 1368684 V int-regs 1368684 V int-regs 1368681

##copy V int-regs 1368687 V int-regs 1368676

##xor V int-regs 1368687 V int-regs 1368687 V int-regs 1368684

##copy V int-regs 1368691 V int-regs 1368687

##sar-imm V int-regs 1368691 V int-regs 1368691 21

##copy V int-regs 1368692 V int-regs 1368691

##shl-imm V int-regs 1368692 V int-regs 1368692 3

##copy V int-regs 1368695 V int-regs 1368687

##xor V int-regs 1368695 V int-regs 1368695 V int-regs 1368692

##replace V int-regs 1368695 D 0

##slot-imm V int-regs 1368701 V int-regs 1368634 3 2

##add-imm V int-regs 1368704 V int-regs 1368701 8

##set-slot-imm V int-regs 1368704 V int-regs 1368634 3 2

##epilogue

##return

Finally, after register allocation, these instructions map pretty much directly onto machine code:

0x0000000109901940: add $0x8,%r15

0x0000000109901944: add $0x8,%r14

0x0000000109901948: mov -0x8(%r14),%rax

0x000000010990194c: mov %rax,(%r15)

0x000000010990194f: mov 0x16(%rax),%rcx

0x0000000109901953: mov %rcx,(%r14)

0x0000000109901956: cmp $0x1380,%rcx

0x000000010990195d: jge 0x109901973

0x0000000109901963: sub $0x8,%r14

0x0000000109901967: mov 0x8(%r14),%rcx

0x000000010990196b: mov %rcx,(%r14)

0x000000010990196e: jmpq 0x109901b5b

0x0000000109901973: add $0x8,%r15

0x0000000109901977: add $0x8,%r14

0x000000010990197b: mov -0x10(%r14),%rcx

0x000000010990197f: mov %rcx,(%r15)

0x0000000109901982: mov $0x1380,%rax

0x000000010990198c: mov 0xe(%rcx),%rdx

0x0000000109901990: mov $0x0,%rcx

0x000000010990199a: mov %rdx,(%r14)

0x000000010990199d: mov %rax,-0x8(%r14)

0x00000001099019a1: mov %rcx,-0x10(%r14)

0x00000001099019a5: nop

0x00000001099019a6: nop

0x00000001099019a7: nop

0x00000001099019a8: nop

0x00000001099019a9: nop

0x00000001099019aa: nop

0x00000001099019ab: nop

0x00000001099019ac: nop

0x00000001099019ad: nop

0x00000001099019ae: nop

0x00000001099019af: nop

0x00000001099019b0: mov -0x10(%r14),%rcx

0x00000001099019b4: cmp $0x1380,%rcx

0x00000001099019bb: jge 0x109901b3a

0x00000001099019c1: add $0x38,%r15

0x00000001099019c5: mov -0x8(%r14),%rcx

0x00000001099019c9: mov (%r14),%rdx

0x00000001099019cc: mov %rcx,-0x30(%r15)

0x00000001099019d0: mov -0x10(%r14),%rcx

0x00000001099019d4: mov %rdx,-0x20(%r15)

0x00000001099019d8: mov %rcx,-0x28(%r15)

0x00000001099019dc: mov %rdx,-0x10(%r15)

0x00000001099019e0: mov %rcx,-0x18(%r15)

0x00000001099019e4: mov %rdx,(%r15)

0x00000001099019e7: mov %rcx,-0x8(%r15)

0x00000001099019eb: mov 0x16(%rdx),%rax

0x00000001099019ef: mov %rcx,%rdx

0x00000001099019f2: shl $0x2,%rdx

0x00000001099019f6: sar $0x3,%rdx

0x00000001099019fa: lea 0xd(%rax),%rbx

0x00000001099019fe: lea (%rdx,%rbx,1),%rax

0x0000000109901a02: mov (%rax),%edx

0x0000000109901a04: shl $0x3,%rdx

0x0000000109901a08: mov $0x400000000,%rax

0x0000000109901a12: and %rax,%rdx

0x0000000109901a15: lea 0x8(%rcx),%rax

0x0000000109901a19: shl $0x2,%rax

0x0000000109901a1d: sar $0x3,%rax

0x0000000109901a21: lea (%rax,%rbx,1),%rcx

0x0000000109901a25: mov (%rcx),%eax

0x0000000109901a27: shl $0x3,%rax

0x0000000109901a2b: mov $0x3fffffff8,%rcx

0x0000000109901a35: and %rcx,%rax

0x0000000109901a38: or %rax,%rdx

0x0000000109901a3b: mov %rdx,%rax

0x0000000109901a3e: sar $0x4,%rax

0x0000000109901a42: shl $0x3,%rax

0x0000000109901a46: mov %rax,-0x10(%r14)

0x0000000109901a4a: and $0x8,%rdx

0x0000000109901a4e: mov $0x4c84586f8,%rax

0x0000000109901a58: mov $0x0,%rcx

0x0000000109901a62: mov %rcx,(%r14)

0x0000000109901a65: mov %rax,-0x8(%r14)

0x0000000109901a69: cmp $0x8,%rdx

0x0000000109901a6d: jne 0x109901a7c

0x0000000109901a73: sub $0x8,%r14

0x0000000109901a77: jmpq 0x109901a87

0x0000000109901a7c: sub $0x8,%r14

0x0000000109901a80: mov 0x8(%r14),%rcx

0x0000000109901a84: mov %rcx,(%r14)

0x0000000109901a87: sub $0x8,%r15

0x0000000109901a8b: mov -0x8(%r14),%rcx

0x0000000109901a8f: mov (%r14),%rax

0x0000000109901a92: xor %rax,%rcx

0x0000000109901a95: mov %rcx,-0x8(%r14)

0x0000000109901a99: mov (%r15),%rcx

0x0000000109901a9c: mov 0x8(%r15),%rax

0x0000000109901aa0: lea 0xc68(%rcx),%rdx

0x0000000109901aa7: mov %rax,(%r15)

0x0000000109901aaa: mov %rdx,(%r14)

0x0000000109901aad: cmp $0x1380,%rdx

0x0000000109901ab4: jle 0x109901ac7

0x0000000109901aba: mov (%r14),%rdx

0x0000000109901abd: lea -0x1380(%rdx),%rax

0x0000000109901ac4: mov %rax,(%r14)

0x0000000109901ac7: sub $0x30,%r15

0x0000000109901acb: add $0x8,%r14

0x0000000109901acf: mov 0x30(%r15),%rax

0x0000000109901ad3: mov 0x16(%rax),%rdx

0x0000000109901ad7: mov -0x8(%r14),%rax

0x0000000109901adb: shl $0x2,%rax

0x0000000109901adf: sar $0x3,%rax

0x0000000109901ae3: lea 0xd(%rdx),%rcx

0x0000000109901ae7: lea (%rax,%rcx,1),%rdx

0x0000000109901aeb: mov (%rdx),%ecx

0x0000000109901aed: shl $0x3,%rcx

0x0000000109901af1: mov -0x10(%r14),%rdx

0x0000000109901af5: xor %rcx,%rdx

0x0000000109901af8: mov 0x20(%r15),%rcx

0x0000000109901afc: mov 0x28(%r15),%rax

0x0000000109901b00: mov 0x16(%rax),%rbx

0x0000000109901b04: shl $0x2,%rcx

0x0000000109901b08: sar $0x3,%rcx

0x0000000109901b0c: lea 0xd(%rbx),%rax

0x0000000109901b10: lea (%rcx,%rax,1),%rbx

0x0000000109901b14: sar $0x3,%rdx

0x0000000109901b18: mov %edx,(%rbx)

0x0000000109901b1a: mov 0x10(%r15),%rdx

0x0000000109901b1e: mov 0x18(%r15),%rbx

0x0000000109901b22: mov 0x8(%r15),%rax

0x0000000109901b26: lea 0x8(%rdx),%rcx

0x0000000109901b2a: mov %rcx,-0x10(%r14)

0x0000000109901b2e: mov %rbx,(%r14)

0x0000000109901b31: mov %rax,-0x8(%r14)

0x0000000109901b35: jmpq 0x1099019b0

0x0000000109901b3a: sub $0x18,%r14

0x0000000109901b3e: sub $0x8,%r15

0x0000000109901b42: add $0x8,%r14

0x0000000109901b46: mov 0x8(%r15),%rcx

0x0000000109901b4a: mov $0x0,%rax

0x0000000109901b54: mov %rax,0x16(%rcx)

0x0000000109901b58: mov %rax,(%r14)

0x0000000109901b5b: sub $0x8,%r15

0x0000000109901b5f: mov 0x8(%r15),%rax

0x0000000109901b63: mov 0xe(%rax),%rcx

0x0000000109901b67: mov 0x16(%rcx),%rbx

0x0000000109901b6b: mov (%r14),%rcx

0x0000000109901b6e: shl $0x2,%rcx

0x0000000109901b72: sar $0x3,%rcx

0x0000000109901b76: lea 0xd(%rbx),%rdx

0x0000000109901b7a: lea (%rcx,%rdx,1),%rbx

0x0000000109901b7e: mov (%rbx),%edx

0x0000000109901b80: shl $0x3,%rdx

0x0000000109901b84: mov %rdx,%rbx

0x0000000109901b87: sar $0xe,%rbx

0x0000000109901b8b: shl $0x3,%rbx

0x0000000109901b8f: xor %rbx,%rdx

0x0000000109901b92: mov %rdx,%rbx

0x0000000109901b95: shl $0x7,%rbx

0x0000000109901b99: mov $0x4e962b400,%rcx

0x0000000109901ba3: and %rcx,%rbx

0x0000000109901ba6: xor %rbx,%rdx

0x0000000109901ba9: mov %rdx,%rbx

0x0000000109901bac: shl $0xf,%rbx

0x0000000109901bb0: mov $0x77e300000,%rcx

0x0000000109901bba: and %rcx,%rbx

0x0000000109901bbd: xor %rbx,%rdx

0x0000000109901bc0: mov %rdx,%rbx

0x0000000109901bc3: sar $0x15,%rbx

0x0000000109901bc7: shl $0x3,%rbx

0x0000000109901bcb: xor %rbx,%rdx

0x0000000109901bce: mov %rdx,(%r14)

0x0000000109901bd1: mov 0x16(%rax),%rdx

0x0000000109901bd5: lea 0x8(%rdx),%rbx

0x0000000109901bd9: mov %rbx,0x16(%rax)

0x0000000109901bdd: retq

That's pretty tight code for a dynamic language.

### Implementation of specialized arrays

So there are 13 C types here:

`char uchar short ushort int uint long ulong longlong ulonglong float double void*`

. For each C type, we have four vocabularies:

`specialized-arrays.`*type*

,

`specialized-vectors.`*type*

,

`specialized-arrays.direct.`*type*

, and

`io.mmap.`*type*

. Implementing all of this by hand would be a pain: even with the best abstractions in Factor's object system, at a minimum there would be 4 or 5 lines of code per type and per feature, and ideally you'd want to clone some methods once per implementation instead of abstracting everything out through dynamic dispatch, for performance reasons. So instead, I did what any self-respecting programmer would do in this situation: I made the computer do all the hard work. Factor lacked the abstraction necessary to express what I wanted, but Factor is extensible, so I just implemented the abstraction -- as a library. The feature we want here is essentially C++ templates. So yes, there is now a

`functors`

vocabulary which implements what amounts to C++ templates... in Factor. I called them "functors", even though they're more like templates than ML functors, because the term "template" has meaning in Factor's web framework and I didn't want to create confusion. I don't expect functors will see widespread use and I certainly don't want to promote this as a central feature of Factor. It is just another library that happens to extend the parser -- like many other Factor libraries do.

Let's look at how functors are used. For example, here is the entire text of the

`specialized-arrays.direct.int`

vocabulary:

**USING: **specialized-arrays.int specialized-arrays.direct.functor **;**

**IN:**** ****specialized-arrays.direct.int**

<< "int" define-direct-array >>

The key of course is that its invoking the

`define-direct-array`

word, at parse time (hence the

`<<`

and

`>>`

). This word is defined in the

`specialized-arrays.direct.functor`

vocabulary, and its definition looks quite unusual:

FUNCTOR: define-direct-array ( **T**** **--** **)

A' IS ${T}-array

>A' IS >${T}-array

<A'> IS <${A'}>

A DEFINES direct-${T}-array

<A> DEFINES <${A}>

NTH [ T dup c-getter array-accessor ]

SET-NTH [ T dup c-setter array-accessor ]

WHERE

TUPLE: A

{ underlying alien read-only }

{ length fixnum read-only } **;**

**:**** ****<A>** ( **alien**** ****len**** **--** ****direct****-****array**** **) A boa **;** inline

M: A length length>> **;**

M: A nth-unsafe underlying>> NTH call **;**

M: A set-nth-unsafe underlying>> SET-NTH call **;**

M: A like drop dup A instance? [ >A' execute ] unless **;**

M: A new-sequence drop <A'> execute **;**

INSTANCE: A sequence

;FUNCTOR

This

`FUNCTOR: ... IS / DEFINES ... WHERE ... ;FUNCTOR`

business is not new syntax in the language; its just parsing words defined in the

`functors`

vocabulary. Essentially we are defining a word,

`define-direct-array`

, which generates code, parameterized by a single input

`T`

, which is a C type name in this case. Note that what's going on here is more elaborate than text substitution; the functor body is parsed once, and converted into code which instantiates it, rather than simply expanding at usages like C preprocessor macros do.

Functors may seem like an elaborate new piece of functionality, comparable in complexity to the object system or locals, but in reality, the emperor has no clothes; they have a trivial desugaring.

First, inside a functor body, the

`TUPLE:`

,

`:`

,

`M:`

and

`INSTANCE:`

are shadowed to special words in the

`functors`

vocabulary named

``TUPLE:`

,

``:`

, and so on.

Next,

`IS`

and

`DEFINES`

desugar as follows.

A' IS ${T}-array

expands into

A' [ [ I[ ${T}-array]I ] with-string-writer dup search [ ] [ no-word ] ?if ]

and

<A> DEFINES <${A}>

expands into

<A> [ [ I[ <${A}>]I ] with-string-writer create-in ]

The

`I[`

syntax is just the

`interpolate`

library, which turns this,

I[ Hello, ${name}.]I

into this,

"Hello, " write name write "." write

Now,

FUNCTOR: foo ( inputs -- )

...names...

WHERE

...body...

;FUNCTOR

is really just syntax sugar for

:: foo ( inputs -- )

[let* | ...names... | ...body... ] ;

With the added caveat that inside the body,

``:`

is remapped to

`:`

and so on.

As you can see, a seemingly powerful feature is melting away as it becomse apparent its only some rather trivial syntax sugar. Words such as

``:`

desugar like so;

`: foo 2 2 + . ;

becomes

foo [ 2 2 + . ] define

Let's look at a functor and its desugaring. Here is a simple functor, it takes a tuple class name as input, and defines a constructor word

`<foo>`

which makes a new instance of this tuple class using

`new`

. So it is like the built-in parsing word

`C:`

, except it uses

`new`

instead of

`boa`

:

FUNCTOR: make-constructor ( **name**** **--** **)

tuple IS ${name}

<tuple> DEFINES <${name}>

WHERE

**:**** ****<tuple>** tuple new **;**

;FUNCTOR

Given the above, this functor is really equivalent to the following:

:**:**** ****make-constructor** ( **name**** **--** **)

[let* | tuple [ name dup search [ ] [ no-word ] ?if ]

<tuple> [ [ "<" write name write ">" write ] with-string-writer ] |

<tuple> [ tuple new ] define

] **;**

We could keep going -- the

`::`

word, which defines a word with named parameters, is it self a library word, in the

`locals`

vocabulary, which desugars down to stack code.

In this simple example, the hand-coded version is shorter than the functor, but the point of using a functor is that you can make the meta-code look like the code that it is generating. For the more complex functors that generate the various forms of specialized arrays, this is invaluable because it makes the code easier to maintain and debug.

Finally, the stuff described in this post, especially the implementation of functors, is some pretty advanced stuff. I don't expect most people using Factor to know or care about how this works; after all, you can use specialized arrays without "peeking under the hood", so to speak. However, all of this meta-programming power is there when you are ready for it.