Friday, August 17, 2007

gcc is open sores software

In Factor 0.90, I added a new function to vm/os-linux.c which returns the path of the current executable. However, it seems that neither gcc 4.0.3 nor gcc 4.2.1 can compile this file on x86. I get an error like the following:
vm/os-linux.c: In function 'vm_executable_path_and':
vm/os-linux.c:20: error: unable to find a register to spill in class 'DIREG'
vm/os-linux.c:20: error: this is the insn:
(insn:HI 16 83 17 2 (parallel [
(set (reg:SI 2 cx [66])
(unspec:SI [
(mem:BLK (reg/v/f:SI 63 [ suffix ]) [0 A8])
(reg:QI 0 ax [70])
(const_int 1 [0x1])
(reg:SI 2 cx [69])
] 20))
(use (reg:SI 19 dirflag))
(clobber (reg/f:SI 68 [ suffix ]))
(clobber (reg:CC 17 flags))
]) 530 {*strlenqi_1} (insn_list:REG_DEP_TRUE 6 (insn_list:REG_DEP_TRUE 12 (insn_list:REG_DEP_TRUE 14 (insn_list:REG_DEP_TRUE 15 (nil)))))
(expr_list:REG_DEAD (reg:SI 19 dirflag)
(expr_list:REG_DEAD (reg:SI 2 cx [69])
(expr_list:REG_DEAD (reg:QI 0 ax [70])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_UNUSED (reg/f:SI 68 [ suffix ])
(expr_list:REG_EQUAL (unspec:SI [
(mem:BLK (reg/v/f:SI 63 [ suffix ]) [0 A8])
(reg:QI 0 ax [70])
(const_int 1 [0x1])
(reg:SI 2 cx [69])
] 20)
(nil))))))))

In fact, here is a test case which demonstrates the problem; compile it on x86 with -O3, using gcc 4.2.1, 4.0.3 or 3.4.6 (I tested them all):
#include <stdlib.h>
#include <string.h>

register long foo asm("esi");
register long bar asm("edi");

char * crash_me_baby(char *str) {
    char *path = malloc(1 + strlen(str));
    return path;
}

I'm sick and tired of the gcc team's total unwillingness to support basic, documented features. In this case, it is the global register variables which trigger the bug. However, the crash_me_baby() function does not use these variables at all, and in any case they are non-volatile registers which need to be saved and restored, so why the hell would it break gcc?

I submitted a bug to the gcc team. But I'm not holding my breath; I already filed a report about the same issue a few years ago. This is a recurring problem which has been coming and going since the days of gcc 3.3.

The real way forward is to stop using global register variables. When the Factor VM no longer includes an interpreter, and all quotations are compiled, this will be possible without adversely affecting performance. It will also have the side benefit of making the Factor VM pure ANSI C, with some inline assembly. This means that on Windows, for example, we should soon be able to compile Factor with an alternative compiler, such as Microsoft's Visual Studio.

For now, I've found a workaround -- I had to write my own version of strlen().
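For illustration, a hand-rolled strlen() can be as simple as the sketch below. The name plain_strlen is made up for this example and is not necessarily what the Factor VM uses; the point is only that the length is computed by an ordinary byte loop rather than by the strlen() builtin.

```c
#include <stddef.h>

/* Hypothetical name, chosen for this example only; the actual
   replacement in the Factor VM is not shown in this post. */
size_t plain_strlen(const char *str)
{
    const char *end = str;

    /* Walk forward one byte at a time until the terminating NUL. */
    while (*end != '\0')
        end++;

    /* The distance between the two pointers is the string length. */
    return (size_t)(end - str);
}
```

In the test case above, malloc(1 + plain_strlen(str)) would take the place of malloc(1 + strlen(str)). On the gcc versions discussed here, this should keep the compiler from expanding the call into the string instruction that wants edi.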

9 comments:

Anonymous said...

Good evening Slava,

In your comment about removal of the interpreter, are you saying that there will be no interface as there is currently, or that all definitions will be automatically compiled?

regards

Bruce Rennie
(God's Own Country Downunder)

Slava Pestov said...

Everything will be compiled. This won't affect the tooling at all.

Curtis said...

This is not surprising if you're wanting esi and edi all for yourself. The library is probably trying to use repne scasb to find the null at the end. That instruction implicitly uses edi, hence this problem.

astrange said...

gcc expands strlen() to x86's "repnz/scasb" which only takes input from edi. This is a major reason why x86 is the WORST ARCHITECTURE EVER.

But anyway it'll compile with -fno-builtin-strlen. Although naturally it shouldn't ICE.

Anonymous said...

Of course it sucks that there is a regression, that the bug you reported a few years ago pops up again.

But this only makes me wonder: don't they have a test suite to test a simple enough feature like global register variables (your test is only a couple of LOC, and the documentation isn't zillions of pages either)? And most of all - why?

pankkake said...

Did you try -O2 instead?

Slava Pestov said...

pankkake: -O, -O1, -O2 all have the same issue.

Slava Pestov said...

Thanks guys, -fno-builtin-strlen and -fno-builtin-strcat fixed the problem.

However, I don't see why a register global should affect the string instructions at all. All gcc has to do is emit a push/pop around the string instructions to save/restore edi.

Anonymous said...

My experience with gcc is the opposite. I've found a couple of compiler bugs, and after creating a minimal test case and reporting the bug to their bugzilla, the maintainers were very helpful. I think everything I've ever submitted has been fixed... Odd that your bug has lasted so long!