Friday, June 27, 2008

Windows handle inheritance

I was debugging the deploy tool yesterday. The problem was that after I changed the HTTP request/response parsers to use PEGs, the HTTP client stopped working in deployed applications. After fixing this, I added a regression test to ensure this won't be a problem in the future. The test passed on my Mac, I pushed my fixes, and moved on. However, soon I got a notification from the Windows build machine that the test failed on Windows.

After some further investigation, I noticed that if I started the HTTP server and then spawned the deployed binary from the command line, the test worked, but if I spawned the deployed binary from within Factor using run-process, it failed. So this was not even specific to deployment or to the HTTP code; it would happen any time the following series of events occurred:
  • Factor instance A starts a TCP/IP server socket on port 1234
  • Factor instance A spawns a Factor instance B
  • Factor instance B attempts to connect to port 1234
  • It times out while establishing the connection
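
For readers who want to try the sequence themselves, here is a minimal sketch of the same steps in Python rather than Factor. It is only an analogy: the child script, the loopback address, and the OS-assigned port (instead of 1234) are my own assumptions, and because Python 3.4+ creates sockets non-inheritable, this version is expected to connect rather than time out.

```python
import socket
import subprocess
import sys
import threading

# "Instance A": open a listening socket on an OS-assigned port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

accepted = []

def accept_one():
    # Accept a single connection and record that it arrived.
    conn, _ = server.accept()
    accepted.append(True)
    conn.close()

acceptor = threading.Thread(target=accept_one, daemon=True)
acceptor.start()

# "Instance B": a child process that tries to connect back to A's port.
child_code = (
    "import socket, sys\n"
    "s = socket.create_connection(('127.0.0.1', int(sys.argv[1])), timeout=5)\n"
    "s.close()\n"
)
result = subprocess.run([sys.executable, "-c", child_code, str(port)], timeout=30)

acceptor.join(timeout=5)
server.close()
child_connected = result.returncode == 0
```

If the child exits with a nonzero status, the connection attempt failed or timed out, which is the symptom described above.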

I then realized that handle inheritance was the problem here. It turns out my understanding of handle inheritance was wrong, and this article set me straight. Any handle you create is inheritable by default, so Factor instance B was inheriting the server socket from A. When B attempted to connect to the server socket, it would hit the inherited copy rather than the copy in A, which was the one actually being listened on.

Refactoring the Windows I/O code to set all handles to be non-inheritable by default, and only setting the inheritance flag on handles that are passed to subprocesses for I/O redirection, fixed the problem.
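
The policy can be illustrated with a small Python sketch. This is not the Factor I/O code; the helper names make_listener and share_with_child are hypothetical, and Python's set_inheritable stands in for whatever the Windows backend does (on Windows, Python implements it with SetHandleInformation).

```python
import socket

def make_listener(port=0):
    """Create a listening TCP socket that child processes do not inherit."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Python 3.4+ already creates sockets non-inheritable; setting the flag
    # explicitly just spells out the "non-inheritable by default" policy.
    server.set_inheritable(False)
    server.bind(("127.0.0.1", port))
    server.listen(1)
    return server

def share_with_child(sock):
    """Opt one specific handle in to inheritance, e.g. for I/O redirection."""
    sock.set_inheritable(True)
    return sock

listener = make_listener()
default_flag = listener.get_inheritable()

redirected = share_with_child(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
shared_flag = redirected.get_inheritable()

listener.close()
redirected.close()
```

The point of the design is that inheritance becomes an explicit, per-handle decision made at the moment a subprocess is spawned, so a listening socket can never leak into a child by accident.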

Just goes to show how important continuous integration is.
