Wednesday, January 23, 2008

What's up with non-blocking I/O APIs?

On Linux, epoll() works with ttys, pipes and sockets, but not files.

On Mac OS X, kqueue() works with sockets but not ttys.

On Windows, GetQueuedCompletionStatus() works with everything except for the console.
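The Linux claim above is easy to check by hand. Here is a minimal sketch in Python (not Factor), assuming a Linux kernel that refuses to register regular files with epoll by returning EPERM. The helper names epoll_accepts and check_epoll are made up for illustration. On systems without epoll the check is skipped and returns None.

```python
import os
import select
import tempfile

def epoll_accepts(fd):
    """Return True if epoll agrees to watch fd, False if it refuses with EPERM."""
    ep = select.epoll()
    try:
        ep.register(fd, select.EPOLLIN)
        return True
    except PermissionError:      # errno EPERM: this kind of descriptor can't be polled
        return False
    finally:
        ep.close()

def check_epoll():
    """Try a pipe and a regular file; returns None on systems without epoll."""
    if not hasattr(select, "epoll"):
        return None
    r, w = os.pipe()
    try:
        with tempfile.TemporaryFile() as f:
            return {"pipe": epoll_accepts(r), "file": epoll_accepts(f.fileno())}
    finally:
        os.close(r)
        os.close(w)

EPOLL_RESULT = check_epoll()
print(EPOLL_RESULT)
```

On Linux I would expect the pipe to be accepted and the regular file to be rejected.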

I think it is sad that non-blocking I/O APIs are typically added as afterthoughts. Instead, operating system I/O APIs should be designed around the notion of asynchronous events; blocking streams can be done in user-space with coroutines or continuations.
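To make that last point concrete, here is a rough sketch of a blocking-style read built in user space on top of a non-blocking socket. It uses Python generators as stand-ins for coroutines. The names read_line, reader, writer and run are hypothetical, and the scheduler is deliberately tiny (it assumes at most one task waits on a given socket). It is an illustration of the idea, not how Factor does it.

```python
import selectors
import socket

def read_line(sock):
    """Reads like a blocking call, but yields to the scheduler when no data is ready."""
    buf = b""
    while not buf.endswith(b"\n"):
        try:
            chunk = sock.recv(4096)
        except BlockingIOError:
            yield sock           # tell the scheduler which socket we are waiting on
            continue
        if not chunk:            # peer closed the connection
            break
        buf += chunk
    return buf

def reader(sock, out):
    line = yield from read_line(sock)   # looks synchronous to the caller
    out.append(line)

def writer(sock, data):
    sock.sendall(data)           # small write; will not block on an empty socketpair
    yield from ()                # nothing to wait for; just makes this a generator

def run(tasks):
    """Tiny scheduler: resume each task once the socket it waits on is readable."""
    sel = selectors.DefaultSelector()
    ready = list(tasks)
    while ready or sel.get_map():
        for task in ready:
            try:
                sock = next(task)            # run until the task needs to wait
            except StopIteration:
                continue                     # task finished
            sel.register(sock, selectors.EVENT_READ, task)
        ready = []
        if sel.get_map():
            for key, _ in sel.select():
                sel.unregister(key.fileobj)
                ready.append(key.data)
    sel.close()

a, b = socket.socketpair()
a.setblocking(False)
lines = []
run([reader(a, lines), writer(b, b"hello\n")])
a.close()
b.close()
print(lines)
```

The reader runs first, finds nothing to read, and parks itself in the selector. The writer then sends a line, and the scheduler wakes the reader when the event arrives. From the point of view of reader(), the read simply "blocked".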

However, what's done is done, and we have to support today's OSes. On Mac OS X, the main reason I'm looking at kqueue() is to support non-blocking process exit notification as well as file system change notification. So I can add the kqueue file descriptor to the select() set and use select() for everything else.
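A minimal sketch of that arrangement, written in Python rather than Factor and assuming a BSD-style system where Python's select module exposes kqueue. The function name wait_for_exit_via_select and the short-lived child process are made up for the example. Where kqueue is not available (Linux, for instance) the function just returns None.

```python
import select
import subprocess
import sys

def wait_for_exit_via_select(timeout=5.0):
    """Return the pid kqueue reported as exited, or None where kqueue is unavailable."""
    if not hasattr(select, "kqueue"):
        return None                          # e.g. Linux: no kqueue
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.2)"])
    kq = select.kqueue()
    try:
        # Ask kqueue to report the child's exit; register only, fetch no events yet.
        change = select.kevent(child.pid,
                               filter=select.KQ_FILTER_PROC,
                               flags=select.KQ_EV_ADD | select.KQ_EV_ONESHOT,
                               fflags=select.KQ_NOTE_EXIT)
        kq.control([change], 0, 0)
        # The kqueue descriptor goes into the ordinary select() read set,
        # alongside whatever sockets, pipes and ttys are already there.
        readable, _, _ = select.select([kq], [], [], timeout)
        if kq not in readable:
            return None
        # select() says the kqueue has something: drain it without blocking.
        events = kq.control(None, 8, 0)
        return events[0].ident if events else None
    finally:
        kq.close()
        child.wait()

print(wait_for_exit_via_select())
```

The point of the design is that the main loop still has a single blocking call, select(), and the kqueue only carries the event types select() cannot express.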

On Linux, I'm not sure there's any point in using epoll(). Perhaps I can throw sockets there and use select() for everything else, but this may just end up being more complexity than is necessary.

Anyway, support for the more advanced non-blocking I/O APIs available on Mac OS X and Linux was one of my big to-do list items for Factor 1.0. It took less effort to implement than I thought, but the payoff doesn't seem so great either. At least we can now implement file change notification on OS X.

7 comments:

Steve said...

I imagine it would not be as efficient to implement bio using nbio in user space. However, if async I/O were used, then I imagine it would be just as efficient (same number of syscalls). It's the memcpy from kernel to user space that I'm thinking of on a read...

Why do you need continuations/coroutines to implement bio in user space over nbio? Couldn't you just loop and poll? I imagine that continuations/coroutines are only needed to implement some kind of featherweight thread system...

Nick Sieger said...

I take it you don't want an external dependency on a lib like libevent, liboop, or libev?

Slava Pestov said...

Nick: libevent and so on have the same problems because they use the same underlying OS APIs.

Slava Pestov said...

Steve: you're right, Factor has lightweight co-operative threads.

At some point we're adding native threads, but even then we'll still support lightweight threads (i.e., M:N threading), so having good non-blocking I/O is a long-term requirement.

Anonymous said...

FYI, you can't select() files on Linux. Well, you can but it will always indicate readable+writable.

Anonymous said...

Man, I wish the Linux kernel people had gone with kqueue instead of inventing their own less-powerful thing.

The reason for preferring kqueue()/epoll() over poll() or select() for file descriptors is that they can efficiently deal with far more descriptors.

The BSD kqueue(), of course, can deal with a lot more types of events than just fd activity.

tutufan said...

Regarding epoll, I believe it's more efficient than select or poll for huge numbers of file descriptors...