Tuesday, May 20, 2008

What's up with Digg.com?

I was testing Factor's http.client library with digg.com, and I was just getting timeouts.

Connecting with telnet now:
$ telnet digg.com 80
Trying 64.191.203.30...
Connected to digg.com.
Escape character is '^]'.
GET / HTTP/1.1
host: digg.com

Connection closed by foreign host.

So telnet knows the connection is being dropped, but Factor doesn't detect this. I thought this might be a bug in Factor, but with every other site I've tried, Factor correctly detects a closed connection. With Digg, read() on the socket always returns -1 with errno set to EAGAIN. I'm not even sure how telnet detects that the connection is being dropped.

To make matters even more confusing, Java doesn't detect the socket was closed either; the following testcase hangs:
import java.net.*;
import java.io.*;

public class DiggTest
{
    public static void main(String[] args) throws IOException
    {
        Socket s = new Socket("digg.com",80);

        Writer w = new OutputStreamWriter(s.getOutputStream());
        Reader r = new InputStreamReader(s.getInputStream());

        // Minimal request, with no User-Agent header
        w.write("GET / HTTP/1.1\r\nHost: www.digg.com\r\n\r\n");
        w.flush();

        // Hangs here: the read never returns
        int ch = r.read();
    }
}
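
To keep a testcase like this from hanging forever while investigating, one option is to put a read timeout on the socket. The following is only a sketch under some assumptions: the class name ReadTimeoutSketch and the probe method are made up for illustration, and the demo in main talks to a local ServerSocket that never replies rather than to digg.com, so it does not reproduce the reset itself. It only shows that with Socket.setSoTimeout set, a read that would otherwise block ends in a SocketTimeoutException.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

public class ReadTimeoutSketch
{
    // Hypothetical helper: sends a minimal request and reports how the
    // first read() ends: "data", "eof" or "timeout". Any other failure
    // (for example a reset reported by the OS) propagates as an IOException.
    public static String probe(String host, int port, int timeoutMillis) throws IOException
    {
        try (Socket s = new Socket(host, port))
        {
            // Without this call, read() may block indefinitely
            s.setSoTimeout(timeoutMillis);

            OutputStream out = s.getOutputStream();
            String request = "GET / HTTP/1.1\r\nHost: " + host + "\r\n\r\n";
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            InputStream in = s.getInputStream();
            try
            {
                int b = in.read();
                return b == -1 ? "eof" : "data";
            }
            catch (SocketTimeoutException e)
            {
                return "timeout";
            }
        }
    }

    public static void main(String[] args) throws IOException
    {
        // Local stand-in for a silent server: the connection is accepted
        // by the OS backlog, but nothing ever answers the request.
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress()))
        {
            String host = silent.getInetAddress().getHostAddress();
            System.out.println(probe(host, silent.getLocalPort(), 500));
        }
    }
}
```

A timeout only bounds the wait; it says nothing about why the reset goes unnoticed, which is still the open question here.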

It turns out a TCP RESET packet is being sent, but for whatever reason, Factor and Java don't receive any notification of it, whereas telnet does.

I'm completely stumped at this point. Anybody have an idea of what's going on?

Update: As Jason points out in the comments, adding a User-Agent makes it work. I was aware of this already, but that's not the interesting part; what puzzles me is why Factor (and Java) don't pick up on the socket being closed.
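
For reference, here is roughly what the request looks like with a User-Agent header added. This is a small sketch, not the exact code I ran: the class name UserAgentRequest and the agent string "ExampleClient/0.1" are placeholders, and I don't know which User-Agent values Digg accepts.

```java
public class UserAgentRequest
{
    // Builds the same minimal GET request as the testcase, plus a
    // User-Agent header. Both arguments are supplied by the caller.
    public static String build(String host, String userAgent)
    {
        return "GET / HTTP/1.1\r\n"
            + "Host: " + host + "\r\n"
            + "User-Agent: " + userAgent + "\r\n"
            + "\r\n";
    }

    public static void main(String[] args)
    {
        // "ExampleClient/0.1" is a placeholder value, not a known-good one
        System.out.print(build("www.digg.com", "ExampleClient/0.1"));
    }
}
```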

2 comments:

BBC said...

No clue about the Reset, but if you add a User-Agent header it seems to work.

kirillkh said...

Windows telnet times out as well.