Thursday, August 28, 2008

In case you're wondering about the differences among select(2), poll(2), and the device-based polling approaches, they all do fundamentally the same thing: notify the program when a socket is ready for reading or writing, or when an error occurs on a socket. Select() came first in the history of UNIX, but it has a basic limit on the number of sockets that can be monitored: 1024 in most cases (the FD_SETSIZE constant), unless you recompile. Poll() solves this problem; it requires the programmer to allocate the array of socket structures that she passes to the system call, so in theory you can monitor an arbitrary number of sockets. (Poll() is also superior to select() in that it gives the programmer finer control over the kinds of events to watch for.)

The problem with poll() only really shows up on servers that handle an enormous number of sockets. That problem is, simply, that the array of socket structures must be copied from user space to kernel space and back again every time you call poll(). As far as I recall, this performance bottleneck came to light in the early 2000s when people were doing research on the scalability of Linux, but that's just a vague memory; the first implementation of a fix could have been done in FreeBSD or Solaris.

At any rate, the user-kernel-user copy problem is the reason for what I call the device-based approaches (because they involve a file in /dev). With epoll (as it's known on Linux), the program tells the operating system which particular sockets to monitor, and the operating system tells the program when a particular socket has changed. It reports only the sockets that changed. It doesn't say, "Hey, here is the entire array of sockets you care about, and it's up to you to examine the structures to figure out which ones changed." So device-based polling is useful if you are polling a large number of sockets. Otherwise, plain old poll() should work just fine.
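
Here's a minimal sketch of that register-once model in Java, using java.nio's Selector. As far as I know, the JDK's default selector on Linux is typically backed by epoll, but treat that as an assumption rather than a guarantee. The class and method names (ReadinessSketch, roundTrip) are made up for illustration. The program registers each channel once, and every select() call hands back only the keys that are ready.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical demo class: sends a message over loopback and reads it back
// on the server side, driven entirely by readiness notifications.
class ReadinessSketch {

    static String roundTrip(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false);
            // Register interest once; the selector remembers it across select() calls.
            server.register(selector, SelectionKey.OP_ACCEPT);

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.write(ByteBuffer.wrap(payload));

                ByteBuffer buffer = ByteBuffer.allocate(payload.length);
                SocketChannel accepted = null;
                try {
                    while (buffer.hasRemaining()) {
                        selector.select(); // blocks until some registered channel is ready
                        Iterator<SelectionKey> ready = selector.selectedKeys().iterator();
                        while (ready.hasNext()) {
                            SelectionKey key = ready.next();
                            ready.remove(); // we only ever see the keys that changed
                            if (key.isAcceptable()) {
                                SocketChannel channel = server.accept();
                                if (channel != null) {
                                    accepted = channel;
                                    accepted.configureBlocking(false);
                                    accepted.register(selector, SelectionKey.OP_READ);
                                }
                            } else if (key.isReadable()) {
                                if (((SocketChannel) key.channel()).read(buffer) < 0) {
                                    throw new IOException("peer closed before sending everything");
                                }
                            }
                        }
                    }
                } finally {
                    if (accepted != null) {
                        accepted.close();
                    }
                }
                return new String(buffer.array(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping"));
    }
}
```

With only two channels this buys nothing over poll(); the point is the shape of the loop, which never walks the full set of registered sockets.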

Friday, August 15, 2008

Thinking about domain-specific languages (DSLs) ... Generally, it is easier to keep track of the role of each argument to some function/method in languages with keyword parameters (e.g. Python, Ruby). Names are easier to remember than positions in a parameter list. In a language without keyword parameters, how do you make it easy to remember which parameter is what (putting aside for the moment the usefulness of IDEs in displaying the function/method signature for you)? Here's an example of how to do that in Java. Take this function:

public void validateState(PBXConference conference, int added, int connecting, int connected, int disconnecting, int disconnected) {
    assertEquals(conference.added(), added);
    assertEquals(conference.connecting(), connecting);
    assertEquals(conference.connected(), connected);
    assertEquals(conference.disconnecting(), disconnecting);
    assertEquals(conference.disconnected(), disconnected);
}

A client would invoke it like:
validateState(conference, 2, 1, 1, 0, 0);

but that sequence of numbers doesn't help the readability of the test. So instead, while it's a bit more verbose, we can change the function definition to:
public class ConferenceStateValidator {
    private PBXConference conference;

    // Private: clients obtain an instance through validateState() below.
    private ConferenceStateValidator(PBXConference conference) {
        this.conference = conference;
    }

    // Each check returns this, so calls can be chained.
    public ConferenceStateValidator added(int n) {
        assertEquals(conference.added(), n);
        return this;
    }

    public ConferenceStateValidator connecting(int n) {
        assertEquals(conference.connecting(), n);
        return this;
    }

    public ConferenceStateValidator connected(int n) {
        assertEquals(conference.connected(), n);
        return this;
    }

    public ConferenceStateValidator disconnecting(int n) {
        assertEquals(conference.disconnecting(), n);
        return this;
    }

    public ConferenceStateValidator disconnected(int n) {
        assertEquals(conference.disconnected(), n);
        return this;
    }

    public static ConferenceStateValidator validateState(PBXConference conference) {
        return new ConferenceStateValidator(conference);
    }
}

And the client (assuming they've statically imported validateState) can do:
validateState(conference).added(2).connecting(1).connected(1).disconnecting(0).disconnected(0);

which is much cleaner.

Wednesday, August 13, 2008

From Mozilla Labs, an idea whose time is coming. I was struck by this bit though.

Our next step is to gather feedback on the prototype and the ideas behind it. We want to know if the concept has promise and is worth pursuing further. We’re particularly interested in feedback on how messaging might fit into the browsing experience and if there are other interfaces (or refinements to the two interfaces built into the prototype) that would make it easier for users to have online conversations.

We’re still considering what may come after that, but possible extensions to the Snowl prototype include:

  • support for additional message sources, e.g. Facebook, AIM, Google Talk, etc.;
  • an interface for writing and sending messages to enable true two-way conversations;

Since Facebook and Google Talk already support or are going to support XMPP, the only question is whether Snowl will support it too. Other chat services should just follow suit. In other words, there's no point in mentioning "additional message sources" when all of those sources use XMPP. Just mention XMPP!

Thursday, August 07, 2008

Maybe a couple of months ago some blogger whose feed is aggregated at Planet Intertwingly (I don't remember who) wrote a post summarizing his complaints about Erlang. One complaint was about extracting a value from a tuple. Say you assign a tuple to some variable, as in

1> X = {a, 10}.
and you want the value of the second element of the tuple. How do you do that? Conventionally,
2> {_, Y} = X.
3> Y.
Now the variable Y has the value 10. (In Erlang, the underscore is the anonymous variable.) If a tuple has a large number of fields or contains nested tuples, such tuple-unpacking statements are unwieldy. And that's what the guy objected to. It's too verbose and easy to botch.

Fortunately, pattern matching provides a simple way to work around this. Define a function that matches the tuple (particularly the first atom):
4> ValueOf = fun({a, Value}) -> Value end.
5> ValueOf(X).
If you define one function for each element in a tuple, you get accessor functions for the tuple and you don't have to keep writing long expressions with a lot of anonymous variables.

Saturday, August 02, 2008

Here's the second GPS email for Bob's hike:

SPOT Check OK. All is well !! Bob
Nearest Location: Elk Lake, United States
Distance: 0 km(s)
Time: 07/31/2008 22:35:38 (US/Mountain)

A while ago I tweeted about the possibility of the growth of more continental/regional/local manufacturing in response to the growing costs of transportation. Twitter has some problems preventing me from getting the permalink; even so, it sort of goes without saying that that will happen. Anyway, the Times now has an article about the phenomenon.