Over the last few weeks I've been working on adding thread support to the SBCL Solaris/x86 port. Compared to what Cyrus Harmon has been going through with OS X and FreeBSD threads, I've had pretty smooth sailing so far. Only a few annoyances:

  • Some signal handling suckage: on Solaris you can only queue POSIX real-time signals to a process; there's no way to queue them for a specific thread like on Linux.

  • The Solaris API for x86 local descriptor table access is rather disgusting. For example, to find out which LDT indexes are in use, you need to read binary gunk from /proc directly into C structs. Oh, and you also need to know the magic LDT indexes that the system uses but that are not listed in the /proc file.

  • It's to be expected that all non-Linux SBCL porters will complain bitterly about gdb being absolutely useless on their platforms. Solaris is no exception here: both threads and signal handling are completely broken, as far as I can tell. But Solaris trumps the others by delivering at least two other debuggers that are also mostly useless (in different ways).

  • Futexes would've been nice...

  • Good Lord, the userland sucks. Do you know how many times a day I type something like "cp foo foo.`date -I`"? More often than I would've thought, before trying to use a system without date -I. At least 90% of my grep invocations seem to use the -r switch, which Solaris grep doesn't support. And where's my ps aux? AARGH! Yes, these are tiny issues with non-standard or obsolete switches, and different versions of the tools could probably be found in /usr/xgrgle5/bin/, or something. Doesn't make it any less annoying.
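To make the signal queueing complaint from the first bullet concrete, here's a minimal sketch of the portable POSIX call. Note that sigqueue() takes a pid_t, so the queued signal and its payload are addressed to the process as a whole, and the system picks a thread that doesn't have the signal blocked. The names on_rt_signal, last_payload and queue_to_own_process are made up for this illustration and have nothing to do with the SBCL runtime.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names for illustration only; not SBCL runtime code. */
static volatile sig_atomic_t last_payload = -1;

/* SA_SIGINFO-style handler: records the integer queued with the signal. */
static void on_rt_signal(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    last_payload = info->si_value.sival_int;
}

/* Install the handler for SIGRTMIN and queue one signal to our own
 * process. The target is a process id: there is no thread argument. */
int queue_to_own_process(int payload)
{
    struct sigaction sa;
    union sigval value;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_rt_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGRTMIN, &sa, NULL) != 0)
        return -1;

    value.sival_int = payload;
    return sigqueue(getpid(), SIGRTMIN, value);
}
```

In a single-threaded program this is harmless, since there is only one thread that can take the signal. With several threads you have to steer delivery indirectly, for example through signal masks.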

But I really shouldn't complain too much, since there have been some positive things too.

  • All the sadistic thread tests in the SBCL regression suite have been passing for a couple of weeks now. I've seen none of the kernel panics and mysterious crashes that Cyrus has been having on the other OSs.

  • I've only had one really bizarre OS-related bug so far, and I could solve it by reading the Solaris source code. Also, it turned out to be my fault in the end. (An issue with the semantics of the x86 single-step flag in signal handlers, if you want to know.)

  • In contrast to the debugger suckage, it's great to have tracing tools that really work. Like an strace, er, truss (AARGH) that doesn't barf on SIGSEGVs! And the much-hyped dtrace turned out to be pretty useful in figuring out some hairy timing issues.

The main issue that needs to be solved before this work can be merged to HEAD is dealing with releasing the memory allocated for the pthread condition variable / mutex pair that constitutes a lutex. On Linux we don't need to do this thanks to the magic of futexes. This appears to be solved as of 30 minutes ago: I've been writing this post during the agonizingly slow recompiles, and the very latest version is passing all existing tests and not leaking memory. Some further testing and cleanups required, but looking good.
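For readers who haven't looked at the branch, here is a rough sketch of why the cleanup matters. A futex is just a word in memory, so there is nothing to release when the Lisp object goes away. A pthread mutex and condition variable, on the other hand, may own resources that only the destroy calls give back. The struct layout and the lutex_sketch_* names below are assumptions for illustration, not the actual SBCL code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical layout; the real SBCL lutex structure may differ. */
struct lutex_sketch {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
};

/* Allocate and initialize the pair. Returns NULL on failure. */
struct lutex_sketch *lutex_sketch_create(void)
{
    struct lutex_sketch *l = malloc(sizeof *l);
    if (l == NULL)
        return NULL;
    if (pthread_mutex_init(&l->mutex, NULL) != 0) {
        free(l);
        return NULL;
    }
    if (pthread_cond_init(&l->cond, NULL) != 0) {
        pthread_mutex_destroy(&l->mutex);
        free(l);
        return NULL;
    }
    return l;
}

/* Destroy both primitives, then free the memory. The caller must make
 * sure that no thread still holds the mutex or waits on the condition.
 * Returns 0 on success, or the pthread error code on failure. */
int lutex_sketch_destroy(struct lutex_sketch *l)
{
    int rc;

    assert(l != NULL);
    rc = pthread_cond_destroy(&l->cond);
    if (rc != 0)
        return rc;
    rc = pthread_mutex_destroy(&l->mutex);
    if (rc != 0)
        return rc;
    free(l);
    return 0;
}
```

The destroy step is the easy part. The hard part is knowing when it's safe to run, which this sketch leaves entirely to the caller.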

So expect the first non-Linux SBCL port with thread support to land in CVS soonish. (Probably after Hamburg.) Special thanks to:

  • Cyrus for doing the first 20-something commits on the lutex-branch, pre-emptively fixing all kinds of things that I would've run into, probably sooner rather than later.

  • Nathan Froyd for the original lutex implementation. I suspect that we've replaced both the head and the handle by now, but it's still fundamentally the same axe.

  • Tellme for funding the Solaris/SBCL threads work.