Comment support and thread safety

Posted on 2006-01-07 in Lisp

My blog now has simple comment support. This turned out to be a much better way to procrastinate than I'd expected. Prior to this, everything in the system has been stored in files, but I decided that comments were better off being stored in a database. In retrospect, this probably wasn't such a good idea.

The initial coding took a few hours, much of which was spent figuring out the right way to use CLSQL. I was especially bitten by the caching that CLSQL does, which caused some problems that were hard to diagnose. After being bitten a couple of times I just turned it off. Having caching on by default seems like a bad choice.

The real fun started once I decided to check that the system would still work with a light load, and ran ab (ApacheBench) with 5 concurrent processes accessing the web server. It failed on alarmingly many requests. So I got to spend most of the day debugging threads.

First problem in SB-BSD-SOCKETS. gethostbyname and gethostbyaddr return data in statically allocated buffers, which will be overwritten by the next call. SB-BSD-SOCKETS accounts for this by copying the data immediately after the call. However, it's possible for one thread to overwrite the buffer before another thread has had time to copy the data to safety. Boom!
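One blunt way to close a race like this from the application side, short of fixing SB-BSD-SOCKETS itself, is to serialize every lookup behind a single mutex. The sketch below only helps if every caller in the process goes through the wrapper. *RESOLVER-LOCK*, CALL-WITH-RESOLVER-LOCK and LOCKED-GET-HOST-BY-NAME are made-up names for illustration, not part of SBCL.

```lisp
(require :sb-bsd-sockets)

;; Hypothetical global lock: at most one thread may be inside a lookup,
;; so nobody can clobber the static buffer before the copy is made.
(defvar *resolver-lock* (sb-thread:make-mutex :name "resolver lock"))

(defun call-with-resolver-lock (thunk)
  "Call THUNK while holding *RESOLVER-LOCK* and return its values."
  (sb-thread:with-mutex (*resolver-lock*)
    (funcall thunk)))

(defun locked-get-host-by-name (name)
  "Like SB-BSD-SOCKETS:GET-HOST-BY-NAME, but serialized across threads."
  (call-with-resolver-lock
   (lambda () (sb-bsd-sockets:get-host-by-name name))))
```

Something like (locked-get-host-by-name "localhost") would then be used wherever the application previously called the unlocked function directly.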

Second problem in the SBCL internal caches, which caused occasional nonsense errors like "STRING is a bad type specifier for sequences". Surprisingly easy to trigger once you figure out what's going on:

(defun random-type (n)
  `(integer ,(random n) ,(+ n (random n))))
(defun one-test ()
  (dotimes (i 10000)
    (let ((type1 (random-type 500))
          (type2 (random-type 500)))
      (let ((a (subtypep type1 type2)))
        (dotimes (i 100)
          (assert (eq (subtypep type1 type2) a))))))
  (format t "ok~%")
  (force-output))
(defun test ()
  (dotimes (i 10)
    (sb-thread:make-thread #'one-test)))

The heavy-handed solution is to sprinkle some magic pixie locks on all the functions created by DEFINE-HASH-CACHE. Unfortunately these functions are called very often, and the mutex overhead caused a 50% slowdown in the average page generation time. Definitely not committable in this state. To make matters worse, the locks need to be recursive, so spinlocks as currently implemented were not an option.
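To illustrate why the locks have to be recursive, here is a toy memoizing function whose cached body calls itself while the lock is still held. It is unrelated to the real DEFINE-HASH-CACHE machinery, and all the names in it are hypothetical. With a plain SB-THREAD:WITH-MUTEX the inner call would try to grab a lock that the same thread already owns.

```lisp
;; Toy global cache shared by all threads (not SBCL's internal cache).
(defvar *depth-cache* (make-hash-table :test #'equal))
(defvar *depth-cache-lock* (sb-thread:make-mutex :name "depth cache lock"))

(defun cached-depth (tree)
  "Return the nesting depth of TREE, memoized in *DEPTH-CACHE*."
  ;; The recursive call below re-enters this function on the same thread,
  ;; so the lock must allow re-acquisition by its current owner.
  (sb-thread:with-recursive-lock (*depth-cache-lock*)
    (multiple-value-bind (value foundp) (gethash tree *depth-cache*)
      (if foundp
          value
          (setf (gethash tree *depth-cache*)
                (if (atom tree)
                    0
                    (1+ (reduce #'max (mapcar #'cached-depth tree)
                                :initial-value 0))))))))
```

Every lookup, hit or miss, pays for acquiring the mutex, which is the kind of overhead that hurts when the cached functions sit on a hot path.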

Third problem between keyboard and chair, though I'll happily assign some of the blame to CLSQL. I forgot to specify the database for one call to SELECT, and it ended up using *DEFAULT-DATABASE*. This wouldn't have been too bad, except that WITH-DATABASE has a really strange feature: instead of just binding the connection to the specified variable it will also SETF it before establishing the new binding. I.e.

CL-USER> (progn
           (setf clsql::*default-database* nil)
           (clsql:with-database (clsql::*default-database* *db-spec*))
           clsql::*default-database*)
#<CLSQL-POSTGRESQL-SOCKET:POSTGRESQL-SOCKET-DATABASE localhost:7432/blog/jsnell CLOSED {100289EF71}>

Due to the above points the same connection ended up getting used from multiple threads at the same time in certain circumstances, with predictably bad results. I spent a couple of merry hours debugging this as an fd-stream problem before realizing the mistake.
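The difference between assigning and binding can be shown without a database at all. The sketch below is not CLSQL's source. It uses two hypothetical macros and a stand-in variable to mimic the behaviour described above: one SETFs the global value before rebinding, the other only binds. In SBCL the global value of a special variable is shared by all threads, while a LET binding is local to the thread that made it, which is why the leaked value is the dangerous one.

```lisp
;; Stand-in for a global default such as CLSQL's *DEFAULT-DATABASE*.
(defvar *connection* nil)

(defmacro with-leaky-connection ((var value) &body body)
  "Hypothetical: SETF VAR globally first, then rebind it around BODY."
  `(progn
     (setf ,var ,value)
     (let ((,var ,var))
       ,@body)))

(defmacro with-bound-connection ((var value) &body body)
  "Hypothetical: only establish a dynamic binding around BODY."
  `(let ((,var ,value))
     ,@body))

(defun value-after-leaky ()
  "Global value of *CONNECTION* once the leaky macro has exited."
  (setf *connection* nil)
  (with-leaky-connection (*connection* :conn-a) nil)
  *connection*)

(defun value-after-bound ()
  "Global value of *CONNECTION* once the binding-only macro has exited."
  (setf *connection* nil)
  (with-bound-connection (*connection* :conn-b) nil)
  *connection*)
```

After the leaky form exits, the global still holds the stale value, so any call that falls back to the default will pick it up. Passing the connection explicitly to every call sidesteps the problem.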

After fixing these the Araneida instance has now survived without errors for 100000 requests with 10 concurrent client processes and another 100000 with 20 clients. That should suffice for now, and it doesn't really matter that handling the average page request takes 25ms instead of 18ms.

Comments

By Craig Turner on 2006-01-08

I've been working through Practical Common Lisp today, and thought I'd do some procrastination of my own by checking out planet.lisp.org where I saw your post. This is a bit offtopic, but I'm thinking about relational databases and lisp at the moment and you triggered that...

One thing that concerns me every time I look at lisp is that as far as I can see I won't be able to use a persistence framework without writing one myself.

I come from a webobjects background and like to use such a tool in my projects. In webobjects, the persistence framework is called 'the EOModel', but the one I've been using for the last twelve months is called cayenne (which I'm told is similar to hibernate - all of these are java-based).

All these strategies are heavily object-oriented. You create wrapper classes for each table. Within the application there are one to many object graphs which the application builds up before saying 'commit' causing them all to get written to the database (SQL and alignment of ids are taken care of).

Is there a lispy way of achieving this sort of outcome with a relational database?

One workaround I've been thinking I might be able to use in lisp would be just to do a lot more in stored procedures on the database side. But I'm not a big fan of this approach and my gut feel tells me that it won't be as powerful as a persistence model.

By Juho Snellman on 2006-01-08

CLSQL has some object-relational mapping support (DEF-VIEW-CLASS, :JOIN-CLASS). Have you found this inadequate?

I haven't really looked at the CLSQL OR stuff in great detail, since 2.5 years of using Hibernate at my previous jobs has left me sceptical of high abstraction level/"idiomatic" OR mapping layers.

(Incidentally, we first found out about Hibernate from a comp.lang.lisp posting, and it was indeed much better than the piece of crap we were using before that. Castor?).

I don't really know what the right solution is. Some people use object dbs in Lisp (Elephant, AllegroCache, bknr). Others store the data in-memory and only dump snapshots or transaction logs to disk.

By Kevin Rosenberg on 2006-01-09

I agree about caching. I don't like it either. But it is on by default for the sake of CommonSQL compatibility.

By Peter Van Eynde on 2006-01-11

Should the caches not be thread-specific? The sockets stuff is a clear bug.

Good work actually, this is very much needed stuff.

By Juho Snellman on 2006-01-11

The caches are global in nature, and would need to be kept coherent between threads. It seems to me that this would require as much locking as in my kludge, but maybe there's some neat lockless way to do it (my concurrency knowledge is pretty basic).

Peter probably already knows this, being the Debian SBCL maintainer, but for the benefit of others who read this: Gabor Melis has been systematically fixing various SBCL thread issues over the last 9 months or so.
