Sunday, January 13, 2013

MongoDB: Avoid the "majority" write concern when using arbiters

Here at Monetology we use MongoDB for most of our storage needs.  I'm not at all anti-relational and in fact I tend to be suspicious of a lot of the claims that relational DBs "don't scale to today's data" or "weren't designed for the web."  NoSQL isn't a magic pill that suddenly imparts awesome scalability.

Nonetheless, MongoDB has a lot of nice properties that make it a good choice for a variety of applications.  In particular, it's easy to understand much of what it's doing.  If you have a decent background in distributed systems (I do) and file systems (I don't, but I'm working on that) then MongoDB just makes sense much of the time.

Of course, this doesn't stop me from screwing things up...  A few nights ago was one of those times.

Recently I've been familiarizing myself with replica sets, and I tried out the "majority" write concern.  Things worked fine when all mongods were up, but as soon as I killed one, our app would lock up; a new primary would get elected (if necessary) but all writes would hang.  After a lot of banging my head against a wall I realized why: I was only using 2 full replicas and 1 arbiter.  After killing a full replica, I still had a majority of nodes available (2 out of 3), which is enough to elect a primary, but that's not what the majority write concern means.  Instead, it requires that your write is acknowledged by a majority of the set's members, and an arbiter holds no data, so it can never acknowledge a write.  Since I only had 1 full replica still available to receive writes, all writes were blocking as they waited for a second replica to appear.
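To make the arithmetic concrete, here is a small Python sketch of the situation.  It is a simplified model and not MongoDB's actual code: the function names are made up for illustration, and it assumes every member votes and that only full (data-bearing) replicas can acknowledge a write.

```python
def majority_needed(data_bearing: int, arbiters: int) -> int:
    """Votes (or acknowledgements) that make a majority of the whole set."""
    voting = data_bearing + arbiters
    return voting // 2 + 1


def can_elect_primary(data_up: int, arbiters_up: int,
                      data_total: int, arbiters_total: int) -> bool:
    """Arbiters vote, so they count here; a primary must still hold data."""
    needed = majority_needed(data_total, arbiters_total)
    return data_up >= 1 and data_up + arbiters_up >= needed


def majority_write_succeeds(data_up: int, data_total: int,
                            arbiters_total: int) -> bool:
    """Arbiters hold no data, so only live full replicas can acknowledge."""
    return data_up >= majority_needed(data_total, arbiters_total)


# My setup: 2 full replicas + 1 arbiter, with one full replica killed.
print(can_elect_primary(data_up=1, arbiters_up=1, data_total=2, arbiters_total=1))
print(majority_write_succeeds(data_up=1, data_total=2, arbiters_total=1))
```

In this model the first check passes and the second fails, which matches what I saw: a primary gets elected, and then every "majority" write sits there waiting.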

There are two obvious solutions: either upgrade the arbiter to a full replica, or reduce the write concern from "majority" to "1".  Although one could conceive of some scenario where a majority write concern wouldn't be totally pointless with a replica set containing an arbiter (e.g. maybe a 5-node replica set with 4 full replicas and 1 arbiter), it seems to me like this makes for a slightly confusing combination and I'd advise avoiding it.
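The 5-node case can be worked through with the same kind of back-of-the-envelope model.  This is only a sketch under stated assumptions (every member votes, arbiters stay up, only full replicas acknowledge writes), and `failures_tolerated` is a hypothetical helper, not anything from MongoDB or its drivers.

```python
def majority(voting_members: int) -> int:
    """Smallest count that is more than half of the voting members."""
    return voting_members // 2 + 1


def failures_tolerated(data_total: int, arbiters_total: int) -> tuple:
    """Return (for_majority_writes, for_elections): how many full replicas
    can be down before each one stops working, assuming arbiters stay up."""
    needed = majority(data_total + arbiters_total)
    # Writes: the surviving full replicas alone must reach the majority.
    for_writes = max(data_total - needed, 0)
    # Elections: arbiters vote too, but at least one full replica must
    # survive to become primary.
    for_elections = max(min(data_total - 1,
                            data_total + arbiters_total - needed), 0)
    return for_writes, for_elections


for data, arbiters in [(2, 1), (4, 1), (3, 0)]:
    print(data, "full +", arbiters, "arbiter(s):", failures_tolerated(data, arbiters))
```

Under these assumptions, 4 full replicas plus an arbiter can lose one full replica and still satisfy "majority", but losing a second one stalls writes even though a primary can still be elected.  That gap is exactly the confusing part.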

Thursday, January 10, 2013

Maiden Voyage

I've been working at Monetology, an 8-person startup, for about 6 months now.  It's been a great experience in many ways and I'm learning a lot about running systems "in the real world."  I started down that road during my time at Google, but unfortunately our project was cancelled mere weeks before we were scheduled to go live[1] so I missed out on the most exciting stuff.

But now with Monetology I've been able to pick up and learn a bunch of new technologies, and as we move closer to a product launch I'm looking forward to the on-the-job experience that comes from running a live service.  I'll try to capture a few of these discoveries and lessons here.

So I think it's time to start one of these "web-logs" that I've been hearing so much about...

[1] As it happens, the entire Atlanta engineering office was closed.  But that's a story for another time...