Monday, June 10, 2013

How I use Python at our startup

Things are exciting over at Monetology, as just a few weeks ago we started accepting Beta signups for our first service offering: Homebase (go ahead and click over to check it out - I'll wait).  Now that we have reached this milestone, I've been thinking back on what has allowed us to get to this point.  One thing that I have to credit, personally, is a steady diet of Python.

But I don't want this post to just be a Python love-fest.  It seems that a lot of programming language discussions are of the form "language X is good (or bad, or better/worse than language Y)", without any kind of qualification.  I don't find that particularly useful; there are lots of good languages that are useful in one situation or another, but none that I would use all the time.  So instead, I'll try to describe where I find Python useful.

Where I don't... 

Actually, let me start by answering the opposite question.  Most notably, we don't use Python at all in our main app (the Homebase web service).  Homebase uses Java on the server and JSX on the client (which is basically just JavaScript with static typing, but that description undersells its awesomeness).  In a large program like this, static typing makes refactoring much easier; with a dynamically-typed language like Python, we'd be much more cautious about making large-scale refactorings, which in turn would slow our development speed.

Also, Java has great tooling, from debuggers, to heap inspectors, to performance analyzers, and the list goes on.  Big complicated apps fail in big complicated ways, and having powerful inspection tools at your disposal is invaluable at times like these.

And where I do...

1. Where I would otherwise use a shell script.


One of the things Python did right (and, as an aside, Java did wrong) is adhere closely to the typical shell commands and POSIX APIs, particularly via the shutil and os modules.  This makes it really easy to convert from shell scripts to Python code.  When calling out to other programs, os.system(...) works in simple cases, or the subprocess module can be used when you need a little more control (e.g. redirecting stdout or stderr).
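
For example, here's a rough sketch of both styles (the mongodump invocation and paths are just illustrative):

import os
import subprocess

# Simple case: fire-and-forget, just like a line of shell.
os.system('mkdir -p /tmp/dumps')

# More control: capture stderr and check the exit code ourselves.
proc = subprocess.Popen(['mongodump', '--out', '/tmp/dumps'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate()
if proc.returncode != 0:
    raise RuntimeError('mongodump failed: %s' % err)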

I still use bash scripts from time to time, but my tolerance is only about 15-20 lines of shell script; beyond that point I am likely to convert the script to Python, since the conversion is so easy and the resulting program so much easier to manage.

Some examples of my scripts that fall into this category are:
  • cronify.py (updates the crontabs on our servers when they change)
  • mongodump.py (calls mongodump on our database, excluding a few databases that we don't want to back up, tar-gzips the dumpfiles, uploads the tarball to Amazon S3, and cleans up after itself)
  • my-ssh (ssh's into the specified server, which requires tunneling through a gateway, using a non-standard ssh port, and other things that I don't care to remember every time)

2. For tasks on remote servers.


Fabric, a library for performing actions over SSH, is one of the many fantastic Python libraries available[1].  Here is a simple example of how to use Fabric to copy a script to a remote server and make it executable.
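
(A minimal sketch, assuming Fabric 1.x; the script name and /tmp path are taken from the transcript below.)

# simple-fabric.py
from fabric.api import cd, put, run

def deploy():
    # Copy the script to the remote server...
    put('server.py', '/tmp/server.py')
    # ...and make it executable.
    with cd('/tmp'):
        run('chmod 755 server.py')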


This is invoked from the command line like:
fab -f simple-fabric.py -H user@server0.company.com deploy

And the output looks like:
[user@server0.company.com] Executing task 'deploy'
[user@server0.company.com] put: server.py -> /tmp/server.py
[user@server0.company.com] run: chmod 755 server.py

Done.
Disconnecting from user@server0.company.com... done.

Want to execute this over multiple servers?  Just add more identifiers to the -H argument and it will deploy to all of them:
fab -f simple-fabric.py -H user@server0.company.com,user@server1.company.com,user@server2.company.com deploy

Even for relatively simple tasks, I find that it's worthwhile to make a Fabric script to run the commands rather than manually SSHing in and running them on the command line.  Firstly, it ensures that afterwards you'll know exactly what commands you ran (because you have them stored in a file on disk), which is obviously useful if things went sideways and you are trying to figure out why.  Secondly, it helps protect against mistakes, since you can try out your Fabric script on one server, make sure it works, and then execute the exact same commands on any number of other servers.  If you are following a manual script and typing in each command, you run the risk of fat-fingering or omitting a step.

I could say a lot more about the awesomeness of Fabric, but in the interest of (semi-)brevity I'll leave just one more tip.  The example above showed how to run the fab program; the Python code we wrote was just a script that was executed by Fabric.  This is fine for little scripts, but for larger programs it can be useful to embed Fabric as a library.  Here's an example of how to do that:
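
(Again a minimal sketch, assuming Fabric 1.x; execute() and disconnect_all() are the key pieces, and the deploy task is the same one from above.)

#!/usr/bin/env python
# embedded-fabric.py
import sys

from fabric.api import cd, execute, put, run
from fabric.network import disconnect_all

def deploy():
    put('server.py', '/tmp/server.py')
    with cd('/tmp'):
        run('chmod 755 server.py')

if __name__ == '__main__':
    # Host strings come straight from the command line instead of fab's -H.
    try:
        execute(deploy, hosts=sys.argv[1:])
    finally:
        disconnect_all()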

You can execute this as follows (without using the fab binary):
./embedded-fabric.py user@server0.company.com [...]


3. For internal "mini-servers".


I've used Python to create various small-ish, internal-only services that satisfy some of our ops needs.  There are a whole bunch of popular web frameworks in Python, but I am partial to webapp2 for a few reasons:
  1. It's the default web framework on AppEngine, and thus using webapp2 everywhere makes it easier to move scripts onto or off of AppEngine.
  2. It doesn't "take over" your program.  I hate frameworks where you just provide some handlers and the framework magically does all the rest (which is hard/impossible to customize or to debug when things go wrong).  Python in general isn't as bad at this as Java ("just add your filter to the web.xml and I'll take care of the rest!" bleh) but in all cases I prefer systems where I explicitly create and then start some kind of server object.
  3. (minor) It avoids decorators, which I have a personal distaste for.

Here's the basic template that I start with whenever I create a new web server:
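
(What follows is a minimal sketch of that template; the handler, host, and port are illustrative.)

#!/usr/bin/env python
import webapp2
from paste import httpserver

class MainHandler(webapp2.RequestHandler):
    def get(self):
        self.response.write('Hello from a mini-server!')

app = webapp2.WSGIApplication([
    ('/', MainHandler),
], debug=True)

def main():
    # webapp2 only handles routing; paste supplies the actual HTTP server.
    httpserver.serve(app, host='127.0.0.1', port='8080')

if __name__ == '__main__':
    main()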

Note that webapp2 does not have a built-in web server (webapp2 is just in charge of defining URL routes and the associated handlers to be invoked), but paste provides a web server implementation that you can drop in pretty easily (as above).

Here is a modification that shows (a) how to serve static content under a directory and (b) how to share objects between handlers via the app registry:
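
(Another sketch; the ./static directory and the 'db' registry key are illustrative, and paste's Cascade and StaticURLParser do the static file serving.)

#!/usr/bin/env python
import webapp2
from paste import httpserver
from paste.cascade import Cascade
from paste.urlparser import StaticURLParser

class StatusHandler(webapp2.RequestHandler):
    def get(self):
        # (b) Objects stashed in the app registry are visible to every handler.
        db = self.app.registry['db']
        self.response.write('connected to: %s' % db)

app = webapp2.WSGIApplication([
    ('/status', StatusHandler),
], debug=True)
app.registry['db'] = 'mongodb://localhost:27017'  # any shared object works

def main():
    # (a) webapp2 routes are tried first; 404s fall through to files in ./static.
    httpserver.serve(Cascade([app, StaticURLParser('static/')]),
                     host='127.0.0.1', port='8080')

if __name__ == '__main__':
    main()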


So far, I've used this webapp2+paste stack to create the following Homebase.IO web servers:

  • ekg - Regularly pings our web servers to ensure they are up and serving, alerting us (via OpsGenie) if too many requests in a row fail.  Runs on AppEngine.
  • Conan the Deployer - Receives build packages from our buildbot instance and deploys them (via Fabric) to one of our server environments (e.g. staging, production).
  • Homerview (Homebase + Overview) - Simple, high-level dashboard that aggregates a bunch of signals about each service and heuristically distills them down into a happy, neutral or sad face, depending on the perceived service health (or a skull if the service isn't up at all).

And there we have it... three ways that Python helps me get my job done and (I hope!) indirectly helps make Homebase.IO awesome.  Feel free to disagree with me in the comments, or if you want to hear more of my yammering you can follow me on Google+ or Twitter.



[1] As an aside, Python's mature ecosystem of core and third-party libraries is one of the reasons that I still favor Python over Go. Go seems like a very promising language, and several of my coworkers at Monetology have had great experiences using it, but a natural consequence of its youth is the relative lack of libraries.

Sunday, January 13, 2013

MongoDB: Avoid the "majority" write concern when using arbiters

Here at Monetology we use MongoDB for most of our storage needs.  I'm not at all anti-relational and in fact I tend to be suspicious of a lot of the claims that relational DBs "don't scale to today's data" or "weren't designed for the web."  NoSQL isn't a magic pill that suddenly imparts awesome scalability.

Nonetheless, MongoDB has a lot of nice properties that make it a good choice for a variety of applications.  In particular, it's easy to understand much of what it's doing.  If you have a decent background in distributed systems (I do) and file systems (I don't, but I'm working on that) then MongoDB just makes sense much of the time.

Of course, this doesn't stop me from screwing things up...  A few nights ago was one of those times.

Recently I've been familiarizing myself with replica sets, and I tried out the "majority" write concern.  Things worked fine when all mongods were up, but as soon as I killed one, our app would lock up; a new primary would get elected (if necessary) but all writes would hang.  After a lot of banging my head against a wall I realized why: I was only using 2 full replicas and 1 arbiter.  After killing a full replica, I still had a majority of nodes available (2 out of 3), but that's not what the majority write concern means.  Instead, it requires that your write be received by a majority of replicas.  Since I only had 1 full replica still available to receive writes, all writes were blocking as they waited for a second replica to appear.
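
To make the failure concrete, here's a sketch of the write path in pymongo (recent pymongo API; the hostnames and collection names are illustrative).  With 2 data-bearing nodes and 1 arbiter, this write blocks as soon as one data node goes down:

from pymongo import MongoClient
from pymongo.errors import WTimeoutError
from pymongo.write_concern import WriteConcern

client = MongoClient('mongodb://server0:27017,server1:27017/?replicaSet=rs0')
coll = client.mydb.get_collection(
    'events', write_concern=WriteConcern(w='majority', wtimeout=5000))

try:
    coll.insert_one({'hello': 'world'})
except WTimeoutError:
    # 'majority' counts data-bearing members, and only one is left standing.
    print('write never reached a majority of replicas')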

There are two obvious solutions: either upgrade the arbiter to a full replica, or reduce the write concern from "majority" to "1".  Although one could conceive of some scenario where a majority write concern wouldn't be totally pointless with a replica set containing an arbiter (e.g. maybe a 5-node replica set with 4 full replicas and 1 arbiter), it seems to me like this makes for a slightly confusing combination and I'd advise avoiding it.

Thursday, January 10, 2013

Maiden Voyage

I've been working at Monetology, an 8-person startup, for about 6 months now.  It's been a great experience in many ways and I'm learning a lot about running systems "in the real world."  I started down that road during my time at Google, but unfortunately our project was cancelled mere weeks before we were scheduled to go live[1], so I missed out on the most exciting stuff.

But now with Monetology I've been able to pick up and learn a bunch of new technologies, and as we move closer to a product launch I'm looking forward to the on-the-job experience that comes from running a live service.  I'll try to capture a few of these discoveries and lessons here.

So I think it's time to start one of these "web-logs" that I've been hearing so much about...

[1] As it happens, the entire Atlanta engineering office was closed.  But that's a story for another time...