Bubbles! now uses Google App Engine

Just for kicks I decided to port the backend services for my Google Android app, Bubbles!, to use the Google App Engine.

For those of you who don’t know, Google’s App Engine is a “free” application hosting environment. It promises to scale Google-style (as long as you pay, of course).

The free account gives you 500MB of persistent storage, plus enough bandwidth and CPU for around 5 million page views a month, which is not bad for free.

Python

For some strange reason Python is the language of choice for App Engine. I’ve never really taken to Python, and I don’t really like languages where indentation is syntactically meaningful, but it wasn’t too painful after skimming some Python tutorials on the web and running through the App Engine one.

SDK

The SDK is only a couple of megabytes to download (you need Python 2.5 installed) and ran fine on my Windows XP desktop and Ubuntu laptop. Having said that, the SDK doesn’t give you much in the way of an IDE; it just gives you a dev web server and a tool to upload your application to the hosting platform. So I just used gedit and Notepad++ (which both have Python syntax highlighting) as my editors.

Platform Features

App Engine provides a pretty easy-to-use framework for building web applications in Python. The engine is WSGI compliant, so you can plug in any of the common Python frameworks such as Django, CherryPy, Pylons and web.py. Django seems to be the web app framework of choice.

Apart from a web application framework, the engine also provides APIs for email, image manipulation, URL fetching, users and data storage.

The Users API is pretty cool as it hooks into Google’s user accounts, so anyone with a Google account can log into your application (if you want them to).

The datastore is an object-based transaction engine with a SQL-like syntax. On the face of it, it is very clean and easy to use, but it’s here that I had the biggest headaches when porting the Bubbles! services.

Porting the services

Bubbles! uses 3 very simple services: popin, popout and getpops (where a POP is a point of presence). These services take input parameters from the query string and return JSON response strings.
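To give a feel for the shape of such a service, here is a minimal sketch of a plain WSGI callable that reads its parameters from the query string and answers with a JSON string. It is not the Bubbles! code: the handler name `getpops_app`, the parameter names `lat` and `lng`, and the response fields are all assumptions made for illustration, and it uses the current Python 3 standard library rather than the Python 2.5 runtime mentioned above.

```python
import json
from urllib.parse import parse_qs


def getpops_app(environ, start_response):
    """Hypothetical WSGI handler: query string in, JSON string out."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        lat = float(params["lat"][0])
        lng = float(params["lng"][0])
    except (KeyError, ValueError):
        # Missing or non-numeric input: report it as JSON too.
        body = json.dumps({"error": "lat and lng are required numbers"})
        status = "400 Bad Request"
    else:
        # A real service would look up nearby POPs here;
        # this sketch just echoes the position with an empty list.
        body = json.dumps({"lat": lat, "lng": lng, "pops": []})
        status = "200 OK"
    payload = body.encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(payload)))])
    return [payload]
```

Because it is an ordinary WSGI callable, the same function could be served locally with `wsgiref.simple_server.make_server` for a quick test.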

Creating a class to represent a Pop in the datastore was very simple; as was creating and deleting Pops in the datastore. The biggest issue I had here was coming up with an elegant way of validating the input parameters.
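One possible way to keep that validation tidy is a small helper that converts and range-checks a single parameter, reused for each coordinate. This is only a sketch under assumptions: the function names and the `lat`/`lng` parameter names are hypothetical, and `params` is assumed to be a plain dict of strings already pulled from the query string.

```python
def parse_coordinate(params, name, low, high):
    """Return params[name] as a float within [low, high], or raise ValueError."""
    raw = params.get(name)
    if raw is None:
        raise ValueError("missing parameter: %s" % name)
    try:
        value = float(raw)
    except ValueError:
        raise ValueError("%s is not a number: %r" % (name, raw))
    # The chained comparison also rejects NaN, which compares false to everything.
    if not low <= value <= high:
        raise ValueError("%s out of range: %r" % (name, value))
    return value


def parse_position(params):
    """Validate a lat/lng pair; both names are assumed for illustration."""
    return (parse_coordinate(params, "lat", -90.0, 90.0),
            parse_coordinate(params, "lng", -180.0, 180.0))
```

Each service can then catch `ValueError` in one place and turn the message into its JSON error response.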

But when I came to getting things back out of the datastore, things went a little pear-shaped…

In my Pop class I was storing the latitude and longitude of the POP as floating point numbers. To retrieve the nearby POPs (in the getpops service) I was using a typical SQL-like query as follows:

SELECT * FROM POP WHERE lat >= :1 AND lat < :2 AND lng >= :3 AND lng < :4 ORDER BY lastdatemodified

Where :1, :2, :3 and :4 were set to currentLat-0.001, currentLat+0.001, currentLng-0.001 and currentLng+0.001.

This raised the first issue: only one “property” can have an inequality clause in a query. It turns out that the datastore has some pretty weird and wonderful restrictions, and this particular one put a major spanner in the works.
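To make the restriction concrete, here is a sketch in plain Python (no datastore involved) of what the bounding-box search amounts to once it is split in two. The first stage is a range test on one property only, which is the most a single query could express; the second range test would have to run in application code. The function name and the dict layout are assumptions, and this is not the approach the port ended up using.

```python
def nearby_pops(pops, lat, lng, delta=0.001):
    """Bounding-box search in two stages, mirroring the one-inequality limit."""
    # Stage 1: range test on a single property (lat).
    in_lat_band = [p for p in pops
                   if lat - delta <= p["lat"] < lat + delta]
    # Stage 2: range test on the second property (lng), done in
    # application code on whatever stage 1 returned.
    return [p for p in in_lat_band
            if lng - delta <= p["lng"] < lng + delta]
```

The cost of this split is that stage 1 may return every POP in a whole band of latitude, most of which stage 2 then throws away.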

Geohash to the rescue

What I needed was a quick way of calculating whether a point in space was close to another one. To be honest, I was never happy with the approach I used above, because it found points in a rectangular area, not a circular one.

After a little bit of research (I love the web), I came across this concept: the Geohash.

This cool (public domain) algorithm takes a decimal lat/long and turns it into a string. For instance

-36.843480 174.767138

Becomes:

rckq2uve1mx3

Not only does this give you something that you can stick on a URL (http://geohash.org/rckq2uve1mx3), but more importantly, for points near each other the first few characters of the hash are the same!

-36.843480 174.767138 = rckq2uve1mx3
-36.844381 174.765611 = rckq2usmvvsd
-36.848508 174.765451 = rckq2gumfhjr
-36.848457 174.748261 = rckq27zy1tg8
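For the curious, the encoding itself is short. The sketch below follows the standard Geohash scheme: repeatedly halve the longitude and latitude ranges, record one bit per halving (alternating, longitude first), and turn every 5 bits into one character of a 32-symbol alphabet. The function name `geohash_encode` is my own; this is an illustration of the algorithm, not the library used in the port.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lng, precision=12):
    """Encode a latitude/longitude pair as a Geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lng_range = [-180.0, 180.0]
    chars = []
    bits = 0          # bits collected for the current character
    bit_count = 0
    use_lng = True    # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lng_range, lng) if use_lng else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid      # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid      # keep the lower half
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:    # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

Calling `geohash_encode(-36.843480, 174.767138, 6)` returns `rckq2u`, the first six characters of the hash for the first point. Since each extra character only narrows the cell further, a shorter hash is always a prefix of the longer one.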

So the solution to my problem turns out to be remarkably easy:

When creating or updating a POP, I calculate the Geohash for the latitude and longitude of the POP. I take only the first 6 characters and store them with the POP.
When processing getpops, I calculate the Geohash for the current latitude/longitude, grab the first 6 characters and find any POPs in the datastore that have the same stored Geohash.
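The two steps can be sketched with an in-memory dictionary standing in for the datastore. The class and method names are hypothetical (the methods borrow the service names for readability), and the Geohash strings are assumed to be computed elsewhere; the point is only that the lookup becomes an equality match on a stored 6-character prefix.

```python
from collections import defaultdict

PREFIX_LENGTH = 6  # the first 6 characters of the Geohash are kept


class PopIndex:
    """In-memory stand-in for the datastore, keyed by Geohash prefix."""

    def __init__(self):
        self._by_prefix = defaultdict(dict)

    def popin(self, pop_id, geohash):
        # Create or update: drop any old entry, then store the POP
        # under the prefix of its current Geohash.
        self.popout(pop_id)
        self._by_prefix[geohash[:PREFIX_LENGTH]][pop_id] = geohash

    def popout(self, pop_id):
        # Remove the POP from whichever cell currently holds it.
        for cell in self._by_prefix.values():
            cell.pop(pop_id, None)

    def getpops(self, geohash):
        # Equality match on the prefix; no range comparison needed.
        return sorted(self._by_prefix.get(geohash[:PREFIX_LENGTH], {}))
```

With the four hashes listed earlier, only the first two share the prefix `rckq2u`, so a lookup from the first point returns those two POPs. Two points that are close together but fall on either side of a cell boundary will not match, which is worth keeping in mind when choosing the prefix length.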

My select statement is now a simple equals and runs far faster than my original implementation.

Summary

Overall, Google App Engine is an interesting platform, and barring some quirks it does appear to be a viable platform for building web applications.