dlo.me

Random Queries on the App Engine Datastore

First, this post is 100% programming related. So if you don't care about programming, just walk away.

Chances are that if you're a Python or Java developer, you've run into Google App Engine at one point or another. A perennial issue with the datastore is that there's no official documentation on how to pull random entities from your database. Well friends, I'd like to let you in on a little secret for how you can do this.

It's quite simple, actually. Let's say you have a simple model called User, e.g.

from google.appengine.ext import db

class User(db.Model):
    first_name = db.StringProperty()
    last_name = db.StringProperty()
    email = db.EmailProperty()

Make another model called UserCount. This model will only ever have one entity, and its count equals the number of Users that have been inserted into the database. Create that entity with key name "1" and an initial count of 0 before you do anything else.

class UserCount(db.Model):
    count = db.IntegerProperty(default = 0)

Simple so far. Do something similar to this when you insert a User into the database.

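# Bump the single counter entity; its new value becomes this user's key name.
uc = UserCount.get_by_key_name("1")
uc.count += 1
uc.put()

# Key names run "1", "2", ..., str(uc.count), so any of them can be picked at random.
u = User(key_name=str(uc.count), first_name="Steve", last_name="Jobs",
         email="steve@example.com")  # placeholder address
u.put()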
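If it helps to see why this bookkeeping matters, here's a tiny in-memory stand-in for the scheme. It's only a sketch: plain dicts replace the datastore, and `insert_user` is a made-up helper name, not an App Engine API. The point is that every insert bumps the counter and reuses the new value as the key name, so the key names never have gaps.

```python
# In-memory stand-in for the datastore; nothing here touches App Engine.
user_count = {"count": 0}   # plays the role of the single UserCount entity
users = {}                  # plays the role of the User kind, keyed by key name


def insert_user(first_name, last_name):
    """Bump the counter and store the user under the new count as its key name."""
    user_count["count"] += 1
    key_name = str(user_count["count"])
    users[key_name] = {"first_name": first_name, "last_name": last_name}
    return key_name


for first, last in [("Steve", "Jobs"), ("Ada", "Lovelace"), ("Alan", "Turing")]:
    insert_user(first, last)

# Key names are dense: exactly "1" through str(count), with no gaps.
expected = [str(i) for i in range(1, user_count["count"] + 1)]
assert sorted(users, key=int) == expected
```

On the real datastore, two requests inserting at the same moment could read the same count. Wrapping the counter update in db.run_in_transaction is probably the usual safeguard, though I haven't shown that here.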

To get random data, use Python's random module and fetch Users by their key names. Caching is recommended. You may want to do something similar to the below.

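from random import sample
from google.appengine.api import memcache

uc = UserCount.get_by_key_name("1")
N = min(5, uc.count)  # sample() raises ValueError if N exceeds the population

# Pick N distinct ids from 1..count.
ids = sample(range(1, uc.count + 1), N)

users = []
for user_id in ids:
    cache_key = "user_%d" % user_id
    user = memcache.get(cache_key)
    if user is None:
        # Cache miss: read from the datastore, then cache it for next time.
        user = User.get_by_key_name(str(user_id))
        memcache.set(cache_key, user)
    users.append(user)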
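The sampling step is the only part that doesn't need App Engine at all, so it's easy to try on its own. Here's a minimal sketch of it as a standalone function. The name `random_key_names` is my own invention for illustration, and the clamping is an assumption about what you'd want when there are fewer users than requested.

```python
from random import sample


def random_key_names(count, n):
    """Return up to n distinct key names drawn uniformly from "1".."count"."""
    if count <= 0 or n <= 0:
        return []
    k = min(n, count)  # sample() raises ValueError when k exceeds the population
    return [str(i) for i in sample(range(1, count + 1), k)]


names = random_key_names(10, 5)
print(names)  # five distinct key names between "1" and "10", in random order
```

One caveat: if you ever delete Users, some key names will point at nothing, and get_by_key_name returns None for those. You'd probably want to skip the None results and sample a few extra ids to make up the difference.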

And that is how to get random entities from the Google App Engine datastore.


Originally posted at http://dloewenherz.blogspot.com/2010/01/querying-for-n-random-entities-using.html