Bulk upload data when using app-engine-patch

I’ve been interested in learning more about both Django and Google’s App Engine, so I’ve been playing with app-engine-patch, a project that simplifies using Django on App Engine. I like it so far; it seems pretty simple, and it’s really easy to just extract the sample project and start modifying it.

Many of the personal projects I’ve been working on have needed some kind of ‘preset’ data, e.g. makes of cars, categories of food, etc. This is data that won’t change frequently, but should most likely live in a database. It probably *could* just live permanently in a database, but I’d rather not do that, since I haven’t yet committed to a schema (or really even a framework!). I’d prefer to keep the data in a simple external file like a spreadsheet if possible.

As it turns out, App Engine has the built-in Bulk Data Uploader which greatly simplifies loading data from a csv file. I couldn’t find any webpages mentioning using this in conjunction with app-engine-patch, so I thought I’d give it a shot.
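To give a feel for what the uploader does with each csv row, here is a minimal stand-alone sketch in plain Python. It is not the App Engine code itself, just an illustration of the idea of pairing each csv column with a property name and a converter function. The column layout (`name`, `age`, `joined`) and the helper `load_rows` are made up for this example.

```python
import csv
import io
from datetime import datetime

# Hypothetical column layout: (property name, converter) pairs,
# listed in the same order as the columns in the csv file.
PERSON_COLUMNS = [
    ("name", str),
    ("age", int),
    ("joined", lambda s: datetime.strptime(s, "%Y-%m-%d").date()),
]


def load_rows(csv_text, columns):
    """Convert each csv row into a dict of typed property values."""
    entities = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        if len(row) != len(columns):
            raise ValueError(
                "expected %d columns, got %d: %r" % (len(columns), len(row), row)
            )
        entities.append(
            {name: convert(value) for (name, convert), value in zip(columns, row)}
        )
    return entities


# Two sample rows standing in for the contents of a csv file.
SAMPLE = "Alice,34,2009-05-19\nBob,27,2008-11-02\n"
people = load_rows(SAMPLE, PERSON_COLUMNS)
```

The real uploader then stores each converted row as a datastore entity; this sketch stops at the conversion step so it can run anywhere.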

Happily, Google’s instructions work pretty much unmodified with app-engine-patch. Nevertheless, I created a small sample which can be extracted into the app-engine-patch sample project and demonstrates bulk uploading. Download here.
UPDATE 5/19/09: Now that app-engine-patch 1.0 is stable, download the updated example here. A readme is included which states exactly what to do.

Now I can store all of my data in a plain csv, and simply bulk load it into my project when necessary!

WARNING: This sample doesn’t place any security restrictions on uploading. If you run this code unmodified on your production site, anyone will be able to post data to it!

NOTE: If you’re using the repository/1.0 version of app-engine-patch, you’ll need to modify the sample a little. That version prepends every model name with the name of the app. So instead of bulk loading data into ‘Person’, you’ll have to load it into ‘myapp_Person’.
UPDATE 5/19/09: You shouldn’t need to change anything when using my newer example; it should work out-of-the-box with app-engine-patch 1.0/1.1beta
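For the older repository/1.0 case described in the note, the naming rule can be written down as a tiny helper. This is just an illustration based on the ‘myapp_Person’ example above; `datastore_kind` is a made-up name, not something app-engine-patch provides. The same prefixed string has to be used everywhere the kind is named, both when invoking the upload and inside the loader class.

```python
def datastore_kind(app_label, model_name):
    """Return the prefixed kind name: the app name, an underscore, the model name."""
    return "%s_%s" % (app_label, model_name)


# The kind to use in both the upload command and the loader class.
PERSON_KIND = datastore_kind("myapp", "Person")
```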

6 Responses to “Bulk upload data when using app-engine-patch”


  1. Ryan Pendergast

    This did not work for me – I’m using App Engine 1.2.2 and django patch 1.1 beta 1. The load is inserting data into myapp_Person.

    I see your note about the repository/1.0 version. I have tried to specify ‘myapp_Person’ on the -kind param of the bulkload_client.py command but I get the following error:
    no Loader defined for kind myapp_Person.

    Any ideas what’s going on?

  2. jeff

    Hi Ryan, thanks for commenting!

    You’re right, in my original instructions I forgot to mention that you need to prepend ‘myapp’ to both the kind AND the loader class in myloader.py

    Now that 1.0 is out and stable, I’ve updated my example to work without any modifications. You can follow the new link above, or grab it straight from here.

    Let me know if you have any more issues with it!

  3. Ryan Pendergast

    Thanks – that works now. Why do you use bulkload_client.py over appcfg.py upload_data and the bulkloader? The reason I’m asking is that I cannot get bulkloader to work with app-engine-patch. It would be nice not to have to define a different URL for each load I wanted to do, but instead just use ‘appcfg.py upload_data’ and specify which loader to use (see the loader class example here: http://code.google.com/appengine/docs/python/tools/uploadingdata.html).

  4. jeff

    I use bulkload_client.py because when I first created this tutorial, “appcfg.py upload_data” didn’t exist yet. I haven’t tried using it either, so unfortunately I can’t give any tips on using it.

    Lately I’ve resorted to creating my own Django views for handling imports, mostly because I need to import complex relational data, and as far as I can tell the Google uploader doesn’t support relations of any kind.

  5. Michel

    Hello,

    Thanks for your contribution!

    “the google uploader doesn’t support relations of any kind” what do you mean?

    You can use the Google uploader to pass a field or a key_name and then build the relation during the upload process:

    from dateutil import parser
    from google.appengine.ext import db
    from google.appengine.tools import bulkloader

    from common.appenginepatch.aecmd import setup_env
    setup_env(manage_py_env=False)

    from django.contrib.auth.models import User

    import goals.models

    def findUserKeyByKeyName(keyName):
        # query = User.all()
        # query.filter('title =', title)
        # theUser = query.get()
        theUser = User.get_by_key_name(keyName)
        return theUser.key()

    class GoalLoader(bulkloader.Loader):
        def __init__(self):
            bulkloader.Loader.__init__(self, 'goals_goal',
                                       [('title', str),
                                        ('description', str),
                                        ('date', lambda x: parser.parse(x).date()),
                                        ('author', findUserKeyByKeyName)
                                       ])

    loaders = [GoalLoader]

    # csv file:
    #g3,description,03-03-2006,userx
    #g4,dosm,1950-02-02,usery

    Hope it helps,

    Michel

    PS: This doesn’t mean that the loader is the best way… I will also think a bit about automatically generating more complete test data based on combinations of interesting values for the different entities…

  6. Martin

    Thank you soo much…
