CNK's Blog

SQLAlchemy from an ActiveRecord Perspective

I have always started any project from the database perspective. The whole test-first/behavior-driven-development movement has led me to think I would be better off starting from the user interface and throug the stack to the back end. But for me the data model layer is always the easiest. And I actually kind of like SQL so the ORM claims that “you’ll never have to write SQL again” were not a strong selling point - at least not for me. For me the biggest selling point for ActiveRecord (the first ORM I really used) was that I didn’t have to write all the getters/setters in the Ruby side. That and the ease of grabbing related objects and getting them as real objects - not as rows containing attributes from both tables. All the Ruby metaprogramming magic that gives you Foo.find_by_name is nice and makes for very readable code, but that is merely a convenience for me, not a necessity like the object creation feature.

My group at work is in the process of moving from a very simple home grown…. I am not sure it is actually a full blown ORM… perhaps automated query builder? We are a python shop and have decided to adopt SQLAlchemy (currently version 0.7.2). I am in the process of reading the tutorial for the first time and wanted to record my initial impressions.

For starters, as you might expect from the difference in Ruby vs. Python cultures, SQLAlchemy is more explicit than ActiveRecord. SQLAlchemy has what is called a “declarative” mode where you define your class and your table all at once (within a class that inherits from SQLAlchemy’s declarative_base class). But you can also explicitly create, edit, and inspect an explicit metadata object which defines the table. Then you can define a Python class for your object - and then use SQLAlchemy’s mapper to introduce the two.

Non-database attributes

How do the two ORMs deal with model attributes that do not have corresponding database columns? For example, user models commonly ask for a password and a password confirmation in a user form. But what is generally stored in the database is an encrypted form of the password (sometimes called encrypted_password). In Rails/ActiveRecord, you generally create the database migration with the columns that should be in the database, then add password and password_confirmation attributes to the model using attr_accessible. And then you create a "before_save" filter that encrypts the password and stores it in the encrypted_password field. In SQLAlchemy, you can create tables and model classes with different attributes. The mapper step matches the attributes with the same names AND appears to have added an "encrypted_password" attribute to the python object - even though it was not defined in the python class.

Commit behavior

With SQLAlchemy it seems you always have to explicitly commit to get changes written to the database. I have gotten used to ActiveRecord’s mix of implicit and explicit writing to the database. ClassName.create() automatically saves - but ClassName.new() does not. Updates to an ActiveRecord object usually need to be explicitly saved to write to the database. There are some exceptions, for example, when you assign an object to a has_one association, the parent object is automatically saved (in order to update its foreign key). Reading the ActiveRecord docs it sometimes sounds confusing, but in practice, it usually behaves as I want it to. On the other hand, SQLAlchemy’s requirement for an explicit session.commit() also means it is easier to intervene with an explicit session.rollback() if you decide you don’t want to persist the changes you have made to your Python object.

Data Definition

SQLAlchemy has nice, explicit syntax for defining your schema but still in a database agnostic way. Column('name', String(30)) is a pretty easy mental mapping from name varchar(30). The ActiveRecord equivalent, t.string :name, :limit => 30, isn’t bad but isn’t superior either.

Query syntax

The new Rails3 syntax - with chained method calls rather than a hash of options - looks more like SQLAlchemy than it used to. And I suspect they may behave more similarly too - building up a query from pieces and then executing it. But just the fact that the SQLAlchemy tutorial covers how to use subqueries shows that writing SQLAlchemy queries is closer to real SQL than writing ActiveRecord is.

If you have a particularly tricky set of SQL, both ORMs let you create your own SQL. In ActiveRecord this is done with the find_by_sql method. In SQLAlchemy, it is done using from_statement:

    session.query(User).from_statement("SELECT * FROM users where name=:name").params(name='ed').all()

SQLAlchemy can return an iterator or an object or a list. session.query(PythonClassName).filter(<conditions) returns an iterator that will feed you instances of PythonClassName in a loop. You can also return objects and list of objects by setting the query to session.query and then calling query.one(), query.first(), or query.all(). The query.one() behaves rather like ActiveRecord’s ClassName.find()</code> in that it wants exactly one result row and will throw and exception if the return is not exactly one row.

Relationships

Both ORMs have syntax for defining relationships. The ActiveRecord syntax is very English-like: has_one :foo, belongs_to :bar, has_many :widgets. The SQLAlchemy syntax is not as streamlined

  • but it may make it easier to do some explicit joining. ActiveRecord is fabulous - unless your schema doesn’t want to use its naming conventions. There are, in theory, modifiers to the relationship defining methods that are supposed to let you get around that. But in practice, at least as of Rails 2.3, they don’t always work consistently and you can get a ways into your code before discovering, for example, that the cascading delete of related objects doesn’t work because the generated sql is looking for the column “id” rather than “foo_id”. In SQLAlchemy the syntax is:
    class User(Base):
        addresses = relationship(Address, order_by=Address.id, backref='user')

    class Address(Base):
        user = relationship(User, backref=backref('addresses', order_by=id))

Many to many relationships are supported in both ORMs and both make a distinction between m-to-m relationships with boring association tables (which only contain the 2 foreign key columns) and richer associations where you store additional data on the association (for example audit information about when it was created, and by whom). In ActiveRecord, those two cases are :has_and_belongs_to_many and :has_many_through. In SQLAlchemy, the simple case is taken care of by adding "secondary="</code> to the relationship definition. The richer case is taken care of by creating an association object and then establishing relationships between each object and the association object. The example in the association documentation isn't super clear (I think calling the relationships left and right make the names look like keywords). But [this example] (http://www.preetk.com/node/sqlalchemy-part-2-declarative-bi-directional-association-classes/) using a standard permission setup makes better sense.

Both ORMs support lazy loading of related objects by default but can be asked to do eager loading if you ask it to using joinedload() (SQLAlchemy) or .include() (ActiveRecord)

Web form building

Rails’ ActiveSupport provides form building helpers to take the tedium out of building data entry forms. The equivalent in the Python/SQLAlchemy world is provided by FormAlchemy. I can’t say I really like the way FormAlchemy does data validation. I found it rather difficult to figure out how to do custom validations - especially those where the requirements for one field depend on the value in another field. In ActiveRecord this is straightforward.