postgresql iconcatalyst iconperl iconmysql icontypo iconphp icon

Database Abstraction - code vs infrastructure

Posted in , , , , , , Wed, 05 Sep 2007 04:38:00 GMT

I've worked on a number of database-driven projects and no matter how much people want database abstraction, it was always difficult to code and maintain. I was recently reminded of this when I read this Drupal article on dropping PostgreSQL support. Not only can it be difficult to maintain support for multiple databases, but it may be difficult to find developers.

One solution of modern programming is to move database abstraction from the code to the infrastructure using a ORM (Object-Relational Mapper) or Data Mapper. A ORM and Data Mapper abstracts the database for you so you no longer have to do tie db abstraction to each app. Not only does it let you code once for multiple databases it lets your users migrate their data from one database to another. This blog runs Typo which is based on Ruby on Rails and ActiveRecord. I've been contemplating migrating Typo from MySQL to PostgreSQL and I've been told that it would be as simple as exporting the data with YAML, updating the database.yml file and importing the data. I haven't gotten around to doing it yet but it is a powerful idea. ActiveRecord is a data mapper and isn't as flexible as a full blown ORM but it gets the job done for the most part. For a full-blown ORM, I think of Perl's DBIx::Class which provides a full OO interface to the RDBMS allowing you to code just once for multiple DBs without limiting you when you want to use some esoteric database-specific SQL. DBIx::Class is often used with the Catalyst Framework but is also used by itself.

There are PHP frameworks out there like Symfony and Cake but do any of them have stand-alone ORMs? If so, could Drupal move to something like that and solve their maintainership problems once and for all? Drupal is part of the Go PHP5 effort so there should be no issue using PHP 5 OO. Something to think about for the Drupal folks if a PHP ORM is available.

del.icio.us:Database Abstraction - code vs infrastructure digg:Database Abstraction - code vs infrastructure reddit:Database Abstraction - code vs infrastructure spurl:Database Abstraction - code vs infrastructure wists:Database Abstraction - code vs infrastructure simpy:Database Abstraction - code vs infrastructure newsvine:Database Abstraction - code vs infrastructure blinklist:Database Abstraction - code vs infrastructure furl:Database Abstraction - code vs infrastructure fark:Database Abstraction - code vs infrastructure blogmarks:Database Abstraction - code vs infrastructure Y!:Database Abstraction - code vs infrastructure smarking:Database Abstraction - code vs infrastructure magnolia:Database Abstraction - code vs infrastructure segnalo:Database Abstraction - code vs infrastructure

6 comments

Comments

  1. Gunnar Wolf said about 16 hours later:

    I was starting to write a reply to your comment… But then it grew long, and then I thought I’d post it on my own blog. So you are welcome there ;-) Database abstraction: ORMs and RDBMS-agnostic coding

  2. John Wang said about 23 hours later:

    Hi Gunnar,

    I read your article and while I agree with the gist of your points, I wonder if they apply to Drupal. The first part of your article discusses Pg specific code but is this relevant for Drupal? As for switching to an ORM, would it solve their Postgres developer issue? Here are a few things that come to mind:

    (a) DB-agnosticism: You don’t say if RDBMS-agnosticism is something that is important to you when you talk about using advanced Postgres features. When you use those features, do you also write the same code for MySQL, Oracle, SQL Server, etc.? One of the reasons for using an ORM is that you get RDBMS-agnosticism for free, but that’s a moot issue if you don’t care about RDBMS-agnosticism to begin with.

    (b) DB-affinity: I think it is appropriate for some apps to be very tied to the database, but do you think Drupal is one of them? How many Drupal administrators will be expert DBAs? My feeling is that some apps are perfectly fine to use with ORMs, like blogs and CMS systems. As side note is that Vox uses the Data::ObjectDriver ORM so it’s not just small apps that use ORMs.

    (c) ORM Migration: As for Drupal making a switch to an ORM, then end goal is to eliminate the developer requirement for multiple DBs. If that goal is valid and an ORM is available, I don’t see why they wouldn’t start making a migration. It doesn’t have to be done all at once, e.g. you can have some of the code use raw SQL and some use an ORM during the transition. I’ve worked on projects that used native SQL and an ORM at the same time as well as one that used two different templating systems during the migration. In the end, it was always better when the migration was finally done, but in the mean time, the app was fully functional.

    I still think Drupal is a reasonable candidate for using an ORM to eliminate database-specific code, if a suitable ORM is available. This is because I think most Drupal installations won’t have a lot of functionality pushed into the DB, DB-agnosticism may continue to be a goal, and it may be difficult to find people to support the many DBs out there. With DBIx::Class, you could write a CMS that supports Postgres, MySQL, Oracle, and MSSQL with one set of code. That kind of agnosticism would be pretty cool for Drupal IMO.

  3. Gunnar Wolf said 2 days later:

    (a) Umm… I think the answer to this heavily depends on the task at hand – For simpler projects where the DB is just a data store is where I stick for being RDBMS-agnostic. However, when you start talking about complex relationships and validations which must be ensured at all times (this means, the programmer should not be able to mess with them even with direct access to the DB), RDBMS agnosticism is no longer as important as really working with the full power of your RDBMS.

    (b) I agree with you here, at least on a first approximation (as I’m not familiar with Drupal) – I don’t think Drupal does many advanced things with the DB – Why? Because they chose MySQL. Basically everything you do with MySQL should be abstractable, should be manageable by an ORM. So I agree with you.

    Oh, and no, I’m not implying that only small apps use ORMs – many medium-sized apps I do rely on them. However, I usually have to jump through some hoops to declare my constraints both in the DB and in the model classes, so that the ORM does not end up throwing exceptions in disgust.

    (c) I can only extrapolate… I guess they weighed the choice to gradually swipetheir hand-crafted SQL and go to an ORM or to go the easy way and just drop support for non-MySQL DBs… And the second side’s argument was easiness. It’s not fun to reinvent all and to spot all the bugs that’d bring in, of course…

    I’d love to see Drupal move over to an ORM (mainly because I am strongly more confident in Postgres than in MySQL, and I’m a Drupal user)... But I guess it needs somebody to do it. Are you volunteering? :)

  4. Gunnar Wolf said 2 days later:

    (ugh, I thought your blog would intepret my linebreaks… Anyway, I hope it still is readable :-/ )

  5. John Wang said 2 days later:

    The nice thing about DBIx::Class is that it will recognize constraints in your model classes and create them in the DB. No need to create them separately like with ActiveRecord.

    I’m not sure if they’ve decided to drop Pg support yet. I think that’s just one developer’s suggestion.

    One question is whether there is even a good PHP ORM to move to? And no, I’m not volunteering. I have too many things on my plate as it is but thought it would be a good alternative to simply dropping Pg support.

    I also fixed your post by adding an extra line break. You just need to carriage returns to get it to make a separate paragraph for you :)

  6. Spacemonkey said 7 months later:

    PHP5 has Propel – http://propel.phpdb.org/trac/

    However my complaints with ORMs are that things like foreign constraints need to remain in the database if that database is used by more than one application or platform.

    Case in point, the intranet website (done in one language), which also has some reporting hooks to the HR system (done in another language), and also interfaces with the accounting/finance systems (yet another language).

    In a corporate setting, it is entirely expected to have the website done in one language, the HR system done in VB.Net, and the accounting system done in, well, er, COBOL or some such nonsense ;-)

    With the ActiveRecord approach – at least as how Rails does it at least – you gotta enforce those constraints in each application. Had you simply used foreign keys, that duplication wouldn’t be needed.

    Or am I missing a point here?

(leave url/email »)

   Comment Markup Help Preview comment