15 Mar 2009 @ 2:26 PM 

On some of our more recent projects, we have been using the Containable behavior that CakePHP provides.  It works quite nicely, a lot easier than the previous format of having to bind and unbind models on the fly.

The pitfalls began to appear when some of our clients were reporting that some pages are extremely slow.  After a bit of research, it was quite apparent that the root of the cause was attributed to us using the containable feature.

To help you understand, let me provide a scenario.  Users can create an album.  Albums contain pictures.  Pictures contains comments.  In our model, each association is set to so User hasMany Album.  Album hasMany Picture.  Picture hasMany Comment.

Using this relationship we can do a find(‘all’) on User and have it contain the data it wants.  For example:

<?php
$users = $this->User->find('all', array('contain' => array(
                    'Album' => array(
                                           'Picture' => array(
                                                                   'Comment'
                                                                   )
                                             )
                             )
                    )
      );
?>

The above statement will return us all users, albums, photos, and comments for those photos in a nice recursive array.  Now you are asking, so what are the pitfalls?  Well, let me clarify, there are no pitfalls with the data returned, it compilies it quite nicely, it’s how it gets there.

To achieve the data above, CakePHP will do one query that selects * from users and left joins albums.  For each album returned in the above result, it will in turn do one select * from pictures where the album_id is in the pictures table.  Then for those results, it will do the same for the comments table.  This can turn into 100s over queries.

Not only does it perform an IN statement, it performs an extraordinary amount of additional queries.  I can be ok with the additional queries, as there are a lot of arguments that support smaller, simpler queries are just as effective as less, larger more complex queries.

There is simply no argument for the IN statement.  The IN statement is far slower than if we were to perform one query using left or inner joins to return the same data, especially on large tables.

After doing some research, it is clear that CakePHP will not be solving this issue in 1.2.x.  I have heard rumors that it will be addressed in version 2 though, which I will be quite excited for.

My advice, avoid using contains when joining more than one table.  Instead write a custom query in your model.  To give you an idea of how simple it is, the contains statement above could have been written like:

51&q=SELECT&lr=lang_en">SELECT `User`.*, `Album`.*, `Picture`.*, `Comment`.*
51&q=FROM&lr=lang_en">FROM
            users `User`
51&q=INNER&lr=lang_en">INNER 51&q=JOIN&lr=lang_en">JOIN
           albums `Album` 51&q=ON&lr=lang_en">ON `User`.id = `Album`.user_id
51&q=INNER&lr=lang_en">INNER 51&q=JOIN&lr=lang_en">JOIN
           photos `Photo` 51&q=ON&lr=lang_en">ON `Album`.id = `Photo`.album_id
51&q=INNER&lr=lang_en">INNER 51&q=JOIN&lr=lang_en">JOIN
          comments `Comment` 51&q=ON&lr=lang_en">ON `Picture`.id = `Comment`.picture_id

This will accomplish the same thing, it is important to note though that your array will be structured different.  It will be a flat array oppose to a recursive array.

Don’t worry CakePHP, I still <3 you!

  • Share/Bookmark

Other useful articles

Posted By: Jamie
Last Edit: 15 Mar 2009 @ 02:26 PM

EmailPermalink
Tags


 

Responses to this post » (13 Total)

 
  1. Leonardo says:

    Do i have to set “recursive”=>2 for works well?!

  2. Jamie says:

    I believe so. This seems to be the default setting on all of the projects that I have worked with, so I’ve only had to turn it down from 2.

  3. Abba Bryant says:

    I just checked and for me at least setting both recursive and contains is not needed. It seems the containable behavior sets it for you.

    also a more elegant method for the same results would be do to some ad-hoc joins on the model->find as per http://bakery.cakephp.org/articles/view/quick-tip-doing-ad-hoc-joins-in-model-find


    $data = $this->User->find( 'all', array( 'joins' => array(
    array(
    'table' => 'albums' ,
    'alias' => 'Album' ,
    'type' => 'inner' ,
    'foreignKey' => 'user_id' ,
    ),
    array(
    'table' => 'photos' ,
    'alias' => 'Photo' ,
    'type' => 'inner' ,
    'foreignKey' => 'album_id' ,
    ),
    array(
    'table' => 'comments' ,
    'alias' => 'Comment' ,
    'type' => 'inner' ,
    'foreignKey' => 'photo_id' ,
    ),
    )));

    $data = $this->User->find( 'all', array( 'joins' => array(
    array(
    'table' => 'albums' ,
    'alias' => 'Album' ,
    'type' => 'inner' ,
    'foreignKey' => false ,
    'conditions' => array(
    'User.id' => 'Album.user_id' ,
    )
    ),
    array(
    'table' => 'photos' ,
    'alias' => 'Photo' ,
    'type' => 'inner' ,
    'foreignKey' => false ,
    'conditions' => array(
    'Album.id' => 'Photo.album_id' ,
    )
    ),
    array(
    'table' => 'comments' ,
    'alias' => 'Comment' ,
    'type' => 'inner' ,
    'foreignKey' => false ,
    'conditions' => array(
    'Photo.id' => 'Comment.photo_id' ,
    )
    ),
    )));

    Both of these options *are untested* and may not work perfectly. I suggest you tweak the order of the conditions, or combine the foreignKey = ‘foreignTable_id’ and the conditions array if you run into trouble. The SQL debugger is very useful.

  4. saintberry says:

    The Containable behaviour is a joke. Couldn’t believe it when I found this out a few months ago. No idea why they include it in the core…

    Personally I avoid writing custom SQL as much as possible. Being database agnostic is a huge advantage of any framework that implements a decent database abstraction layer (as Cake does), so I would only lose that as a very last resort (you ever had a requirement to switch from MySQL to PostgreSQL in production? I have, it’s a pain!).

    Luckily there is an alternative to both solutions outlined in this article. Rafael Bandeira has created the Linkable behaviour (http://rafaelbandeira3.wordpress.com/2008/11/16/linkable-behavior-taking-it-easy-in-your-db/) which basically replicates what the Containable behavior can do but using joins. The interface is much the same as Containable so usually it’s a seamless drop in. There are a few limitations, but not many. Check it out.

  5. Jamie says:

    Thanks very much for the feedback guys. I will definitely check out the linkable behavior.

  6. wow gold says:

    Good post,This was exactly what I needed to read today! I am sure this has relevance to many of us out there.

  7. flyff gold says:

    Good article – plenty of food for thought.

  8. OJ Tibi says:

    Good point, although I still prefer to use Containable just because it paginates easier than that created from a flat query, well at least if I don’t convert it into an associative array.

  9. Metin2 yang says:

    I believe so. This seems to be the default setting on all of the projects that I have worked with, so I’ve only had to turn it down from 2.

  10. Axel says:

    helped me a lot, thank you!

  11. flyff money says:

    nice post, thank you

  12. Mihai says:

    @saintberry You don’t have to let your self in the hands of ORM. If you want to build reliable apps, you must know SQL as well.

  13. Adnan says:

    thanks a lot to @Abba Bryant. The code in the original post is not readable. I took the code from your comment and it worked great.

Post a Comment

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>


 Last 50 Posts
 Back
Change Theme...
  • Users » 14
  • Posts/Pages » 75
  • Comments » 231
Change Theme...
  • VoidVoid « Default
  • LifeLife
  • EarthEarth
  • WindWind
  • WaterWater
  • FireFire
  • LightLight