On some of our more recent projects, we have been using the Containable behavior that CakePHP provides. It works quite nicely, a lot easier than the previous format of having to bind and unbind models on the fly.
The pitfalls began to appear when some of our clients were reporting that some pages are extremely slow. After a bit of research, it was quite apparent that the root of the cause was attributed to us using the containable feature.
To help you understand, let me provide a scenario. Users can create an album. Albums contain pictures. Pictures contains comments. In our model, each association is set to so User hasMany Album. Album hasMany Picture. Picture hasMany Comment.
Using this relationship we can do a find(‘all’) on User and have it contain the data it wants. For example:
The above statement will return us all users, albums, photos, and comments for those photos in a nice recursive array. Now you are asking, so what are the pitfalls? Well, let me clarify, there are no pitfalls with the data returned, it compilies it quite nicely, it’s how it gets there.
To achieve the data above, CakePHP will do one query that selects * from users and left joins albums. For each album returned in the above result, it will in turn do one select * from pictures where the album_id is in the pictures table. Then for those results, it will do the same for the comments table. This can turn into 100s over queries.
Not only does it perform an IN statement, it performs an extraordinary amount of additional queries. I can be ok with the additional queries, as there are a lot of arguments that support smaller, simpler queries are just as effective as less, larger more complex queries.
There is simply no argument for the IN statement. The IN statement is far slower than if we were to perform one query using left or inner joins to return the same data, especially on large tables.
After doing some research, it is clear that CakePHP will not be solving this issue in 1.2.x. I have heard rumors that it will be addressed in version 2 though, which I will be quite excited for.
My advice, avoid using contains when joining more than one table. Instead write a custom query in your model. To give you an idea of how simple it is, the contains statement above could have been written like:
51&q=SELECT&lr=lang_en">SELECT `User`.*, `Album`.*, `Picture`.*, `Comment`.*
51&q=FROM&lr=lang_en">FROM
users `User`
51&q=INNER&lr=lang_en">INNER 51&q=JOIN&lr=lang_en">JOIN
albums `Album` 51&q=ON&lr=lang_en">ON `User`.id = `Album`.user_id
51&q=INNER&lr=lang_en">INNER 51&q=JOIN&lr=lang_en">JOIN
photos `Photo` 51&q=ON&lr=lang_en">ON `Album`.id = `Photo`.album_id
51&q=INNER&lr=lang_en">INNER 51&q=JOIN&lr=lang_en">JOIN
comments `Comment` 51&q=ON&lr=lang_en">ON `Picture`.id = `Comment`.picture_idThis will accomplish the same thing, it is important to note though that your array will be structured different. It will be a flat array oppose to a recursive array.
Don’t worry CakePHP, I still <3 you!

Categories
Tag Cloud
Blog RSS
Comments RSS
Last 50 Posts
Back
Void « Default
Life
Earth
Wind
Water
Fire
Light 
Do i have to set “recursive”=>2 for works well?!I believe so. This seems to be the default setting on all of the projects that I have worked with, so I’ve only had to turn it down from 2.I just checked and for me at least setting both recursive and contains is not needed. It seems the containable behavior sets it for you.also a more elegant method for the same results would be do to some ad-hoc joins on the model->find as per http://bakery.cakephp.org/articles/view/quick-tip-doing-ad-hoc-joins-in-model-find
$data = $this->User->find( 'all', array( 'joins' => array(
array(
'table' => 'albums' ,
'alias' => 'Album' ,
'type' => 'inner' ,
'foreignKey' => 'user_id' ,
),
array(
'table' => 'photos' ,
'alias' => 'Photo' ,
'type' => 'inner' ,
'foreignKey' => 'album_id' ,
),
array(
'table' => 'comments' ,
'alias' => 'Comment' ,
'type' => 'inner' ,
'foreignKey' => 'photo_id' ,
),
)));
$data = $this->User->find( 'all', array( 'joins' => array(
array(
'table' => 'albums' ,
'alias' => 'Album' ,
'type' => 'inner' ,
'foreignKey' => false ,
'conditions' => array(
'User.id' => 'Album.user_id' ,
)
),
array(
'table' => 'photos' ,
'alias' => 'Photo' ,
'type' => 'inner' ,
'foreignKey' => false ,
'conditions' => array(
'Album.id' => 'Photo.album_id' ,
)
),
array(
'table' => 'comments' ,
'alias' => 'Comment' ,
'type' => 'inner' ,
'foreignKey' => false ,
'conditions' => array(
'Photo.id' => 'Comment.photo_id' ,
)
),
)));
Both of these options *are untested* and may not work perfectly. I suggest you tweak the order of the conditions, or combine the foreignKey = ‘foreignTable_id’ and the conditions array if you run into trouble. The SQL debugger is very useful.
The Containable behaviour is a joke. Couldn’t believe it when I found this out a few months ago. No idea why they include it in the core…Personally I avoid writing custom SQL as much as possible. Being database agnostic is a huge advantage of any framework that implements a decent database abstraction layer (as Cake does), so I would only lose that as a very last resort (you ever had a requirement to switch from MySQL to PostgreSQL in production? I have, it’s a pain!).
Luckily there is an alternative to both solutions outlined in this article. Rafael Bandeira has created the Linkable behaviour (http://rafaelbandeira3.wordpress.com/2008/11/16/linkable-behavior-taking-it-easy-in-your-db/) which basically replicates what the Containable behavior can do but using joins. The interface is much the same as Containable so usually it’s a seamless drop in. There are a few limitations, but not many. Check it out.
Thanks very much for the feedback guys. I will definitely check out the linkable behavior.Good post,This was exactly what I needed to read today! I am sure this has relevance to many of us out there.Good article – plenty of food for thought.Good point, although I still prefer to use Containable just because it paginates easier than that created from a flat query, well at least if I don’t convert it into an associative array.I believe so. This seems to be the default setting on all of the projects that I have worked with, so I’ve only had to turn it down from 2.helped me a lot, thank you!nice post, thank you@saintberry You don’t have to let your self in the hands of ORM. If you want to build reliable apps, you must know SQL as well.thanks a lot to @Abba Bryant. The code in the original post is not readable. I took the code from your comment and it worked great.