Hi folks, I have just finished creating my first CakePHP component and I wanted to share it with the community. After I finish writing this article, I will be submitting it to CakePHP for inclusion.
The component is called StringExtractComponent. As you may guess, it extracts a substring from a larger piece of content. I created this component to mimic the STREXTRACT function in Microsoft Visual FoxPro. I have come to love this function because it makes it easy to retrieve data that sits between a start and an end delimiter in a string.
To help you understand how it is used, I read the contents of one of my articles into a string. I then extract a specific section contained within one of the syntax-highlighted code blocks in the article.
Download: http://www.endyourif.com/files/string_extract.zip
The StringExtract Component:
app/controllers/components/string_extract.php
<?php
/***************************************************
* StringExtract Component
*
* In Visual FoxPro there is an excellent function called STREXTRACT
* that allows you to pass in a start and optional end delimiter
* and it returns the text between the start and end delimiter.
*
* This is a PHP version of the same function to save time having
* to find and substring it each time.
*
* @copyright Copyright 2009, Jamie Munro
* @link http://www.endyourif.com
* @author Jamie Munro
* @version 1.0
* @license MIT
*/
class StringExtractComponent extends Object {
    /*************************************************
     * str_extract parses out and returns the text found between the start and end delimiters
     *
     * @param string $content String containing the content to parse
     * @param string $begindelim String containing the text of where to start parsing
     * @param string $enddelim String containing the text of where to end parsing (optional)
     * @param int $occur Zero-based integer defining which occurrence of $begindelim to start from
     * @return string The extracted text, or an empty string if $begindelim was not found
     * @access public
     */
    function str_extract($content, $begindelim, $enddelim = "", $occur = 0) {
        $parsedContent = "";
        // don't bother doing any work if content is empty
        if (strlen($content)) {
            $count = 0;
            $start = 0;
            // loop until we reach the requested occurrence
            do {
                // get the starting position of the content
                $start = strpos($content, $begindelim, $start);
                // don't bother doing any more if start was not found
                if ($start === false) {
                    break;
                } else {
                    // since we found it, we want to add the length
                    // of the begin delimiter to it so it doesn't get
                    // included when we parse the string
                    $start += strlen($begindelim);
                }
                $count++;
            } while ($count <= $occur);
            // if start is false, we didn't find it, so we should not parse anything
            if ($start !== false) {
                // assume there is no end position until an end delimiter is found
                $end = false;
                if (strlen($enddelim)) {
                    // find the end delimiter
                    $end = strpos($content, $enddelim, $start);
                }
                // if enddelim was not found or not provided, set the end to the length of the content
                if ($end === false) {
                    $end = strlen($content);
                }
                // now we have the start and end, parse it out
                $parsedContent = substr($content, $start, $end - $start);
            }
        }
        return $parsedContent;
    }
}
?>

Here is an example of how to use the component:
<?php
class TestsController extends AppController {
    var $uses = array();
    var $components = array('StringExtract');

    function index() {
        $content = file_get_contents('http://www.endyourif.com/the-importance-of-database-indexing/');
        $start = '<pre class="php" style="font-family:monospace;">';
        $end = '</pre>';
        // $occur is zero-based, so 3 selects the fourth matching <pre> block
        $output = $this->StringExtract->str_extract($content, $start, $end, 3);
        echo $output;
        exit;
    }
}
?>

<ol>
<li style="font-weight: normal; vertical-align:top;">
<div style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">ALTER TABLE `users` ADD INDEX ( `email` )</div>
</li>
</ol>
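
If you want to try the delimiter and occurrence logic without CakePHP or a remote page, here is a minimal sketch. The function name `str_extract_demo` and the sample string are my own assumptions for illustration and are not part of the component; the function only follows the same steps as `str_extract` above.

```php
<?php
// Hypothetical standalone helper (not part of the component): follows
// the same steps as str_extract, written as a plain function.
function str_extract_demo($content, $begindelim, $enddelim = "", $occur = 0) {
    $start = 0;
    // move just past the requested (zero-based) occurrence of the begin delimiter
    for ($count = 0; $count <= $occur; $count++) {
        $start = strpos($content, $begindelim, $start);
        if ($start === false) {
            return ""; // begin delimiter (or that occurrence of it) not found
        }
        $start += strlen($begindelim);
    }
    // with no end delimiter, or none found, read to the end of the content
    $end = strlen($enddelim) ? strpos($content, $enddelim, $start) : false;
    if ($end === false) {
        $end = strlen($content);
    }
    return substr($content, $start, $end - $start);
}

// Made-up sample content for illustration only
$sample = "<b>one</b> <b>two</b> <b>three</b>";

echo str_extract_demo($sample, "<b>", "</b>"), "\n";    // one
echo str_extract_demo($sample, "<b>", "</b>", 2), "\n"; // three
echo str_extract_demo($sample, "<b>", "", 2), "\n";     // three</b>
```

Note that when the requested occurrence does not exist, the result is an empty string, so an empty match and "not found" look the same to the caller.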


For parsing HTML pages I use XPath. It is very powerful.
http://www.w3schools.com/php/func_simplexml_xpath.asp
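
For comparison, here is a rough sketch of the XPath approach. It uses PHP's DOMDocument and DOMXPath classes rather than the SimpleXML function linked above, because `loadHTML` tolerates markup that is not valid XML. The sample markup and the `class="sql"` attribute are assumptions made up for this illustration, not taken from the real article page.

```php
<?php
// Hypothetical sample markup standing in for a fetched page
$html = '<html><body>'
      . '<pre class="php">echo 1;</pre>'
      . '<pre class="sql">ALTER TABLE `users` ADD INDEX ( `email` )</pre>'
      . '</body></html>';

// real-world pages are rarely valid markup, so collect parser
// warnings internally instead of printing them
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// select every <pre> element whose class attribute is exactly "sql"
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//pre[@class="sql"]');

$found = array();
foreach ($nodes as $node) {
    // textContent returns the text inside the element without any tags
    $found[] = $node->textContent;
}
print_r($found);
```

The trade-off is that XPath selects by document structure, while a delimiter-based extract works on any string, including content that is not HTML at all.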