Yahoo! Search BOSS Integration with CakePHP

Tags: CakePHP, Yahoo, API | Written on 17/7/08

I implemented full-text search with the Yahoo! Search BOSS API on top of CakePHP in one night, ending at 5AM - lol. It has auto-suggest for spelling errors, and I added paging as well. The BOSS API impressed me: it was very clean, and I was able to build seamless full-text search into my website.

The full source is at the bottom of the article, but let me walk you through the basic parts.

Start by gathering the results. I'm sure there is a better way to do this in CakePHP, but for now I'm using straight PHP5. I got most of this off Webmaster-Source, but I added a few things: restricting the search to one site with site:domain.com, and reading the total number of results found from the totalhits attribute. You can see all the XML data fields that you get back on the search response fields page.

Here is the code for gathering the search results. Make sure you get $search_term from the params, then swap the BOSS API key (your_API_key) and the domain (yourdomain.com) for your own. You get the API key by registering on the Yahoo! BOSS website.

PHP:
    // Gather data and prepare the query
    $thequery = urlencode($search_term);
    $yhost = 'http://boss.yahooapis.com';
    $apikey = 'your_API_key';
    $site = 'site:yourdomain.com';
    $url = "$yhost/ysearch/web/v1/$thequery+$site?appid=$apikey&format=xml";
    // if (!empty($start)) $url .= "&start=$start"; // Add this to provide paging

    // Get the search results
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_HEADER, 0);            // leave HTTP headers out of the body
    $data = curl_exec($ch);
    curl_close($ch);

    // Parse the XML; LIBXML_NOCDATA merges CDATA sections into plain text nodes
    $results = new SimpleXmlElement($data, LIBXML_NOCDATA);
    $total = $results->resultset_web->attributes();
    $total = $total['totalhits'];               // total matches, used for paging
    $results = $results->resultset_web->result; // the individual result elements

    // Prepare the spelling-suggestion URL (it is requested further down)
    $url = "$yhost/ysearch/spelling/v1/$thequery?appid=$apikey&format=xml";

We can loop through our results in our view. Each item in $results contains title, clickurl, dispurl and abstract. I use the short tags: <? is short for <?php, and <?= is shorthand for <?php echo (the <? form needs short_open_tag enabled in php.ini).

PHP:
    <? $i=1; foreach($results as $result): if(!empty($result->title)): ?>
        <li>
            <span class="count">#<?= $i++; ?>:</span>
            <? // Escape double quotes so the title is safe inside the title="" attribute ?>
            <? $title = str_replace('"', '&quot;', $result->title); ?>
            <a class="search_result" title="<?= $title; ?>" href="<?= $result->clickurl; ?>">
                <?= $title; ?>
            </a><br />
            <span class="abstract">
                <? // Drop the bold highlighting from the domain name and the ellipsis ?>
                <?= str_replace(
                    array('<b>yourdomain.com</b>', '<b>YourDomain.com</b>', '<b>...</b>'),
                    array('yourdomain.com', 'YourDomain.com', '...'),
                    $result->abstract
                ); ?>
            </span><br />
            <div class="url"><?= strip_tags($result->dispurl); ?></div>
        </li>
    <? endif; endforeach; ?>
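
To see what that loop is iterating over, here is a small sketch that parses a hand-written sample response. The element names (resultset_web, totalhits, result, title, clickurl, dispurl, abstract) are taken from the code above; the root element name, the CDATA wrapping and all of the values are assumptions made up for the example, not a captured BOSS response.

```php
<?php
// Hand-written sample: the shape is inferred from the code above, the values are invented.
$sampleXml = <<<XML
<ysearchresponse responsecode="200">
  <resultset_web count="2" start="0" totalhits="42">
    <result>
      <abstract><![CDATA[A <b>jQuery</b> datepicker write-up.]]></abstract>
      <clickurl>http://yourdomain.com/click/1</clickurl>
      <dispurl><![CDATA[<b>yourdomain.com</b>/datepicker]]></dispurl>
      <title><![CDATA[<b>jQuery</b> Datepicker]]></title>
    </result>
    <result>
      <abstract><![CDATA[Notes on CakePHP search.]]></abstract>
      <clickurl>http://yourdomain.com/click/2</clickurl>
      <dispurl><![CDATA[<b>yourdomain.com</b>/search]]></dispurl>
      <title><![CDATA[CakePHP Search]]></title>
    </result>
  </resultset_web>
</ysearchresponse>
XML;

// LIBXML_NOCDATA turns the CDATA sections into ordinary text nodes.
$response = new SimpleXMLElement($sampleXml, LIBXML_NOCDATA);

// Attributes can be read with array syntax; cast to get a plain integer.
$total = (int) $response->resultset_web['totalhits'];

// Copy the fields the view uses into plain arrays of strings.
$rows = array();
foreach ($response->resultset_web->result as $result) {
    $rows[] = array(
        'title'   => strip_tags((string) $result->title),
        'url'     => (string) $result->clickurl,
        'display' => strip_tags((string) $result->dispurl),
    );
}

echo "$total total hits\n";
foreach ($rows as $row) {
    echo $row['title'], ' -> ', $row['url'], "\n";
}
```

Casting each field to a string up front means the view no longer has to care that the values started life as SimpleXMLElement objects.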

Next, we want to get any suggestions in case they spelled something wrong. This is obviously optional, but I think it is key to search. People are going to misspell things (including me). Using the spelling API, look up the suggested search term with the $url we prepared at the end of the first snippet.

PHP:
    // Get the suggested term
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    $data = curl_exec($ch);
    curl_close($ch);
    $suggest = new SimpleXmlElement($data, LIBXML_NOCDATA);
    $suggest_term = $suggest->resultset_spell->result->suggestion;
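
The snippet above assumes the request succeeded and that a suggestion came back. A more defensive version might look like the sketch below; extractSuggestion is a hypothetical helper name, and the sample XML in the usage line is invented to match the element names used above.

```php
<?php
// Hypothetical helper: returns the suggested term, or null when the spelling
// response is missing, malformed, or contains no suggestion.
function extractSuggestion($xml)
{
    // curl_exec() returns false on failure, so reject anything that is not a non-empty string.
    if (!is_string($xml) || $xml === '') {
        return null;
    }

    // Collect libxml errors internally instead of emitting warnings on bad XML.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc === false || !isset($doc->resultset_spell->result->suggestion)) {
        return null;
    }

    $term = trim((string) $doc->resultset_spell->result->suggestion);

    return $term === '' ? null : $term;
}

$sample = '<ysearchresponse><resultset_spell count="1"><result>'
        . '<suggestion>jquery</suggestion></result></resultset_spell></ysearchresponse>';
var_dump(extractSuggestion($sample));
```

Returning null for every failure case keeps the view's !empty() check working unchanged.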

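Then it is easy to display the suggestion in our view. Note the variable name: the controller sets suggest_term, but CakePHP 1.2's set() camelizes names passed to it as an array, so the view gets $suggestTerm.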

PHP:
    <? if(!empty($suggestTerm)): ?>
        <strong>Did you mean: <?= $html->link($suggestTerm, '/searches?q=' . $suggestTerm); ?></strong>
    <? endif; ?>
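
One thing to watch: a suggestion with spaces or punctuation in it should be URL-encoded before it goes into the query string. As a rough plain-PHP sketch of the same link without the Cake helper (didYouMeanLink and its default path are hypothetical, chosen to match the /searches URL used above):

```php
<?php
// Hypothetical helper: builds the "Did you mean" anchor by hand.
function didYouMeanLink($term, $basePath = '/searches')
{
    // urlencode() makes the term safe inside the query string...
    $href = $basePath . '?q=' . urlencode($term);

    // ...and htmlspecialchars() makes both pieces safe inside the HTML.
    return '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
         . htmlspecialchars($term, ENT_QUOTES, 'UTF-8') . '</a>';
}

echo didYouMeanLink('jquery ui'), "\n";
```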

I also implemented paging in the view. This is probably pretty crappy code, written very close to 5AM, so you may want to do this your own way. It uses $start (from the URL) and $total (from the search response) to work out the page links and which page you are on.

PHP:
    <?php
    // Only show the pager when there is more than one page of results
    if ($total > 10):

        // "Prev" link: page 2 goes back to the plain URL, later pages step back by 10.
        // $self is the URL of the search page; it is not set in the controller shown
        // below, so it is assumed to be defined elsewhere.
        if ($start > 0):

            if ($start == 10)
                $startParam = '';
            else
                $startParam = "&start=" . ($start - 10);

            echo $html->link('Prev', $self . "?q=$searchTerm" . $startParam) . "&nbsp;&nbsp; ";
        endif;

        // Numbered links: up to four pages either side of the current one
        $page = 0;
        for ($i=0; $i<$total; $i=$i+10) {
            $page++;

            if ($i > ($start-50) && $i < ($start+50)) {
                if ($i == $start) {
                    echo " $page ";
                } else if ($i === 0) {
                    echo $html->link($page, $self . "?q=$searchTerm");
                } else {
                    echo $html->link($page, $self . "?q=$searchTerm&start=$i");
                }
            } else if ($i == ($start-60)) {
                echo '... ';
            } else if ($i == ($start+60)) {
                echo '... '; break;
            }
        }
        echo " &nbsp;&nbsp;";

        // "Next" link: a full page of 10 results means there may be more
        if (count( $results ) === 10):
            echo $html->link('Next', $self . "?q=$searchTerm&start=$next");
        endif;

        $countPages = ceil($total / 10);
        echo "&nbsp;&nbsp; ($total results on $countPages pages)";
    endif;
    ?>
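
If you do want to do the paging your own way, one option is to pull the window calculation out of the view so it can be checked on its own. The sketch below is not an exact port of the loop above: it keeps the same idea (four pages either side of the current one) but shows an ellipsis whenever pages are hidden. The function name pagerItems and the array keys are made up for illustration.

```php
<?php
// Hypothetical helper: returns the pager as a list of items instead of echoing links.
// Page items have an integer 'label'; gaps have the string label '...'.
function pagerItems($start, $total, $perPage = 10, $side = 4)
{
    $start = (int) $start;   // comes from the URL as a string
    $total = (int) $total;   // comes from the XML as a SimpleXMLElement

    $current = (int) floor($start / $perPage) + 1;
    $last    = (int) ceil($total / $perPage);
    $first   = max(1, $current - $side);
    $end     = min($last, $current + $side);

    $items = array();
    if ($first > 1) {
        $items[] = array('label' => '...', 'start' => null, 'current' => false);
    }
    for ($page = $first; $page <= $end; $page++) {
        $items[] = array(
            'label'   => $page,
            'start'   => ($page - 1) * $perPage, // value for the start parameter
            'current' => $page === $current,
        );
    }
    if ($end < $last) {
        $items[] = array('label' => '...', 'start' => null, 'current' => false);
    }

    return $items;
}

foreach (pagerItems(100, 500) as $item) {
    echo $item['current'] ? '[' . $item['label'] . '] ' : $item['label'] . ' ';
}
echo "\n";
```

The view then only has to loop over the items and decide whether each one becomes a link, plain text, or an ellipsis.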

Ok, putting it all together, here is the full source.

Full source of controllers/searches_controller.php

PHP:
    <?php
    class SearchesController extends AppController {
        var $name = 'Searches';
        var $uses = array(); // no models needed

        public function index() {

            // Read the search term and paging offset from the query string
            $search_term = isset($this->params['url']['q']) ? $this->params['url']['q'] : null;
            $start = isset($this->params['url']['start']) ? $this->params['url']['start'] : 0;

            if (!empty($search_term)) {

                // Gather data and prepare the query
                $thequery = urlencode($search_term);
                $yhost = 'http://boss.yahooapis.com';
                $apikey = 'your_API_key';
                $site = 'site:yourdomain.com';
                $url = "$yhost/ysearch/web/v1/$thequery+$site?appid=$apikey&format=xml";
                if (!empty($start)) $url .= "&start=$start";

                // Get the search results
                $ch = curl_init($url);
                curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
                curl_setopt($ch, CURLOPT_HEADER, 0);
                $data = curl_exec($ch);
                curl_close($ch);
                $results = new SimpleXmlElement($data, LIBXML_NOCDATA);
                $total = $results->resultset_web->attributes();
                $total = $total['totalhits'];
                $results = $results->resultset_web->result;

                $url = "$yhost/ysearch/spelling/v1/$thequery?appid=$apikey&format=xml";

                // Get the suggested term
                $ch = curl_init($url);
                curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
                curl_setopt($ch, CURLOPT_HEADER, 0);
                $data = curl_exec($ch);
                curl_close($ch);
                $suggest = new SimpleXmlElement($data, LIBXML_NOCDATA);
                $suggest_term = $suggest->resultset_spell->result->suggestion;

                // Offset of the next page, then hand everything to the view
                $next = $start + 10;
                $this->set(compact('search_term', 'results', 'suggest_term', 'start', 'next', 'total'));

            }
        }

    }

    ?>

Full source of views/searches/index.ctp

PHP:
    <?php if (!empty($searchTerm)): ?>

    <?php
        $this->pageTitle = "Search results for, \"$searchTerm\"";
    ?>

        <h2>Search results for, "<?= $searchTerm; ?>"</h2>
        <? if(!empty($suggestTerm)): ?>
            <strong>Did you mean: <?= $html->link($suggestTerm, '/searches?q=' . $suggestTerm); ?></strong>
        <? endif; ?>
        <div class="container search_results">
            <? if(empty($results)): ?>
                No results found for "<?= $searchTerm; ?>".
            <? else: ?>
                <h4>Results <?= ($start+1) . '-' . (count($results)+$start) . ' of ' . $total; ?></h4>
                <ul>
                    <? $i=$start+1; foreach($results as $result): if(!empty($result->title)): ?>
                        <li>
                            <span class="count">#<?= $i++; ?>:</span>
                            <? $title = str_replace('"', '&quot;', $result->title); ?>
                            <a class="search_result" title="<?= $title; ?>" href="<?= $result->clickurl; ?>">
                                <?= $title; ?>
                            </a><br />
                            <span class="abstract">
                                <?= str_replace(
                                    array('<b>yourdomain.com</b>', '<b>YourDomain.com</b>', '<b>...</b>'),
                                    array('yourdomain.com', 'YourDomain.com', '...'),
                                    $result->abstract
                                ); ?>
                            </span><br />
                            <div class="url"><?= strip_tags($result->dispurl); ?></div>
                        </li>
                    <? endif; endforeach; ?>
                </ul>

                <?

                if ($total > 10):
                    if ($start > 0):

                        if ($start == 10)
                            $startParam = '';
                        else
                            $startParam = "&start=" . ($start - 10);

                        echo $html->link('Prev', $self . "?q=$searchTerm" . $startParam) . "&nbsp;&nbsp; ";
                    endif;

                    $page = 0;
                    for ($i=0; $i<$total; $i=$i+10) {
                        $page++;

                        if ($i > ($start-50) && $i < ($start+50)) {
                            if ($i == $start) {
                                echo " $page ";
                            } else if ($i === 0) {
                                echo $html->link($page, $self . "?q=$searchTerm");
                            } else {
                                echo $html->link($page, $self . "?q=$searchTerm&start=$i");
                            }
                        } else if ($i == ($start-60)) {
                            echo '... ';
                        } else if ($i == ($start+60)) {
                            echo '... '; break;
                        }
                    }
                    echo " &nbsp;&nbsp;";

                    if (count( $results ) === 10):
                        echo $html->link('Next', $self . "?q=$searchTerm&start=$next");
                    endif;

                    $countPages = ceil($total / 10);
                    echo "&nbsp;&nbsp; ($total results on $countPages pages)";
                endif;
                ?>
            <? endif; ?>
        </div>

    <?php endif; ?>

Comments

#1. John S. on 17/7/08
I did a search for jQuery and when I click on the pager links, didn't get any results past the first page. Also, in the search results the green url looks clickable to me, but it's not a link.

Great example though! Thanks
#2. Marc Grabanski on 17/7/08
John S: Thanks for letting me know, it is fixed now. It was a very minor parameter naming issue.
#3. Herus Armstrong on 17/7/08
Omg! Very useful and... omfg how could it be so short code!!? Cake is impressive!