
Fast Full Text Search

15/05/10

12:56:46 pm, Categories: Website design, General IT articles


We have a large database with tens of thousands of records. Each record has several text fields that need to be searchable. All the records are stored in MySQL.

Initially we used LIKE "%keyword%" queries. However, a search for a common word such as 'flower' took 40s and the server load was extremely high. This is not acceptable on a public website.

Next, we rewrote the code to use MySQL's full text search feature. The result was still disappointing: the same search still took half a minute.

We then discovered a search component in Zend Framework called Zend Lucene Search (Zend_Search_Lucene). We had heard about Lucene before, as people at Yell.com use this technology to power their search.

Here is a quick guide on how to get Zend Lucene running for your website:

1. Get & install Zend framework
The minimal version of Zend Framework is sufficient.
http://framework.zend.com/download/current/

Unzip the file and put it under /usr/local/lib/zend

Update the include_path setting in your php.ini so that it includes the Zend Framework directory.
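
If you cannot edit php.ini (on shared hosting, for example), the include path can also be extended at runtime. The following is only a sketch: addToIncludePath() is a hypothetical helper name, and the directory passed in is assumed to be the one that directly contains the "Zend" folder, which may be a subdirectory of where you unzipped the framework.

```php
<?php
// Hypothetical helper (not part of Zend Framework): prepend a library
// directory to PHP's include_path for the current request only.
function addToIncludePath($dir)
{
    $paths = explode(PATH_SEPARATOR, get_include_path());

    // Avoid adding the same directory twice.
    if (!in_array($dir, $paths, true)) {
        array_unshift($paths, $dir);
        set_include_path(implode(PATH_SEPARATOR, $paths));
    }

    return get_include_path();
}

// Assumption: this directory contains the "Zend" folder, so that
// require_once('Zend/Search/Lucene.php') can be resolved.
addToIncludePath('/usr/local/lib/zend');
```

Call this before the first require_once of any Zend file. Setting include_path in php.ini remains the simpler option when you control the server.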

2. Build the index
In our case, we use DB_DataObject from the PEAR library to fetch the records from the database and then use Zend Lucene to build the index.

PHP:

// Create the index
require_once('Zend/Search/Lucene.php');
require_once('DB/DataObject.php'); // PEAR DB_DataObject; its configuration is assumed to be done elsewhere

// We want numbers to be searchable as well as text
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
    new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive());

// $zend_indexPath must point to a writable directory for the index files
$index = Zend_Search_Lucene::create($zend_indexPath);

$photo = DB_DataObject::factory('Photo');
$photo->find();

while ($photo->fetch())
{
    $doc = new Zend_Search_Lucene_Document();

    // UnIndexed: stored with the document and returned with each hit, but not searchable
    $doc->addField(
        Zend_Search_Lucene_Field::UnIndexed('id', $photo->id));

    // UnStored: searchable, but the text itself is not kept in the index
    $doc->addField(
        Zend_Search_Lucene_Field::UnStored('code', stripslashes($photo->code)));

    // Text: searchable and stored, so it can be shown in the results
    $doc->addField(
        Zend_Search_Lucene_Field::Text('name', stripslashes($photo->name)));

    $doc->addField(
        Zend_Search_Lucene_Field::UnStored('detail', stripslashes($photo->detail)));

    $doc->addField(
        Zend_Search_Lucene_Field::UnStored('keywords', stripslashes($photo->keywords)));

    $doc->addField(
        Zend_Search_Lucene_Field::UnIndexed('url', "domain/index.php?id=$photo->id"));

    // Add the document to the index
    $index->addDocument($doc);
}

$index->commit();

// Optimize the index
$index->optimize();

3. Update the index
You can run the code above as a daily cron job to keep your index up to date.
For finer control, you can remove a changed record from the index and then add it back.
In our case, the database does not change frequently, so we simply rebuild the index daily.
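
The finer-grained approach (remove a changed record, then add it back) could look roughly like the sketch below. It is not what we run ourselves. refreshRecord() and the 'record_id' field name are hypothetical, and the sketch assumes the record key was indexed as a Keyword field when the index was built. The listing above stores 'id' as UnIndexed, which cannot be used to look a document up.

```php
<?php
// Sketch only: refresh a single record in an existing index instead of
// rebuilding the whole index.
// Assumptions: 'record_id' is a hypothetical Keyword field holding the
// database key, and $fields maps field names to the text to index.
function refreshRecord($indexPath, $recordId, array $fields)
{
    require_once 'Zend/Search/Lucene.php';

    // open() works on an existing index; create() would start an empty one.
    $index = Zend_Search_Lucene::open($indexPath);

    // Find the internal document numbers for this record and delete them.
    $term = new Zend_Search_Lucene_Index_Term((string) $recordId, 'record_id');
    foreach ($index->termDocs($term) as $docId) {
        $index->delete($docId);
    }

    // Add the fresh version of the record back.
    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::Keyword('record_id', (string) $recordId));
    foreach ($fields as $name => $text) {
        $doc->addField(Zend_Search_Lucene_Field::UnStored($name, $text));
    }
    $index->addDocument($doc);
    $index->commit();
}
```

Lucene has no in-place update, so deleting and re-adding is the usual pattern. For a database that rarely changes, the daily rebuild is simpler and gives the same result.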

4. Run a query
The Lucene query language is simple and easy to use. In the query "+flower", the + prefix marks 'flower' as a required term.

PHP:

require_once('Zend/Search/Lucene.php');

// Use the same analyzer as at index time so numbers are searchable as well
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
    new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive());

// Open the existing index; create() would start a new, empty one
$index = Zend_Search_Lucene::open($zend_indexPath);
$results = $index->find("+flower");

// count($results) is the number of hits; $index->count() is the size of the whole index
echo count($results) . " records found.\n\n";

foreach ($results as $result)
{
    // name and url were stored in the index, so they are available on each hit
    echo "<a href='$result->url'>$result->name</a><br/>";
}
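
On a public website the search words come from a form rather than a fixed string, and characters such as : or * have a special meaning in the Lucene query syntax. The sketch below shows one possible way to turn free text into a query where every word is required. buildRequiredTermsQuery() is a hypothetical helper, not part of Zend Framework, and it simply drops the special characters rather than escaping them. It does not handle operator words such as AND, OR and NOT.

```php
<?php
// Hypothetical helper: turn free-text user input into a Lucene query
// string in which every word is required (prefixed with +).
function buildRequiredTermsQuery($input)
{
    // Replace characters that are special in the Lucene query syntax
    // (+ - & | ! ( ) { } [ ] ^ " ~ * ? : \) with spaces.
    $clean = preg_replace('/[+\-&|!(){}\[\]^"~*?:\\\\]/', ' ', $input);

    // Split on whitespace and ignore empty pieces.
    $words = preg_split('/\s+/', trim($clean), -1, PREG_SPLIT_NO_EMPTY);

    $terms = array();
    foreach ($words as $word) {
        $terms[] = '+' . $word;
    }

    return implode(' ', $terms);
}

echo buildRequiredTermsQuery('red flower'), "\n"; // prints "+red +flower"
```

The returned string can be passed to $index->find() in place of the fixed "+flower" query. An empty string is returned when the input contains no usable words, so check for that before searching.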

The beauty is that MySQL's processing power is not needed at all when a user runs a query. The same search was cut from 30s to just 1s. The improvement is amazing.

We can also use this technology to build our own search engine for a large website. Lucene can index HTML, Excel, PDF and Word documents as well as database records.

There is also a plugin for highlighting search results.

