Thoughts on Lucene, Solr, crawling and vertical search 

Painless CRUD in PHP via AjaxCrud

Posted by Kelvin on 08 Oct 2011 | Tagged as: PHP, programming

I recently discovered an Ajax CRUD library that makes CRUD operations positively painless: AjaxCRUD.

Its features include:

- displays the list in an inline-editable table
- generates a create form
- handles all operations (add, edit, delete) via Ajax

What's new in Solr 3.4.0

Posted by Kelvin on 06 Oct 2011 | Tagged as: Lucene / Solr / Elastic Search / Nutch

If you are already using Apache Solr 3.1, 3.2 or 3.3, you are strongly advised to upgrade to 3.4.0, which fixes an index corruption bug triggered by an OS or computer crash or power loss (LUCENE-3418).

Solr 3.4.0 release …

Introducing SolrTutorial.com

Posted by Kelvin on 02 Oct 2011 | Tagged as: Lucene / Solr / Elastic Search / Nutch

Just launched a Solr tutorial website, a site styled after my LuceneTutorial.com but tailored towards Solr users.

It also includes high-level overviews to Solr for non-programmers, such as Solr for Managers and Solr for SysAdmins.

Delete directories older than x days

Posted by Kelvin on 04 Aug 2011 | Tagged as: Ubuntu

Great for cleaning up log directories.

find . -maxdepth 1 -mtime +14 -type d -exec rm -fr {} \;
 

Change 14 to the required age in days.
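
Since rm -fr is unforgiving, a dry run that only lists the matching directories can be worth doing first. Here is a minimal sketch, not part of the original one-liner: the function name list_old_dirs is hypothetical, and -mindepth 1 is my own addition so that the starting directory itself is never matched.

```shell
# Hypothetical helper: list (but do not delete) directories directly under
# a base directory whose modification time is more than N days old.
#   $1 = base directory (defaults to the current directory)
#   $2 = age in days     (defaults to 14)
list_old_dirs() {
    local base="${1:-.}" days="${2:-14}"
    # -mindepth 1 skips the base directory itself (an assumption on my part;
    # the original command does not include it).
    find "$base" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
}

# Example: list_old_dirs /var/log/myapp 14
```

Once the listing looks right, the same find expression can be run with the -exec rm -fr {} \; action shown above.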

HOWTO: Collect WebDriver HTTP Request and Response Headers

Posted by Kelvin on 22 Jun 2011 | Tagged as: crawling, Lucene / Solr / Elastic Search / Nutch, programming

WebDriver is a fantastic Java API for web application testing. It has recently been merged into the Selenium project to provide a friendlier API for programmatic simulation of web browser actions. Its unique property is that of executing web pages …

Solr 3.2 released!

Posted by Kelvin on 22 Jun 2011 | Tagged as: crawling, Lucene / Solr / Elastic Search / Nutch, programming

I'm a little slow off the blocks here, but I just wanted to mention that Solr 3.2 has been released!

Get your download here: http://www.apache.org/dyn/closer.cgi/lucene/solr

Solr 3.2 release highlights include

  • Ability to specify overwrite and commitWithin


Classical learning curves for some editors

Posted by Kelvin on 20 Jun 2011 | Tagged as: programming

PHP function to send an email with file attachment

Posted by Kelvin on 11 Jun 2011 | Tagged as: PHP, programming

Courtesy of http://www.finalwebsites.com/forums/topic/php-e-mail-attachment-script

function mail_attachment($filename, $path, $mailto, $from_mail, $from_name, $replyto, $subject, $message) {
    $file = $path.$filename;
    $file_size


Determine if a server supports Gzip compression

Posted by Kelvin on 06 Jun 2011 | Tagged as: Ubuntu

echo "Size WITHOUT accepting gzip"
curl http://www.supermind.org --silent --write-out "size_download=%{size_download}\n" --output /dev/null
echo "Size WITH accepting gzip"
curl http://www.supermind.org --silent -H "Accept-Encoding: gzip,deflate" --write-out "size_download=%{size_download}\n" --output
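
The same comparison can be wrapped in a small function that prints both sizes and a verdict. This is only a sketch: check_gzip is a hypothetical name, and treating a smaller download with Accept-Encoding set as a sign of compression is a heuristic rather than proof.

```shell
# Hypothetical helper: fetch a URL twice, once without and once with an
# Accept-Encoding header, and compare the number of bytes downloaded.
#   $1 = URL to test
check_gzip() {
    local url="$1"
    local plain gzipped
    # curl does not decompress unless asked to, so size_download reflects
    # the bytes actually sent over the wire.
    plain=$(curl --silent --output /dev/null \
        --write-out '%{size_download}' "$url")
    gzipped=$(curl --silent --output /dev/null \
        --write-out '%{size_download}' \
        -H 'Accept-Encoding: gzip,deflate' "$url")
    echo "plain=${plain} bytes, gzip=${gzipped} bytes"
    # Empty results (for example, a failed request) are treated as 0.
    if [ "${gzipped:-0}" -lt "${plain:-0}" ]; then
        echo "compression appears to be enabled"
    else
        echo "no size reduction observed"
    fi
}

# Example: check_gzip http://www.supermind.org
```

A very small page may not shrink at all when compressed, so it is worth testing against a reasonably large text resource.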


HOWTO: Add gzip support to Squid 3.1 in Ubuntu

Posted by Kelvin on 06 Jun 2011 | Tagged as: Ubuntu

The squid3 deb available in the apt repos doesn't come configured with eCAP support, which is required to serve gzip-compressed pages to clients.

In a network environment where the majority of traffic is wireless (like where …


05/19/2012 | Kelvin Tan | Lucene Solr Crawl Consultant