Thoughts on Lucene, Solr and ElasticSearch 

Send response to client in PHP and continue processing

Posted by Kelvin on 03 Feb 2014 | Tagged as: PHP

Here's one way to send the response, close the connection to the client, and let the PHP script continue running, presumably to do some time-consuming work:

<?php
ob_end_clean();
header("Connection: close\r\n");
header("Content-Encoding:

...

Mapping Alt-PgUp and Alt-PgDown to Home and End in Ubuntu

Posted by Kelvin on 12 Nov 2013 | Tagged as: Ubuntu

On my Lenovo T530 laptop, the PgUp and PgDown keys are right next to the arrow keys, which makes for very smooth code navigation. Unfortunately, the Home and End keys are far away, above the Backspace key to be precise. ...

[SOLVED] gedit Invalid byte sequence in conversion input

Posted by Kelvin on 07 Nov 2013 | Tagged as: Ubuntu

I've been tearing my hair out lately trying to open UTF-8 encoded text files in gedit (Ubuntu 12.04). For some reason, the auto charset detection mechanism is broken. Opening the same files using gvim or leafpad just works. Googling for ...

[solved] Tomcat 6 UTF-8 encoding issue

Posted by Kelvin on 08 Oct 2013 | Tagged as: programming

If you've followed all the instructions in the Tomcat docs for enabling UTF-8 support (http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q8) and you still run into UTF-8 issues, and your webapp involves reading and displaying the contents of files, give this a whirl.

In catalina.sh, either ...

Phrase-based Out-of-order Solr Autocomplete Suggester

Posted by Kelvin on 16 Sep 2013 | Tagged as: Lucene / Solr / Elastic Search / Nutch

Solr has a number of Autocomplete implementations that are great for most purposes. However, a client of mine recently had some fairly specific requirements for autocomplete:

1. phrase-based substring matching
2. out-of-order matches ('foo bar' should match 'the bar ...

Guava Tables

Posted by Kelvin on 13 Sep 2013 | Tagged as: programming

Just discovered Guava's Table data structure. Whoa..!

https://code.google.com/p/guava-libraries/wiki/NewCollectionTypesExplained

Table<Vertex, Vertex, Double> weightedGraph = HashBasedTable.create();
// Values must be double literals: an int such as 4 will not autobox to Double.
weightedGraph.put(v1, v2, 4.0);
weightedGraph.put(v1, v3, 20.0);
weightedGraph.put(v2, v3, 5.0);

weightedGraph.row(v1); //

...
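For a feel of what Table saves you from writing, here's a rough stdlib-only sketch of the same weighted graph as a map of maps, which is the structure a Table<R, C, V> conceptually replaces. This is not Guava's implementation: the NestedMapTable class and the String vertex names are my own assumptions for illustration, and unlike Guava's live views, column() here returns a copy.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Table<String, String, Double>, built on nested HashMaps.
// Vertices are plain Strings here (an assumption; the snippet above uses a Vertex type).
public class NestedMapTable {
    // row key -> (column key -> value)
    private final Map<String, Map<String, Double>> cells = new HashMap<>();

    // Roughly what Table.put(rowKey, columnKey, value) does.
    public void put(String row, String column, double value) {
        cells.computeIfAbsent(row, k -> new HashMap<>()).put(column, value);
    }

    // Roughly Table.row(rowKey): every column -> value entry in one row.
    // Returns an empty map for an unknown row.
    public Map<String, Double> row(String row) {
        return cells.getOrDefault(row, new HashMap<>());
    }

    // Roughly Table.column(columnKey): every row -> value entry in one column.
    // This scans all rows and returns a copy, not a live view.
    public Map<String, Double> column(String column) {
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Map<String, Double>> entry : cells.entrySet()) {
            Double value = entry.getValue().get(column);
            if (value != null) {
                result.put(entry.getKey(), value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        NestedMapTable graph = new NestedMapTable();
        graph.put("v1", "v2", 4.0);
        graph.put("v1", "v3", 20.0);
        graph.put("v2", "v3", 5.0);

        System.out.println("edges out of v1: " + graph.row("v1"));
        System.out.println("edges into v3:   " + graph.column("v3"));
    }
}
```

The bookkeeping in put() and the full scan in column() are exactly the kind of boilerplate that Table hides behind a single two-key interface.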

Custom Solr QueryParsers for fun and profit

Posted by Kelvin on 09 Sep 2013 | Tagged as: Lucene / Solr / Elastic Search / Nutch

In this post, I'll show you what you need to do to implement a custom Solr QueryParser.

Step 1

Extend QParserPlugin.

public class TestQueryParserPlugin extends QParserPlugin {
  public

...

High-level overview of Latent Semantic Analysis / LSA

Posted by Kelvin on 09 Sep 2013 | Tagged as: Lucene / Solr / Elastic Search / Nutch, programming

I've just spent the last couple days wrapping my head around implementing Latent Semantic Analysis, and after wading through a number of research papers and quite a bit of linear algebra, I've finally emerged on the other end, and thought ...

Naive Solr Did You Mean re-searcher SearchComponent

Posted by Kelvin on 05 Sep 2013 | Tagged as: Lucene / Solr / Elastic Search / Nutch

Solr makes Spellcheck easy. Super-easy in fact. All you need to do is to change some stuff in solrconfig.xml, and voila, spellcheck suggestions!

However, that's not how Google does spellchecking. What Google does is determine if the query has ...

Reading ElasticSearch server book...

Posted by Kelvin on 23 May 2013 | Tagged as: Lucene / Solr / Elastic Search / Nutch

Just got my hands on a review copy of PacktPub's ElasticSearch Server book, which I believe is the first ES book on the market.

Review to follow shortly.


10/02/2014 | Kelvin Tan