Supermind Search Consulting Blog 
Solr - Elasticsearch - Big Data

Posts about programming

Change Redmine homepage to project page

Posted by Kelvin on 15 Oct 2010 | Tagged as: programming

Edit config/routes.rb

Change map.home to look like this:

map.home '', :controller => 'projects', :action => 'show', :id => 'projectname'

How to copy a Subversion directory to another repo preserving revision history

Posted by Kelvin on 15 Oct 2010 | Tagged as: programming

svnadmin dump /svn/old_repos > ./repository.dump
svndumpfilter include path/to/docs path/to/anotherdir --drop-empty-revs --renumber-revs --preserve-revprops < ./repository.dump > ./docs_only.dump
svnadmin load /svn/new_repos < ./docs_only.dump

Courtesy of stackoverflow.

Note that the switches --drop-empty-revs and --renumber-revs will change revision numbers. Omit these switches if preserving the original revision numbers is important to you.

How to serve a Subversion repository from Apache

Posted by Kelvin on 14 Oct 2010 | Tagged as: programming

There are a lot of tutorials out there on how to get Apache working with Subversion.

I think the most coherent instructions can be found at http://csoft.net/docs/svndav.html.en

If you use Dreamhost like I do, and like the easy way in which Subversion repos are set up there, then you'll like what you see in those instructions.

URLizer: a WordPress plugin to automatically linkify URLs

Posted by Kelvin on 12 Oct 2010 | Tagged as: programming, PHP

Am I the only guy using WordPress who is too lazy to type out anchors?

Well, I've been using a WordPress plugin I wrote to automagically linkify URLs for a number of years now, and finally decided to add it to Google Code.

So here it is! http://code.google.com/p/urlizer/

Make Eclipse more like Intellij Idea

Posted by Kelvin on 12 Oct 2010 | Tagged as: programming

http://byteco.de/2010/08/03/making-eclipse-like-idea/ has 2 excellent tips for making Eclipse more like Intellij. 

Most important is Intellij keyboard mappings for Eclipse: http://code.google.com/p/ideakeyscheme/updates/list

Copy to your Eclipse/plugins folder and restart Eclipse. Then change the keyboard scheme to intellij.

Run php from html files on Dreamhost

Posted by Kelvin on 10 Oct 2010 | Tagged as: programming, PHP

Modify .htaccess to include this:

Correct

AddType php-cgi .html .htm

WRONG

AddType application/x-httpd-php .php .htm .html

or

AddHandler application/x-httpd-php .html

Upgrade your HTC Droid Eris to Android 2.2

Posted by Kelvin on 10 Oct 2010 | Tagged as: programming, android

Why bother upgrading? 2 simple reasons: USB and wifi tethering.

Instructions courtesy of my friend Jack:

Step 1) Do a complete backup of your SD card data (just in case)

Step 2) Root your phone
– Go to http://forum.xda-developers.com/showthread.php?t=742228 and follow the instructions

Step 3) Do a Nand backup
– Make sure you have >= 500 MB free on your SD card
– With phone off, hold Power + Volume Up to boot into recovery
– Choose the Nand backup option
– Copy the nandroid folder from your SD card to your computer

Step 4) Load the ROM of your choice
– Download a ROM (I recommend http://forum.xda-developers.com/showthread.php?t=745603)
– Follow the directions on the ROM page, which will generally include:
– Put the ROM in the root dir of your SD card
– Reboot phone into recovery mode (like in step 3)
– Wipe data / factory reset, and wipe Dalvik-cache
– Flash zip from SD card
– Wait for a long time

If you have difficulty getting to the recovery console, like I did, try Volume Down + Power instead.

[SOLVED] Howto build the PHP rrdtool extension

Posted by Kelvin on 09 Oct 2010 | Tagged as: programming, Ubuntu, PHP

The definitive answer is here: http://www.samtseng.liho.tw/~samtz/blog/2009/03/11/howto-build-the-php-rrdtool-extension/

If you're on Ubuntu, do this first:

sudo apt-get install rrdtool librrd-dev php5-dev

Then follow the steps above.

[SOLVED] curl: (56) Received problem 2 in the chunky parser

Posted by Kelvin on 09 Oct 2010 | Tagged as: programming, crawling, PHP

The problem is described here:

http://curl.haxx.se/mail/lib-2006-04/0046.html

I successfully tracked the problem to the "Connection:" header. It seems that
if the "Connection: keep-alive" request header is not sent the server will
respond with data which is not chunked . It will still reply with a
"Transfer-Encoding: chunked" response header though.
I don't think this behavior is normal and it is not a cURL problem. I'll
consider the case closed but if somebody wants to make something about it I
can send additional info and test it further.

The workaround is simple: have curl use HTTP version 1.0 instead of 1.1.

In PHP, add this:

curl_setopt($curl_handle, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_0 );

How to write a custom Solr FunctionQuery

Posted by Kelvin on 03 Sep 2010 | Tagged as: programming, Lucene / Solr / Elasticsearch / Nutch

Solr FunctionQueries allow you to modify the ranking of a search query in Solr by applying functions to the results.

There is a list of out-of-the-box FunctionQueries available here: http://wiki.apache.org/solr/FunctionQuery

In order to write a custom Solr FunctionQuery, you'll need to do 3 things:

1. Subclass org.apache.solr.search.ValueSourceParser. Here's a stub ValueSourceParser.

public class MyValueSourceParser extends ValueSourceParser {
  public void init(NamedList namedList) {
  }
 
  public ValueSource parse(FunctionQParser fqp) throws ParseException {
    return new MyValueSource();
  }
}

2. In solrconfig.xml, register your new ValueSourceParser directly under the <config> tag

<valueSourceParser name="myfunc" class="com.mycompany.MyValueSourceParser" />

3. Subclass org.apache.solr.search.ValueSource and instantiate it in your ValueSourceParser.parse() method.

Let's take a look at 2 ValueSource implementations to see what they do, starting with the simplest:

org.apache.solr.search.function.ConstValueSource

Example SolrQuerySyntax: _val_:1.5

It simply returns a float value.

public class ConstValueSource extends ValueSource {
  final float constant;
 
  public ConstValueSource(float constant) {
    this.constant = constant;
  }
 
  public DocValues getValues(Map context, IndexReader reader) throws IOException {
    return new DocValues() {
      public float floatVal(int doc) {
        return constant;
      }
      public int intVal(int doc) {
        return (int)floatVal(doc);
      }
      public long longVal(int doc) {
        return (long)floatVal(doc);
      }
      public double doubleVal(int doc) {
        return (double)floatVal(doc);
      }
      public String strVal(int doc) {
        return Float.toString(floatVal(doc));
      }
      public String toString(int doc) {
        return description();
      }
    };
  }
// commented out some boilerplate stuff
}

As you can see, the important method is DocValues getValues(Map context, IndexReader reader). The gist of the method is to return a DocValues object, which in turn returns a value for a given document id.
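To make that contract concrete without pulling in Solr, here is a minimal sketch in plain Java. FloatValues is a hypothetical stand-in for DocValues (it is not a Solr class), and the two factory methods are illustrative names only: one ignores the document id the way ConstValueSource does, the other looks a value up per document the way a field-backed source would.

```java
// Solr-free sketch. FloatValues is a hypothetical stand-in for Solr's DocValues.
final class DocValuesSketch {
  /** Maps a document id to a float value, which is the core job of DocValues. */
  interface FloatValues {
    float floatVal(int doc);
  }

  /** Mirrors ConstValueSource: the document id is ignored. */
  static FloatValues constant(float c) {
    return doc -> c;
  }

  /** Mirrors a field-backed source: the value is looked up by document id. */
  static FloatValues fromArray(float[] perDocValues) {
    return doc -> perDocValues[doc];
  }

  public static void main(String[] args) {
    FloatValues fixed = constant(1.5f);
    // Made-up per-document values, standing in for an indexed field.
    FloatValues prices = fromArray(new float[] {9.99f, 4.5f, 20f});
    for (int doc = 0; doc < 3; doc++) {
      System.out.println("doc " + doc + ": const=" + fixed.floatVal(doc)
          + " price=" + prices.floatVal(doc));
    }
  }
}
```

The real DocValues also exposes intVal, longVal, doubleVal and strVal, which in ConstValueSource are just conversions of floatVal.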

org.apache.solr.search.function.OrdFieldSource

ord(myfield) returns the ordinal of the indexed field value within the indexed list of terms for that field in lucene index order (lexicographically ordered by unicode value), starting at 1. In other words, for a given field, all values are ordered lexicographically; this function then returns the offset of a particular value in that ordering.

Example SolrQuerySyntax: _val_:"ord(myIndexedField)"

public class OrdFieldSource extends ValueSource {
  protected String field;
 
  public OrdFieldSource(String field) {
    this.field = field;
  }
  public DocValues getValues(Map context, IndexReader reader) throws IOException {
    return new StringIndexDocValues(this, reader, field) {
      protected String toTerm(String readableValue) {
        return readableValue;
      }
 
      public float floatVal(int doc) {
        return (float)order[doc];
      }
 
      public int intVal(int doc) {
        return order[doc];
      }
 
      public long longVal(int doc) {
        return (long)order[doc];
      }
 
      public double doubleVal(int doc) {
        return (double)order[doc];
      }
 
      public String strVal(int doc) {
        // the string value of the ordinal, not the string itself
        return Integer.toString(order[doc]);
      }
 
      public String toString(int doc) {
        return description() + '=' + intVal(doc);
      }
    };
  }
}

OrdFieldSource is almost identical to ConstValueSource. The main differences are that it returns the ordinal rather than a constant value, and that it uses StringIndexDocValues, which provides the ordering of the field's values.
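If the ordinal idea is still fuzzy, the following plain-Java sketch may help. It does not use Lucene or Solr at all: the ordinals method is a hypothetical helper, it assumes every document has exactly one value, and it uses String's natural ordering as an approximation of index order.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Solr-free sketch of what ord() computes; not how Lucene stores it internally.
final class OrdSketch {
  /** For each document, returns the 1-based position of its value among the sorted distinct values. */
  static int[] ordinals(String[] valuePerDoc) {
    // Distinct values in sorted order, playing the role of the field's term list.
    String[] terms = new TreeSet<>(Arrays.asList(valuePerDoc)).toArray(new String[0]);
    int[] order = new int[valuePerDoc.length];
    for (int doc = 0; doc < valuePerDoc.length; doc++) {
      // binarySearch always finds the value because terms was built from the same array.
      order[doc] = Arrays.binarySearch(terms, valuePerDoc[doc]) + 1;
    }
    return order;
  }

  public static void main(String[] args) {
    String[] field = {"pear", "apple", "zebra", "apple"};
    System.out.println(Arrays.toString(ordinals(field))); // prints [2, 1, 3, 1]
  }
}
```

Documents 1 and 3 share the value "apple", so they share ordinal 1. The ordinal depends only on where a value sorts, not on which document holds it.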

Our own ValueSource

We now have a pretty good idea what a ValueSource subclass has to do:

return some value for a given doc id.

This can be based on the value of a field in the index (like OrdFieldSource), or nothing to do with the index at all (like ConstValueSource).

Here's one that does the opposite of MaxFloatFunction/max(): MinFloatFunction/min(), which returns the smaller of the source's value and a given float.

public class MinFloatFunction extends ValueSource {
  protected final ValueSource source;
  protected final float fval;
 
  public MinFloatFunction(ValueSource source, float fval) {
    this.source = source;
    this.fval = fval;
  }
 
  public DocValues getValues(Map context, IndexReader reader) throws IOException {
    final DocValues vals =  source.getValues(context, reader);
    return new DocValues() {
      public float floatVal(int doc) {
        float v = vals.floatVal(doc);
        return v > fval ? fval : v;
      }
      public int intVal(int doc) {
        return (int)floatVal(doc);
      }
      public long longVal(int doc) {
        return (long)floatVal(doc);
      }
      public double doubleVal(int doc) {
        return (double)floatVal(doc);
      }
      public String strVal(int doc) {
        return Float.toString(floatVal(doc));
      }
      public String toString(int doc) {
        return "min(" + vals.toString(doc) + "," + fval + ")";
      }
    };
  }
 
  @Override
  public void createWeight(Map context, Searcher searcher) throws IOException {
    source.createWeight(context, searcher);
  } 
 
// boilerplate methods omitted
}

And the corresponding ValueSourceParser:

public class MinValueSourceParser extends ValueSourceParser {
  public void init(NamedList namedList) {
  }
 
  public ValueSource parse(FunctionQParser fqp) throws ParseException {
    // Read the two arguments of the function in order: a nested value source, then a float.
    ValueSource source = fqp.parseValueSource();
    float val = fqp.parseFloat();
    return new MinFloatFunction(source, val);
  }
}
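To see what the min function does to actual values, here is a small Solr-free sketch of the same wrapping pattern. FloatValues is again a hypothetical stand-in for DocValues, and the popularity array is made-up sample data rather than anything read from an index.

```java
// Solr-free sketch of the wrapping done by MinFloatFunction.
final class MinClampSketch {
  /** Hypothetical stand-in for Solr's DocValues. */
  interface FloatValues {
    float floatVal(int doc);
  }

  /** Wraps another source and caps each of its values at fval. */
  static FloatValues min(FloatValues source, float fval) {
    return doc -> {
      float v = source.floatVal(doc);
      return v > fval ? fval : v;
    };
  }

  public static void main(String[] args) {
    float[] popularity = {3f, 10f, 25f};
    FloatValues capped = min(doc -> popularity[doc], 10f);
    for (int doc = 0; doc < popularity.length; doc++) {
      System.out.println("doc " + doc + " -> " + capped.floatVal(doc));
    }
    // prints:
    // doc 0 -> 3.0
    // doc 1 -> 10.0
    // doc 2 -> 10.0
  }
}
```

The wrapper never touches the underlying data. It only post-processes whatever the inner source returns, which is why the real MinFloatFunction delegates createWeight to its source.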
