Supermind Search Consulting Blog 
Solr - ElasticSearch - Big Data

Linode has terrible customer service

Posted by Kelvin on 01 May 2016 | Tagged as: programming

I recently ran a small webapp on Linode to test response latency from their California datacenter. When I no longer needed the app, I powered down the node, assuming that the hourly billing as advertised on this page: would mean that I wouldn't be charged for a powered-down node. Unfortunately, I didn't […]

Power browsing proggit + HN + + dzone news

Posted by Kelvin on 20 Jan 2016 | Tagged as: programming

Disclaimer: this uses Erudite, a tool I wrote in Django. Here's how I speed-read programming-related news. Open in your browser. Press ` (backtick key) to page the entire row, shift+` to page-prev the entire row. Press 1 2 3 4 to page each respective column. shift+1 to previous page on column 1, shift+2 on […]

Erudite – a text-only, keyboard-friendly news reader

Posted by Kelvin on 12 Jan 2016 | Tagged as: programming

Something I've been working on for a bit: A keyboard-friendly, text-only news reader. Somewhat mobile-friendly. Hit '?' for keyboard shortcuts.

Embed custom JavaScript and HTML in a Kibana 4.x visualization

Posted by Kelvin on 11 Jan 2016 | Tagged as: Lucene / Solr / Elastic Search / Nutch

The embarrassingly simple answer to embedding ANY JavaScript and HTML into a Kibana vis is to hack the markdown_vis plugin so that it skips Markdown rendering and displays the HTML as-is. In src/plugins/markdown_vis/public/markdown_vis_controller.js, comment out $scope.html = $sce.trustAsHtml(marked(html)); and replace it with $scope.html = $sce.trustAsHtml(html); You'll need to recreate the bundles (just install […]

Lucene 5 NRT Example

Posted by Kelvin on 16 Dec 2015 | Tagged as: Lucene / Solr / Elastic Search / Nutch

I just added an NRT search example for Lucene 5.x to Check it out here:

Pain-free Solr replication

Posted by Kelvin on 02 Dec 2015 | Tagged as: Lucene / Solr / Elastic Search / Nutch

Here's a setup I use for totally pain-free Solr replication that lets you switch masters/slaves quickly without messing with config files. Add this to solrconfig.xml:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <str name="maxNumberOfBackups">1</str>
  <lst name="master">
    <str name="enable">${enable.master:false}</str>
    <str name="replicateAfter">startup</str>
    <str name="replicateAfter">commit</str>
    […]

[SOLVED] Frequent disconnects on Ubuntu 12.04 iwlwifi Centrino 2200

Posted by Kelvin on 20 Nov 2015 | Tagged as: Ubuntu

On certain wireless routers, I was getting the dreaded "wlan0: deauthenticating from … by local choice", resulting in constant disconnects (every 30 seconds or less). I tried a whole bunch of options (disabling 11n, disabling hw scanning, etc.) and the only thing that eventually worked was disabling IPv6. sudo gedit /etc/sysctl.conf #Add these lines at […]
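Since the fix hinges on IPv6 actually being off, it can help to confirm the kernel's current state after editing the config. The sketch below is not from the original post: it assumes a Linux system that exposes per-interface IPv6 settings under /proc/sys/net/ipv6/conf, and the function name `ipv6_disabled` and its `conf_root` parameter are made up for illustration. Which lines the post adds to /etc/sysctl.conf is not shown in this excerpt, so the script only reads state and changes nothing.

```python
from pathlib import Path

# Standard Linux procfs location for per-interface IPv6 settings (assumed present).
PROC_IPV6_CONF = Path("/proc/sys/net/ipv6/conf")


def ipv6_disabled(interface="all", conf_root=PROC_IPV6_CONF):
    """Return True or False for the interface's disable_ipv6 flag, or None if unreadable."""
    flag = Path(conf_root) / interface / "disable_ipv6"
    try:
        # The kernel reports "1" when IPv6 is disabled for this entry, "0" otherwise.
        return flag.read_text().strip() == "1"
    except OSError:
        # Missing interface, non-Linux system, or IPv6 support not loaded.
        return None


if __name__ == "__main__":
    # "wlan0" matches the interface named in the log message above.
    for name in ("all", "default", "wlan0"):
        print(name, ipv6_disabled(name))
```

A result of None means the flag could not be read at all, which is different from IPv6 being enabled.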

Monier-Williams Sanskrit-English-IAST search engine

Posted by Kelvin on 17 Sep 2015 | Tagged as: Lucene / Solr / Elastic Search / Nutch, programming, Python

I just launched a search application for the Monier-Williams dictionary, which is the definitive Sanskrit-English dictionary. See it in action here: The app is built in Python and uses the Whoosh search engine. I chose Whoosh instead of Solr or ElasticSearch because I wanted to try building a search app which didn't depend on […]

An HTML5 ElasticSearch Query DSL Builder

Posted by Kelvin on 16 Sep 2015 | Tagged as: Lucene / Solr / Elastic Search / Nutch, programming

TL;DR: I parsed the ElasticSearch source and generated an HTML app that allows you to build ElasticSearch queries using its JSON Query DSL. You can see it in action here: I really like ElasticSearch's JSON-based Query DSL – it lets you create fairly complex search queries in a relatively painless fashion. I do not, […]
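To give a feel for what the JSON Query DSL looks like, here is a minimal sketch in Python that assembles a bool query from plain dictionaries and serializes it to JSON. It is an illustration only, not the builder app described above: the helper names (`match`, `term`, `range_`, `bool_query`) and the fields (title, tags, year) are hypothetical, while the clause types themselves (bool, match, term, range) are part of the Query DSL.

```python
import json


def match(field, text):
    """Full-text match clause on a single field."""
    return {"match": {field: text}}


def term(field, value):
    """Exact-value term clause."""
    return {"term": {field: value}}


def range_(field, **bounds):
    """Range clause; bounds are keys such as gte, gt, lte, lt."""
    return {"range": {field: bounds}}


def bool_query(must=(), should=(), must_not=()):
    """Combine clauses into a bool query, omitting empty sections."""
    sections = {"must": must, "should": should, "must_not": must_not}
    return {"bool": {name: list(clauses) for name, clauses in sections.items() if clauses}}


# Hypothetical fields: title, tags, year.
query = {
    "query": bool_query(
        must=[match("title", "lucene")],
        should=[term("tags", "solr"), term("tags", "elasticsearch")],
        must_not=[range_("year", lt=2014)],
    )
}

# This JSON document is what would be sent as the body of a search request.
print(json.dumps(query, indent=2))
```

Nesting one `bool_query` inside another's `must` or `should` list is how more complex queries are composed, which is also where hand-writing the JSON starts to get error-prone.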

Properly unit testing scrapy spiders

Posted by Kelvin on 20 Nov 2014 | Tagged as: crawling, Python

Scrapy, being based on Twisted, introduces an incredible host of obstacles to easily and efficiently writing self-contained unit tests: 1. You can't call multiple times 2. You can't stop the reactor multiple times, so you can't blindly call "crawler.signals.connect(reactor.stop, signal=signals.spider_closed)" 3. Reactor runs in its own thread, so your failed assertions won't make it […]
