Supermind Search Consulting Blog 
Solr - Elasticsearch - Big Data

Posts about Ubuntu

Introducing Bash-whacking

Posted by Kelvin on 04 Dec 2010 | Tagged as: Ubuntu

This entry is part 1 of 19 in the Bash-whacking series

Starting a new blog series on bash scripts you shouldn't live without.

Installation for scripts is simple: either add them to /usr/bin or add ~/bin to your bash path and place your scripts there. Don't forget to make the scripts executable!

Here's a complete example:

Suppose you wanted to create a script called foobar and make it accessible from anywhere.

First create ~/bin, then add it to your bash path.

mkdir ~/bin
gedit ~/.bash_profile

Add this to the end of the file:

export PATH=$PATH:~/bin

Now start a new login shell (for example with bash -l) or log out and log back in for your settings to take effect; bash only reads ~/.bash_profile at login.

Then, go ahead and create the script:

gedit ~/bin/foobar

Paste this in:

#!/bin/bash
echo "Foobar"

Save and exit, then make the script executable.

chmod +x ~/bin/foobar
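
If you'd rather skip the editor, the steps so far can be scripted. This is only a minimal sketch, not part of the original recipe: install_foobar and its two optional arguments are made-up names, and it assumes ~/.bash_profile is the file your login shell reads.

```shell
# Sketch: install a tiny script into a personal bin directory.
# install_foobar, BIN_DIR and PROFILE are illustrative names.
install_foobar() {
    BIN_DIR="${1:-$HOME/bin}"
    PROFILE="${2:-$HOME/.bash_profile}"

    mkdir -p "$BIN_DIR"

    # Append the PATH line only if it is not already in the profile.
    line="export PATH=\$PATH:$BIN_DIR"
    grep -qxF "$line" "$PROFILE" 2>/dev/null || echo "$line" >> "$PROFILE"

    # Write the script and make it executable.
    printf '#!/bin/bash\necho "Foobar"\n' > "$BIN_DIR/foobar"
    chmod +x "$BIN_DIR/foobar"
}
```

Call install_foobar with no arguments to use ~/bin and ~/.bash_profile. Running it twice is harmless, since the PATH line is only appended once.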

Now from any directory, you can type foobar and get this back:

Foobar


Recursive directory listing sorted by file size

Posted by Kelvin on 03 Dec 2010 | Tagged as: Ubuntu

This entry is part 7 of 19 in the Bash-whacking series

Ever wanted to list a directory sorted by decreasing file size? Useful for finding large files.

du -k * | sort -nr | cut -f2 | xargs -d '\n' du -sh

Courtesy of
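
Reading the one-liner left to right: du -k prints each size in kilobytes, sort -nr puts the biggest first, cut -f2 keeps only the path, and the final du -sh re-measures each path in human-readable units. Here's the same pipeline wrapped in a function, as a rough sketch. The name biggest and its two arguments are my own invention, and like the original it assumes GNU du and xargs.

```shell
# Sketch: list entries under a directory by decreasing size.
# biggest, dir and count are illustrative names, not from the post.
biggest() {
    dir="${1:-.}"
    count="${2:-10}"
    du -k -- "$dir"/* |          # size in KB, a tab, then the path
        sort -nr |               # biggest first
        head -n "$count" |       # keep only the top entries
        cut -f2 |                # drop the numeric size
        xargs -d '\n' du -sh --  # re-measure in human-readable units
}
```

For example, biggest ~/Downloads 5 would show the five largest entries under ~/Downloads.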

Resume cancelled/crashed downloads in Google Chrome on Ubuntu/OSX

Posted by Kelvin on 28 Oct 2010 | Tagged as: Ubuntu

If you've ever found yourself downloading a large file in Google Chrome (like the 700MB Ubuntu distro) and had either Chrome or the OS crash, read on to find out how to recover and resume the download using wget.

First, go to the folder where Chrome saves your downloads (for me, it's ~/Downloads) and list all files with the .crdownload extension.

cd ~/Downloads
ls *.crdownload

In my case, I was trying to download a Slax distro, so ls reported:

slax-6.1.2.iso.crdownload

Now rename the file, removing the .crdownload extension.

mv slax-6.1.2.iso.crdownload slax-6.1.2.iso

Then get wget to continue the download for you! Run it from the same folder and give it the original download URL after --continue:

wget --continue



This works on Windows too! Just rename the file in Windows Explorer and use wget for Windows.
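
The rename-and-resume steps can be rolled into one small function. This is a sketch under a few assumptions: resume_crdownload and the WGET override variable are names I made up, and it assumes the last part of the URL matches the renamed file, since that is the name wget writes to.

```shell
# Sketch: rename a Chrome partial download and let wget resume it.
# resume_crdownload and the WGET override are illustrative names.
resume_crdownload() {
    partial="$1"   # e.g. ~/Downloads/slax-6.1.2.iso.crdownload
    url="$2"       # the original download URL
    target="${partial%.crdownload}"

    [ -f "$partial" ] || { echo "no such file: $partial" >&2; return 1; }
    mv -- "$partial" "$target" || return 1

    # wget names its output after the last part of the URL, so run it
    # in the folder that holds the renamed file.
    ( cd "$(dirname -- "$target")" && "${WGET:-wget}" --continue "$url" )
}
```

Call it with the .crdownload file as the first argument and the original download URL as the second.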

How to wipe your hard drive securely

Posted by Kelvin on 26 Oct 2010 | Tagged as: Ubuntu

Give yourself a pat on the back if you're even thinking of doing this. It's kinda scary what people put on their hard drives, then casually dispose of them without first wiping the data clean.

Some really interesting comments from shred.c, written by Colin Plumb:

* Do a more secure overwrite of given files or devices, to make it harder
* for even very expensive hardware probing to recover the data.
* Although this process is also known as "wiping", I prefer the longer
* name both because I think it is more evocative of what is happening and
* because a longer name conveys a more appropriate sense of deliberateness.
* For the theory behind this, see "Secure Deletion of Data from Magnetic
* and Solid-State Memory", on line at
* Just for the record, reversing one or two passes of disk overwrite
* is not terribly difficult with hardware help. Hook up a good-quality
* digitizing oscilloscope to the output of the head preamplifier and copy
* the high-res digitized data to a computer for some off-line analysis.
* Read the "current" data and average all the pulses together to get an
* "average" pulse on the disk. Subtract this average pulse from all of
* the actual pulses and you can clearly see the "echo" of the previous
* data on the disk.
* Real hard drives have to balance the cost of the media, the head,
* and the read circuitry. They use better-quality media than absolutely
* necessary to limit the cost of the read circuitry. By throwing that
* assumption out, and the assumption that you want the data processed
* as fast as the hard drive can spin, you can do better.

Anyway, for wiping, you have 2 good free options really:

1. boot into Knoppix, Ubuntu LiveCD or some other LiveCD distro and run shred
2. DBAN! (Darik's Boot and Nuke)


Get DBAN here, burn the ISO and boot into it.

Follow the instructions, and you now have a securely wiped disk. Congratulations!


If you run into this error:

ISOLINUX 4.00 4.00pre46 ETCD Copyright (C) 1994-2010 H. Peter Anvin et al
reading sectors error(EDD)
ERROR: No configuration file found

then you've most likely downloaded DBAN 2.2.6. Try DBAN 1.0.7 instead; many users have reported success with it.


The manpage for shred says:

shred – overwrite a file to hide its contents, and optionally delete it

shred [OPTIONS] FILE […]

Overwrite the specified FILE(s) repeatedly, in order to make it harder
for even very expensive hardware probing to recover the data.

To run shred, boot into a LiveCD distro (Knoppix, Ubuntu, etc), open a shell and run this:

shred -vfz -n 100 /dev/hda

Here shred makes 100 overwrite passes (-n 100) over the entire hard disk, then adds one final pass of zeros (-z) to hide the shredding. The -f flag forces the write by changing permissions wherever necessary, and -v shows progress.
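
Before pointing shred at a real disk, it can help to watch the same flags work on a throwaway file. A minimal sketch, assuming GNU shred; shred_demo is a made-up name, and it uses 3 passes instead of 100 to keep things quick.

```shell
# Sketch: try shred's flags on a scratch file, not a real disk.
# shred_demo is an illustrative name, not from the post.
shred_demo() {
    scratch=$(mktemp)
    echo "secret data" > "$scratch"

    # -n 3: three overwrite passes (the post uses 100)
    # -z:   one extra, final pass of zeros
    # -v:   show progress (printed on stderr)
    shred -v -n 3 -z "$scratch"

    # Count the non-zero bytes left in the file.
    tr -d '\0' < "$scratch" | wc -c

    rm -f "$scratch"
}
```

Running shred_demo prints shred's progress followed by a count of 0, because only zero bytes are left after the -z pass.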

You may need to substitute your hard drive's device name for /dev/hda. It'll be something like /dev/hda or /dev/hdb or /dev/sda etc.

If you're not sure, list your disks with fdisk and look for lines like the second one here:

sudo fdisk -l
Disk /dev/sda: 500.1 GB, 500107862016 bytes

GimpShop – a saner interface for Gimp on Ubuntu

Posted by Kelvin on 10 Oct 2010 | Tagged as: Ubuntu

Gimp's interface sucks. GimpShop offers an interface which should be familiar to Photoshop users, but still uses Gimp in the background.

Installation instructions here

[SOLVED] Howto build the PHP rrdtool extension

Posted by Kelvin on 09 Oct 2010 | Tagged as: programming, Ubuntu, PHP

The definitive answer is here:

If you're on Ubuntu, do this first:

sudo apt-get install rrdtool librrd-dev php5-dev

Then follow the steps above.

[SOLVED] Ubuntu 10.04 Lucid and VMWare Workstation 6.5.4

Posted by Kelvin on 26 May 2010 | Tagged as: programming, Ubuntu

After following a number of unsuccessful links, the definitive solution for installing VMWare Workstation 6.5.x on Ubuntu 10.04:

[SOLVED] Set cd-rom speed on Ubuntu

Posted by Kelvin on 03 May 2010 | Tagged as: programming, Ubuntu

My cdrom is a speedy 48x drive. Unfortunately when it revs up, it's often rather loud. Here's how to lower the speed on Ubuntu.

sudo apt-get install setcd
setcd -x 1

Change -x 1 to -x [some number] where the higher the number, the faster the drive.

HOWTO: Persistent DNS Caching on Ubuntu with pdnsd

Posted by Kelvin on 27 Apr 2010 | Tagged as: programming, Ubuntu

sudo apt-get install pdnsd

If prompted, choose "Manual".

sudo gedit /etc/pdnsd.conf

Copy and paste this into the editor.

// Read the pdnsd.conf(5) manpage for an explanation of the options.

/* Note: this file is overridden by automatic config files when
   /etc/default/pdnsd AUTO_MODE is set and that
   /usr/share/pdnsd/pdnsd-$AUTO_MODE.conf exists
 */

global {
	server_ip =;  // Use eth0 here if you want to allow other
				// machines on your network to query pdnsd.
	status_ctl = on;
//	query_method=tcp_udp;	// pdnsd must be compiled with tcp
				// query support for this to work.
	min_ttl=96h;       // Retain cached entries at least 96 hours.
	max_ttl=2w;	   // Two weeks.
	timeout=10;        // Global timeout option (10 seconds).
        // Don't enable if you don't recurse yourself, can lead to problems
        // delegation_only="com","net";
}

server {
	label="OpenDNS Plus";
	timeout = 5;
	uptest = query;
	interval = 30m;      // Test every half hour.
	ping_timeout = 300;  // 30 seconds.
	purge_cache = off;
	exclude = .localdomain;
	policy = included;
	preset = off;
}

source {
//	serve_aliases=on;
}

rr {
}

Now edit /etc/default/pdnsd

sudo gedit /etc/default/pdnsd





This disables AUTO_MODE and gets pdnsd to use our /etc/pdnsd.conf file.

Now edit the dhclient.conf file.

sudo gedit /etc/dhcp3/dhclient.conf


Find this line:

#prepend domain-name-servers;

and change it to this:

prepend domain-name-servers;

(delete the # from the start of the line). Save and exit.

sudo /etc/init.d/pdnsd restart

Test out the DNS cache like so


Check that the SERVER line shows This means you’re pointed at your local cache.

Now, if you run that command again:


You should see something like Query time: 0 msec.
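
To compare the two runs without scrolling through the full output, you can filter out just the lines of interest. This sketch assumes the lookup tool is dig, whose output has SERVER and Query time lines like the ones described above. dns_check is a made-up name, and example.com below is only a placeholder host.

```shell
# Sketch: pull the server and query time out of dig's output.
# dns_check is an illustrative name, not from the post.
dns_check() {
    awk '/^;; SERVER:/     { print "server: " $3 }
         /^;; Query time:/ { print "query time: " $4 " " $5 }'
}
```

Use it as dig example.com | dns_check, and run it twice: the second query time should drop once the answer is served from the cache.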

Mapping neighborhoods to street addresses via geocoding

Posted by Kelvin on 19 Apr 2010 | Tagged as: Ubuntu, programming, Lucene / Solr / Elasticsearch / Nutch

As far as I know, none of the geocoders consistently provide neighborhood data given a street address. Useful information when consulting the gods at Google proves elusive too.

Here's a step-by-step guide to obtaining neighborhood names for your street addresses (on Ubuntu).

0. Geocode your addresses if necessary using Yahoo, MapQuest or Google geocoders. (this means converting addresses into latitude and longitude).

1. Install PostGIS.

sudo apt-get install postgresql-8.3-postgis

2. Complete the postgis install

sudo -u postgres createdb mydb
sudo -u postgres createlang plpgsql mydb
cd /usr/share/postgresql-8.3-postgis/
sudo -u postgres psql -d mydb -f lwpostgis.sql
sudo -u postgres psql -d mydb -f spatial_ref_sys.sql

3. Download and import Zillow neighborhood data. For this example, we'll be using California data.

cd /tmp
shp2pgsql ZillowNeighborhoods-CA public.neighborhood > ca.sql
sudo -u postgres psql -d mydb -f ca.sql

4. Connect to psql and run a query.

sudo -u postgres psql -d mydb
select name,city from public.neighborhood where ST_Within(makepoint(-122.4773980,37.7871760), the_geom)=true ;

If you've done everything right, this should be returned from the SQL:

      name      |     city
----------------+---------------
 Inner Richmond | San Francisco
(1 row)
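
If you have more than a handful of points, typing the query by hand gets old fast. Here's a small sketch that builds the same query for any longitude/latitude pair. neighborhood_sql is a name I made up, and the input check is deliberately crude.

```shell
# Sketch: build the neighborhood lookup query for one point.
# neighborhood_sql is an illustrative name, not from the post.
neighborhood_sql() {
    lon="$1"
    lat="$2"
    # Allow only digits, dots and minus signs, so nothing else
    # can end up inside the SQL statement.
    for n in "$lon" "$lat"; do
        case "$n" in
            ''|*[!0-9.-]*)
                echo "not a number: $n" >&2
                return 1
                ;;
        esac
    done
    printf 'select name,city from public.neighborhood where ST_Within(makepoint(%s,%s), the_geom)=true;\n' "$lon" "$lat"
}
```

Pipe the result into psql to run it, for example: neighborhood_sql -122.4773980 37.7871760 | sudo -u postgres psql -d mydb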

