Supermind Search Consulting Blog 
Solr - Elasticsearch - Big Data

Posts about Ubuntu

Using sed to delete lines from a file

Posted by Kelvin on 21 May 2011 | Tagged as: Ubuntu

This entry is part 14 of 19 in the Bash-whacking series

Delete line containing foo

sed -i '/foo/d' filename.txt

Delete last line

sed -i '$d' filename.txt
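
A quick way to see what each command does before running it on a real file. This is a minimal sketch, assuming GNU sed (the default on Ubuntu) and a throwaway temp file with made-up contents:

```shell
# Work on a scratch file so nothing real gets modified.
demo=$(mktemp)
printf 'alpha\nfoo bar\nbeta\nlast\n' > "$demo"

sed -i '/foo/d' "$demo"   # removes the line containing "foo"
sed -i '$d' "$demo"       # removes the (new) last line, "last"

result=$(cat "$demo")
echo "$result"
rm -f "$demo"
```

Note that -i edits the file in place, so try the pattern without -i first if you're unsure what it will match.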

Recursively find the n latest modified files in a directory

Posted by Kelvin on 18 May 2011 | Tagged as: Ubuntu, programming

This entry is part 13 of 19 in the Bash-whacking series

Here's how to find the latest modified files in a directory. Particularly useful when you've made some changes and can't remember what!

find . -type f -printf '%T@ %p\n' | sort -n | tail -1 | cut -f2- -d" "

For example, replace tail -1 with tail -20 to list the 20 most recently modified files.
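
If you use this often, the pipeline can be wrapped in a small function. A minimal sketch, assuming GNU find and GNU touch (as on Ubuntu); the function name latest and the scratch files are made up for illustration:

```shell
# latest [n] [dir]: list the n most recently modified files under dir.
# Hypothetical helper; relies on GNU find's -printf.
latest() {
    find "${2:-.}" -type f -printf '%T@ %p\n' | sort -n | tail -n "${1:-1}" | cut -d' ' -f2-
}

# Demo in a scratch directory with two files of known age.
scratch=$(mktemp -d)
touch -d '2011-05-01 12:00' "$scratch/old.txt"
touch -d '2011-05-18 12:00' "$scratch/new.txt"
newest=$(latest 1 "$scratch")
echo "$newest"
rm -rf "$scratch"
```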

The find one-liner is courtesy of StackOverflow.

Convert fixed-width file to CSV

Posted by Kelvin on 12 May 2011 | Tagged as: programming, Ubuntu

This entry is part 12 of 19 in the Bash-whacking series

After trying various sed/awk recipes to convert from fixed-width to CSV, I found a Python script that works well.

Here it is, from

## {{{ (r1)
# Ian Maurer 
# Convert a Fixed Width file to a CSV with Headers
# Requires following format:
# header1      header2 header3
# ------------ ------- ----------------
# data_a1      data_a2 data_a3
def writerow(ofile, row):
    # Quote every field, dropping any embedded double quotes.
    for i in range(len(row)):
        row[i] = '"' + row[i].replace('"', '') + '"'
    data = ",".join(row)
    ofile.write(data)
    ofile.write("\n")

def convert(ifile, ofile):
    # Skip leading blank lines to find the header row.
    header = ifile.readline().strip()
    while not header:
        header = ifile.readline().strip()
    # The row of dashes under the header defines each column's width.
    hticks = ifile.readline().strip()
    csizes = [len(cticks) for cticks in hticks.split()]
    line = header
    while line:
        start, row = 0, []
        for csize in csizes:
            column = line[start:start+csize].strip()
            row.append(column)
            start = start + csize + 1
        writerow(ofile, row)
        line = ifile.readline().strip()

if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        ifile = open(sys.argv[1], "r")
        ofile = open(sys.argv[2], "w+")
        convert(ifile, ofile)
        ifile.close()
        ofile.close()
    else:
        print("Usage: python %s <input> <output>" % sys.argv[0])
## end of }}}

MD5 a directory recursively

Posted by Kelvin on 05 May 2011 | Tagged as: Ubuntu

This entry is part 11 of 19 in the Bash-whacking series

Ever need to check if a directory is exactly the same as another (including file contents)?

find . -type f -exec md5sum {} + | awk '{print $1}' | sort | md5sum

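This runs md5sum over the sorted list of per-file md5sum hashes, so two directories with identical file contents produce the same final hash. Note that awk keeps only the hash column, so file names and paths are not compared.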

And if you need to exclude a directory from the comparison:

find . -type f -exec md5sum {} + | grep -v dirtoexclude | awk '{print $1}' | sort | md5sum
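
To actually compare two directories, run the pipeline in each and compare the results. A minimal sketch, assuming GNU coreutils; the helper name dirhash and the scratch directories are made up for the demo:

```shell
# dirhash DIR: print one hash summarising the contents of every file in DIR.
# Hypothetical helper name; the pipeline is the one shown above.
dirhash() {
    ( cd "$1" && find . -type f -exec md5sum {} + | awk '{print $1}' | sort | md5sum | awk '{print $1}' )
}

a=$(mktemp -d); b=$(mktemp -d)
echo hello > "$a/one.txt"; echo world > "$a/two.txt"
cp "$a"/*.txt "$b"/

hash_a=$(dirhash "$a"); hash_b=$(dirhash "$b")
if [ "$hash_a" = "$hash_b" ]; then same_before=yes; else same_before=no; fi

# Change one file in the copy and hash it again.
echo changed > "$b/two.txt"
if [ "$hash_a" = "$(dirhash "$b")" ]; then same_after=yes; else same_after=no; fi

echo "identical copies match: $same_before, after editing a file: $same_after"
rm -rf "$a" "$b"
```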

[solved] checking for shout-config… no while compiling ices

Posted by Kelvin on 05 Apr 2011 | Tagged as: Ubuntu

If you're trying to compile ices and get this error:

checking for pkg-config... /usr/bin/pkg-config 
configure: /usr/bin/pkg-config couldn't find libshout. Try adjusting PKG_CONFIG_PATH. 
checking for shout-config... no 
configure: error: Could not find a usable libshout

And you swear you've already installed libshout and libshout-devel, then you need to install libtheora and libtheora-devel. Yes, the error message is misleading.

Improving EasyHotSpot usability

Posted by Kelvin on 03 Apr 2011 | Tagged as: Ubuntu

This entry is part 5 of 5 in the Ubuntu Hotspot with Daily Per-User Quotas series

Here are some changes I made to make EasyHotSpot more usable. If you are interested in any of these changes, just drop me an email and I'll send them to you.

1. Allow voucher generation to accept usernames instead of just numberofvouchers
2. Integration of SquidGuard for blocking ads, trackers, etc

In the future, I also hope to add:

1. Allowing users to change their own password
2. Informing users of their quota status (how many MBs they have left)
3. MAC address passthrough for authorized users

Modifying EasyHotSpot 0.2 for per-user daily bandwidth quotas

Posted by Kelvin on 03 Apr 2011 | Tagged as: Ubuntu

This entry is part 4 of 5 in the Ubuntu Hotspot with Daily Per-User Quotas series

First of all, I'll say this – if it's at all possible to install the Ubuntu distro of EasyHotSpot (available from the EasyHotSpot download page), do so!

I couldn't because I couldn't get Ubuntu 10.04 installed on my antique laptop which I was going to use as the internet gateway. Only Ubuntu 10.10 worked. I therefore had to install Chillispot, FreeRadius etc and configure them separately, which was a real pain.

Secondly, if you read the documentation for EasyHotSpot, nowhere is there any mention of support for per-user bandwidth quotas. 🙂

Well, my realization was that, since all the accounting is handled in MySQL, all I needed to do to simulate daily quotas was to set up a cron job that runs daily at midnight and clears out the tables containing the bandwidth usage data. I'll be providing that script later on…

So, I'm assuming you've already read through the EasyHotSpot 0.2 PDF user guide and followed its instructions. Here are my comments on the installation process.

DNS in /etc/chilli.conf matters

I found out that my clients couldn't get an IP address when the DNS servers in /etc/chilli.conf weren't accurate.

uamallowed in /etc/chilli.conf

If your clients are getting an IP address but are not being redirected to the captive portal login page, check that the relevant subnets have been added to uamallowed. To be safe, I added both the tunnel and LAN subnets.
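
For illustration, a uamallowed line covering two subnets might look like the following. Both ranges are placeholders assumed for the example (192.168.182.0/24 standing in for the tunnel and 192.168.1.0/24 for the LAN); use the ranges that match your own setup.

```
# /etc/chilli.conf (excerpt)
# Placeholder subnets: substitute your own tunnel and LAN ranges.
uamallowed 192.168.182.0/24,192.168.1.0/24
```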


dnsmasq is also a DHCP server!

I think I may have had to disable the DHCP servers on the interfaces that Chillispot was running on.

In /etc/dnsmasq.conf, add this:


sqlcounter max_all_mb patch

The EasyHotSpot docs tell you to add this to /etc/freeradius/sql/mysql/counter.conf:

sqlcounter max_all_mb {
counter-name = Max-All-MB
check-name = Max-All-MB
reply-name = ChilliSpot-Max-Total-Octets
sqlmod-inst = sql
key = User-Name
reset = never
query = "SELECT SUM(AcctInputOctets)/(1024*1024) + SUM(AcctOutputOctets)/ (1024*1024) FROM radacct WHERE UserName='%{%k}'"
}

This, however, somehow didn't work for me. I got some "2043939944 is not an octet" error in FreeRadius. The problem was that the query converts acctinputoctets and acctoutputoctets to megabytes, but FreeRadius was expecting bytes. This is what I changed it to:

sqlcounter max_all_mb {
counter-name = Max-All-MB
check-name = Max-All-MB
reply-name = ChilliSpot-Max-Total-Octets
sqlmod-inst = sql
key = User-Name
reset = never
#query = "SELECT SUM(AcctInputOctets)/(1024*1024) + SUM(AcctOutputOctets)/ (1024*1024) FROM radacct WHERE UserName='%{%k}'"
query = "SELECT SUM(AcctInputOctets) + SUM(AcctOutputOctets) FROM radacct WHERE UserName='%{%k}'"
}
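
The two queries differ only by the factor of 1024*1024 that turns octets into megabytes. As a rough feel for the units involved, here is the arithmetic in shell; the 10 MB figure is just an example quota, and the large number is the one from the error message:

```shell
# 1 MB as used in the original query: 1024*1024 octets.
MB=$((1024 * 1024))

# An example 10 MB quota expressed in octets.
quota_octets=$((10 * MB))

# The value from the error message, converted back to whole megabytes.
error_mb=$((2043939944 / MB))

echo "10 MB = $quota_octets octets"
echo "2043939944 octets is about $error_mb MB"
```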

/etc/freeradius/sites-available/default changes

In addition to changes recommended by EasyHotSpot, I had to comment out the following:

– chap authentication

	#  The chap module will set 'Auth-Type := CHAP' if we are
	#  handling a CHAP request and Auth-Type has not already been set
#	chap

– radutmp session database
FreeRadius was giving me a stupid "Access Denied – this user is already logged-in" error. Commenting out radutmp fixed this.

#  or rlm_sql module can handle this.
#  The rlm_sql module is *much* faster
session {
#	radutmp


If you've logged in successfully but can't get access to the internet, you most likely need to add iptables forwarding rules. Here are mine.

# squid
$IPTABLES -t nat -A PREROUTING -i tun0 -p tcp --dport 80 -j REDIRECT --to 3128
$IPTABLES -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to 3128
$IPTABLES -I FORWARD 1 -i tun0 -p tcp --dport 443 -m conntrack --ctstate NEW -j LOG --log-prefix HOTSPOT:
$IPTABLES -I FORWARD 1 -i eth0 -p tcp --dport 443 -m conntrack --ctstate NEW -j LOG --log-prefix HOTSPOT:
#Enable NAT on output device

Transparent Squid3 proxy

If you set this up correctly, users won't have to change their browser settings to use your squid proxy. The proxying will be "transparent" to them.

Here's my squid.conf.
Note the use of url_rewrite to squidguard. You can comment that out if you don't need it.

http_port 3128 transparent
hierarchy_stoplist cgi-bin ?
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
acl apache rep_header Server ^Apache
access_log /var/log/squid3/access.log squid
hosts_file /etc/hosts
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern . 0 20% 4320
# newer Squid's don't need "all", it's built in:
#acl all src all
url_rewrite_program /usr/bin/squidGuard -c /etc/squid/squidGuard.conf
# 10000MB max cache size (default is 100MB):
cache_dir ufs /var/spool/squid3 10000 16 256
http_access allow all
http_reply_access allow all
icp_access allow all
always_direct allow all
coredump_dir /var/spool/squid3

Daily quota reset with shell script

Here's the shell script I'm using to reset quotas. Additionally, I'm saving the bandwidth usage to a separate table (radacct_totals) so I have a historical record.

#turn off free radius
/etc/init.d/freeradius stop
#update radacct_totals
echo 'INSERT IGNORE INTO radacct_totals(username, upload, download, DATE) SELECT username, SUM(acctinputoctets) AS uploads, SUM(acctoutputoctets) AS downloads, DATE(now()) FROM radacct GROUP BY username' | mysql -u rad radius -B
#truncate radacct
echo 'truncate radacct' | mysql -u rad radius -B 
#start free radius
/etc/init.d/freeradius start

I don't have the schema DDL of radacct_totals handy, but as you can see, it's a pretty simple table, with a compound primary key on username and date.
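
Since the original DDL isn't available, here is a guess at what such a table could look like. The column names come from the INSERT statement in the script; the types, sizes and the IF NOT EXISTS guard are assumptions, not the actual schema.

```shell
# Assumed schema: column names are from the INSERT in the reset script,
# the types and sizes are guesses.
ddl_file=$(mktemp)
cat > "$ddl_file" <<'SQL'
CREATE TABLE IF NOT EXISTS radacct_totals (
    username VARCHAR(64) NOT NULL,
    upload   BIGINT UNSIGNED NOT NULL DEFAULT 0,
    download BIGINT UNSIGNED NOT NULL DEFAULT 0,
    date     DATE NOT NULL,
    PRIMARY KEY (username, date)
);
SQL
cat "$ddl_file"
# To apply it, with the same credentials the reset script uses:
#   mysql -u rad radius < "$ddl_file"
```

The compound primary key is what lets INSERT IGNORE skip a user who already has a row for that day.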

You'll need to add this as a cronjob, daily at 12 midnight.
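
A midnight cron entry uses 0 for both the minute and hour fields. A minimal sketch, assuming the reset script was saved at a hypothetical path of /usr/local/bin/reset-quotas.sh:

```shell
# Hypothetical path; point this at wherever you saved the reset script.
RESET_SCRIPT=/usr/local/bin/reset-quotas.sh

# Fields: minute hour day-of-month month day-of-week command
CRON_LINE="0 0 * * * $RESET_SCRIPT"
echo "$CRON_LINE"

# To install it for the current user (appends to the existing crontab):
#   ( crontab -l 2>/dev/null; echo "$CRON_LINE" ) | crontab -
```

Because the script stops and starts FreeRadius, it most likely needs to go into root's crontab.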

EasyHotSpot – what it is and isn't

Posted by Kelvin on 03 Apr 2011 | Tagged as: Ubuntu

This entry is part 3 of 5 in the Ubuntu Hotspot with Daily Per-User Quotas series

EasyHotSpot is an open-source hotspot solution built in PHP/MySQL on the CodeIgniter framework. It integrates with Chillispot and FreeRadius to provide a captive portal solution.

Per-user bandwidth quotas are provided by way of "vouchers". A voucher is something you generate according to a "plan", e.g. a 10MB voucher which expires 30 days from first use. You can set up time-based quotas if that's more relevant to you.

You can also set up "pre-paid" accounts from which invoices are generated.

The downsides of using EasyHotSpot are:

– relatively immature software (it's at version 0.2, and does have a number of bugs)
– doesn't handle daily quotas out-of-box
– really difficult to configure if you try to install it from an existing Ubuntu installation (fortunately there's an Ubuntu distro which bundles EasyHotSpot)
– not for the faint of heart, and you'll need significant linux chops to pull it off

Having said that, a lot of these downsides also apply to most Chillispot-based implementations.

Basic Network Architecture

Here's the bare minimum you'll need to get an EasyHotSpot solution up and running:

1. Internet connection (say, cable modem or satellite modem)
2. Linux computer with 2 network cards (one of them can be wireless). I chose to purchase an Asix USB ethernet card.
3. Ubuntu 10.04 or greater

Sequence of events in an EasyHotSpot implementation

Sequence of events of a client login

  1. Client connects to wireless/wired network, requesting an IP address using DHCP
  2. Chillispot (which provides DHCP services) grants them an IP address via a "tunnel" it sets up
  3. Client requests a URL, say
  4. Chillispot checks the client's MAC address to see if they're authenticated. If they're not, they get redirected to a login page
  5. Client logs into Chillispot
  6. Chillispot checks with FreeRadius to see if the user's credentials are accepted (and if they've exceeded their quotas)
  7. FreeRadius responds either negatively (wrong username/password or quota exceeded), or positively (accepted, starting time/bandwidth logging now)
  8. If the response is positive, Chillispot redirects the user to a "Thank You, you are now connected to the internet" page.

Sequence of events AFTER a client login

After a client has obtained an IP address, here's what needs to happen to successfully serve an HTTP request.

  1. Client attempts to resolve DNS, say of
  2. Dnsmasq, a caching DNS server, responds with IP address of the domain
  3. Client makes HTTP request
  4. iptables-based NAT routing intercepts the request and forwards it to the Squid proxy transparently
  5. Response is returned to the client

Components of EasyHotSpot

I'm now going to try to give you an uber-high-level view of what the different moving parts are (and there are a number of them indeed). Hopefully this will give you a conceptual map to successfully navigate an EasyHotSpot installation.


Chillispot

Chillispot is a Linux package that provides the following:
– integration with FreeRadius
– captive portal
– network bridge between "in" (LAN) and "out" (WAN) interfaces

FreeRadius

Performs the user authentication and bandwidth/time accounting.

Dnsmasq

Caching DNS server

Squid

Caching proxy server

MySQL

Used by EasyHotSpot to store user information. Used by FreeRadius to store user credentials and accounting information.

EasyHotSpot proper

EasyHotSpot itself is a PHP web application which provides a web interface to managing users, vouchers, etc. It then writes to a MySQL database which FreeRadius uses.

The interface between EasyHotSpot and the rest of the system, therefore, is the MySQL database. EasyHotSpot is also the piece of the puzzle you install LAST. By the time you get to the EasyHotSpot bit of the installation, most of the hard work is done.

And next…

In my next post, I'll talk about what I did to get EasyHotSpot working just the way I needed it to.

Features for the Captive Portal

Posted by Kelvin on 03 Apr 2011 | Tagged as: Ubuntu

This entry is part 2 of 5 in the Ubuntu Hotspot with Daily Per-User Quotas series

In a previous post, I talked about my quest for a captive portal which supports per-user download quotas (and also explained what the heck a captive portal is).

Here's a list of features which I was looking for in the Captive Portal:

  1. Free!
  2. Web-based administration
  3. Users need to log in before having access to the internet
  4. Volume-based accounting – as opposed to time-based accounting. User quotas are determined by the amount of bandwidth they consume.
  5. Daily (or some other time period) reset of quotas
  6. When the quota is consumed, internet access is blocked till the quota has been reset.
  7. Support for DNS and HTTP caching

Existing options

I surveyed a number of the router firmware/linux distro options, including DD-WRT, OpenWRT, Gargoyle, Tomato, pfSense, IPCop, Untangle, etc. None could do the job out-of-box, or suggested a relatively straightforward pathway to implementation.

Most offered some kind of QoS bandwidth throttling and/or a limited quota system, but not all the features I needed in one.

The only out-of-box software that seemed to be a perfect fit was the popular Windows hotspot program FirstSpot. It is however prohibitively expensive.

After a lot of research, I finally decided to go with an open-source PHP/MySQL-based hotspot solution called EasyHotSpot.

In my next post, I'll talk about EasyHotSpot – what it is, what it does for you, and what it doesn't.

Customized Ubuntu Captive Portal solution with per-user bandwidth quotas

Posted by Kelvin on 03 Apr 2011 | Tagged as: Ubuntu

This entry is part 1 of 5 in the Ubuntu Hotspot with Daily Per-User Quotas series

There are a number of free hotspot/captive portal solutions available on Linux, but believe it or not, not a single one of them offers daily per-user upload/download quotas.

I ended up going with a customized EasyHotSpot solution which is based off the defunct Chillispot, FreeRadius and MySQL.

It's way more involved than I initially thought. At the beginning, I was like, how difficult can it be? Well, turns out, pretty difficult.

This is a series documenting the epic quest for the ultimate free Captive Portal with daily per-user quotas. 🙂

First, before going too far, here's a glossary of jargon you're probably going to have to swallow when entering this world:

  • Captive Portal solution – software that redirects users to a login page before granting them access to the internet.
  • HotSpot/Access Point – generic term for a wireless solution that offers internet access to a number of users
  • Radius, FreeRadius, etc. – a popular authentication/authorization/accounting framework and software used by a number of captive portal providers
  • QoS – Quality Of Service. A mechanism for prioritizing network traffic.
  • throttling – Often used with the term QoS. It means limiting bandwidth speeds, measured in kbps (kilobits per second) or KBps (kilobytes per second). This is NOT the same as bandwidth quotas.
  • per-user bandwidth quotas – Limits on the amount of data that can be downloaded by a user. The solution has to provide some mechanism for disconnecting the user when the quota has been used up.

More to come in the next post on the features I was looking for in the Captive Portal solution.
