Average length of a URL
Posted by Kelvin on 06 Nov 2009 at 06:48 pm | Tagged as: crawling, Lucene / Solr / Elastic Search / Nutch, programming
Aug 16 update: I ran a more comprehensive analysis on a more complete dataset. The new figures are in Average length of a URL (Part 2): http://www.supermind.org/blog/740/average-length-of-a-url-part-2
I've always been curious what the average length of a URL is, mostly for estimating how much RAM it takes to hold a large set of URLs in memory.
Well, I took a dump of the DMOZ URLs, then sorted and uniq-ed the list.
I ended up with 4,074,300 unique URLs weighing in at 139,406,406 bytes, which works out to about 34 bytes (roughly 34 characters) per URL.
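For anyone who wants to repeat the measurement on their own list, here is a minimal sketch of the same idea in Python: deduplicate, total the bytes, divide by the count. The function name `average_url_length` is made up for illustration, and measuring each URL as UTF-8 bytes without the trailing newline is an assumption, so the result may differ slightly from a raw file-size count.

```python
def average_url_length(urls):
    """Return (unique_count, total_bytes, mean_bytes) for an iterable of URL strings.

    Assumption: each URL is measured as UTF-8 bytes with surrounding
    whitespace (including the trailing newline) stripped.
    """
    # Deduplicate, skipping blank lines (the equivalent of sort | uniq).
    unique = {u.strip() for u in urls if u.strip()}
    total = sum(len(u.encode("utf-8")) for u in unique)
    count = len(unique)
    mean = total / count if count else 0.0
    return count, total, mean


# Sanity check on the figures quoted in the post.
print(round(139406406 / 4074300, 2))  # 34.22
```

Passing an open file works too, since iterating a file yields one line per URL. Multiplying the mean by the number of URLs you expect to hold gives a lower bound on the RAM needed, before any per-string overhead from the language runtime.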
