Thoughts on Lucene, Solr, Nutch and vertical search 

programming

Archived Posts from this Category

MySQL Large ResultSet java.lang.OutOfMemoryError Workaround

Posted by Kelvin on 04 Mar 2010 | Tagged as: programming

Ever tried to fetch all rows from a large MySQL table?

You’re bound to hit up against a java.lang.OutOfMemoryError.

So you try searching for how to set the fetchSize, and of course, it doesn’t quite work.

The MySQL JDBC Driver implementation notes states:

By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate, and due to the design of the MySQL network protocol is easier to implement. If you are working with ResultSets that have a large number of rows or large values, and can not allocate heap space in your JVM for the memory required, you can tell the driver to stream the results back one row at a time.

To enable this functionality, you need to create a Statement instance in the following manner:

stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);

The combination of a forward-only, read-only result set, with a fetch size of Integer.MIN_VALUE serves as a signal to the driver to stream result sets row-by-row. After this any result sets created with the statement will be retrieved row-by-row.

WHAT?! That’s pretty silly. Well turns out there are 2 alternatives to the ludicrous way of stepping through row-by-row.

1. from 5.0.2 onwards mysql supports server-side cursor by setting the property • useCursorFetch=true and defaultFetchSize.
This means that all rows wil not be pushed to client . Only defaultFetchSize number of rows will be send to client at a time.

2. If you’re selecting the rows to update them, AND the criteria you’re selecting on IS what you’re updating, then there’s an uber-simple workaround:

a Add a limit to the SQL (e.g. limit 1000)
b. Instantiate an updateable statement
c. Execute the SQL, updating the rows as you iterate
d. Repeat until there are no more results

QED

Writing user documentation in style with Sphinx

Posted by Kelvin on 03 Mar 2010 | Tagged as: programming

Adventures in Java Doclets

Posted by Kelvin on 03 Mar 2010 | Tagged as: programming

Ubuntu 9.04 + Engenius EUB-362 EXT [SOLVED]

Posted by Kelvin on 10 Feb 2010 | Tagged as: programming

Friendlier Thunderbird date columns

Posted by Kelvin on 24 Jan 2010 | Tagged as: programming

dom4j.org - WTF?

Posted by Kelvin on 22 Jan 2010 | Tagged as: programming

CSS3 Selectors in Java

Posted by Kelvin on 21 Jan 2010 | Tagged as: programming

Dom4j + XPath + TagSoup - Namespaces = sweet!

Posted by Kelvin on 20 Jan 2010 | Tagged as: programming

Using expressions to assign PHP static variables

Posted by Kelvin on 14 Jan 2010 | Tagged as: programming

Handling single query multiple ResultSets in MySQL and JDBC

Posted by Kelvin on 14 Jan 2010 | Tagged as: programming

Next Page »