Supermind Search Consulting Blog 
Solr - Elasticsearch - Big Data

Posts about life

RT: Larry is furious about this Mark Hurd thing

Posted by Kelvin on 27 Aug 2010 | Tagged as: life

This is absolutely priceless:

Honestly, he won’t let it go. He’s calling me over and over saying, Wait until you hear the latest, you won’t believe what they’re saying now! As if I care. Jesus. I put my iPhone 4 down on the desk and let him rant for a few minutes while I do some work, then I pick up and pretend I’ve been listening. Larry’s position is that Hurd didn’t do anything wrong. He’s like, Look, it’s not like he was drugging teenage girls and raping them while they were passed out! I’m like, Wait, is that the new hurdle CEOs have to get over? As long as you’re not feeding them roofies and raping them, it’s okay?

http://www.fakesteve.net/2010/08/larry-is-furious-about-this-mark-hurd-thing.html

DHTML toolkits

Posted by Kelvin on 18 Jan 2005 | Tagged as: life, programming

Link log of DHTML toolkits. Most promising seems to be nwidgets/netwindows. Articles about it can be found here and here.

http://koranteng.blogspot.com/2004/07/on-rich-web-applications-alphablox-and.html is an interesting read about building advanced web interfaces, http://scottandrew.com/weblog/dhtmllibs has an annotated list of DHTML libs, and http://www.dithered.com/javascript/index.html has some interesting libs.

Robin Good on interaction design

Posted by Kelvin on 12 Jan 2005 | Tagged as: life

http://www.masternewmedia.org/ is a wonderful resource for interaction design and communicating ideas. Some gems include:

How To Select Perfectly Matching Color Combinations
http://www.masternewmedia.org/2003/04/30/how_to_select_perfectly_matching_color_combinations.htm

GUI Bible: Application Interface Design Fundamentals For Software Developers
http://www.masternewmedia.org/news/2004/12/04/gui_bible_application_interface_design.htm

Personal Knowledge Mapping And The Concept Of Data Emergence
http://www.masternewmedia.org/2003/11/28/personal_knowledge_mapping_and_the.htm

Patterns in unstructured data

Posted by Kelvin on 07 Jan 2005 | Tagged as: life

http://javelina.cet.middlebury.edu/lsa/out/cover_page.htm is a very good presentation about Latent Semantic Indexing.

Requirements of an Enterprise DMS

Posted by Kelvin on 05 Jan 2005 | Tagged as: life

Network Computing has an article on enterprise document management systems. It's quite interesting to see what is expected of an enterprise DMS.

http://www.nwc.com/showitem.jhtml?docid=1505f1 is also interesting.

Practical aspects of KM for document management

Posted by Kelvin on 05 Jan 2005 | Tagged as: life

Lotsa FUD introduced by the Semantic Web initiative, KM hype etc. But which KM ideas are actually practical for document management?

Three come to mind: hierarchical classification of docs, a thesaurus where terms are defined, and relationships between documents (and people).

Hierarchical classification: basically the Windows Explorer model where documents are placed into folders. A variant of this is faceted classification. A faceted classification system is one in which each document is classified according to several separate hierarchical classification systems, called facets.
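
To make the faceted idea concrete, here is a minimal sketch in Python. The facet names ("department", "doc_type"), the sample documents and the `find` helper are all made up for illustration; the only point is that each document carries one path per facet, and each facet can be browsed on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    # facet name -> path inside that facet's own hierarchy, root first
    # (facet names here are hypothetical examples)
    facets: dict = field(default_factory=dict)


def find(docs, facet, prefix):
    """Return names of docs whose path in `facet` starts with `prefix`."""
    prefix = tuple(prefix)
    return [
        d.name
        for d in docs
        if tuple(d.facets.get(facet, ()))[: len(prefix)] == prefix
    ]


docs = [
    Document("q3-invoice.pdf", {"department": ("finance", "billing"),
                                "doc_type": ("record", "invoice")}),
    Document("hiring-plan.doc", {"department": ("hr",),
                                 "doc_type": ("plan",)}),
    Document("q3-budget.xls", {"department": ("finance", "planning"),
                               "doc_type": ("plan", "budget")}),
]

# The same documents, browsed through two independent hierarchies.
finance_docs = find(docs, "department", ["finance"])
plan_docs = find(docs, "doc_type", ["plan"])
```

A plain folder tree would force each file into a single place; with facets, the budget spreadsheet turns up under both finance and plans.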

Thesaurus: describes synonyms, preferred/obsolete terms, and broader/narrower term relationships. A thesaurus defines a convention for describing document metadata. In my opinion, it's too ambitious (hence impractical) to apply the thesaurus to document contents as well.
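
A rough sketch of what such a thesaurus could look like when used to normalize metadata terms. The `Thesaurus` class, its method names and the sample terms are assumptions for illustration, not an existing library.

```python
class Thesaurus:
    """Toy thesaurus: preferred terms plus broader/narrower links."""

    def __init__(self):
        self.preferred = {}  # synonym or obsolete term -> preferred term
        self.broader = {}    # term -> its broader term

    def add_synonym(self, term, preferred):
        self.preferred[term.lower()] = preferred.lower()

    def add_broader(self, narrower, broader):
        self.broader[narrower.lower()] = broader.lower()

    def normalize(self, term):
        """Map a term to its preferred form (or itself if unknown)."""
        t = term.lower()
        return self.preferred.get(t, t)

    def ancestors(self, term):
        """Broader terms of `term`, nearest first."""
        result = []
        t = self.normalize(term)
        # The membership check guards against accidental cycles.
        while t in self.broader and self.broader[t] not in result:
            t = self.broader[t]
            result.append(t)
        return result


th = Thesaurus()
th.add_synonym("bill", "invoice")
th.add_broader("invoice", "financial record")
th.add_broader("financial record", "record")

normalized = th.normalize("Bill")
broader_terms = th.ancestors("bill")
```

Metadata tagged "Bill" would be stored as "invoice", and a search for "record" could be widened to everything below it.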

Relationships: there are many names for this, like document collections, related documents, etc. A relationship consists of players, the roles played by the players, and the relationships between the roles/players. This can be a binary or an n-ary association.
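
One possible way to model players and roles, sketched in Python. The `Association` class, the role names and the sample data are hypothetical; two (role, player) pairs give a binary association, more give an n-ary one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    kind: str
    # (role, player) pairs; players can be documents or people
    members: tuple

    def players(self, role):
        """Players that fill the given role in this association."""
        return [p for r, p in self.members if r == role]


def related(associations, player):
    """All other players that share an association with `player`."""
    found = set()
    for a in associations:
        names = [p for _, p in a.members]
        if player in names:
            found.update(n for n in names if n != player)
    return found


# Binary: one document supersedes another.
supersedes = Association("supersedes", (("newer", "spec-v2.doc"),
                                        ("older", "spec-v1.doc")))
# N-ary: a review ties two people to a document.
review = Association("review", (("author", "alice"),
                                ("reviewer", "bob"),
                                ("subject", "spec-v2.doc")))

neighbours = related([supersedes, review], "spec-v2.doc")
```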

Did I miss anything out?

It's all about communication

Posted by Kelvin on 05 Jan 2005 | Tagged as: life

I cannot seriously convince myself that people are interested in managing anything other than their personal documents. When people say document management, what they're really interested in is communication.

People communicate to create, people create to communicate. Think about the top 3 pieces of software you use every day. My bet is that 2 of them are really for communication. Email. Weblogs. Wikis. All about communication.

With that in mind, a document management application is really one piece of the puzzle in improving communication between people: between employees, employers and employees, companies and clients, etc.

That is the main reason why I've decided to integrate a wiki into a document management system. By making collaborative editing painless, a wiki makes persistent and publishable communication easy. A power wiki like http://www.twiki.org arguably already provides basic content/document management functionality, but there's a lot more to be accounted for in this space that cannot be addressed by a simple wiki.

Another implication of the emphasis on communication is that it's really about bringing people closer together. It's not only about people, it's about the relationships between them. But that seems highly theoretical to me, and I'm still finding a way of practically expressing that in software.

Project blog-log

Posted by Kelvin on 05 Jan 2005 | Tagged as: life

Concept
Beyond document management. Practical knowledge management. Integration of a wiki with a document management system.

Initial parts of the application

  • access control
  • document pipeline
  • search
  • categorization
  • relationships
  • archival
  • blob storage
  • namespace
  • routing/workflow
  • version
  • lock
  • publish/export
  • import
  • convert
  • statistics
  • audit
  • preferences

Notes

  1. SVN will be used as the backend for blob storage and versioning of binary data. The primary motivation for this is the availability of the TortoiseSVN project, which provides Windows Explorer integration. Now this is a killer feature, allowing document access and manipulation from Windows Explorer. Of course, there will be some serious integration questions to be answered, but there's the JavaSVN API available that should simplify things.
  2. Network drives will be mountable. This will partially address the problem of treating the DM system as another shared drive simply because there's so much legacy nonsense.
  3. A configurable document processing pipeline will be introduced to handle the process of adding/updating documents. Processors in the pipeline can include file conversion, text extraction, classification and notification.
  4. Categorization and relationships are important. Document collections are just a very simplistic form of document relationships. I'm thinking of using topic maps or an RDF-based model for the ontology and the relationships between documents. Lots more thought has to be put into this. This is basically the KM portion of the app.
  5. Daisy's concept of document=multiple binary parts + metadata will be used. Operations such as transformation or conversion ops can be performed on parts, depending on type. Multiple file formats of the same part will be supported. Thinking about how to map the relationship between parts, if present.
  6. Metadata is variable, but cannot be used to sort docs. Different document purposes will have different metadata templates. New document purposes can be created.
  7. Documents archived offline are retained as records in the database, but the binary parts have been moved to another location. You can include physical assets that cannot be added to the repository, such as a file box, microfiche or physical piece of evidence. These assets are represented in the repository using an electronic record and tracked using bar codes and scanners.
  8. LDAP should be supported for pluggable user management.
  9. Provide support for record management, the idea that a document has to be kept in archive for a certain period of time for accountability purposes, based on a retention policy.
  10. Support confidential comments to documents.
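
As a rough illustration of note 3, a configurable pipeline can be as simple as an ordered list of processors, each taking a document and returning it. This is only a sketch: the processor names, the dict-based document and the keyword "classifier" are placeholders, not the planned design.

```python
def extract_text(doc):
    """Text extraction step: decode the raw bytes (assumes UTF-8 text)."""
    doc["text"] = doc["content"].decode("utf-8", errors="replace")
    return doc


def classify(doc):
    """Classification step: a naive keyword rule standing in for a real one."""
    is_invoice = "invoice" in doc["text"].lower()
    doc["category"] = "invoice" if is_invoice else "uncategorized"
    return doc


def make_notifier(outbox):
    """Notification step: records a message instead of sending mail."""
    def notify(doc):
        outbox.append(f"added {doc['name']} as {doc['category']}")
        return doc
    return notify


def run_pipeline(doc, processors):
    """Run the document through each configured processor in order."""
    for process in processors:
        doc = process(doc)
    return doc


outbox = []
pipeline = [extract_text, classify, make_notifier(outbox)]
result = run_pipeline({"name": "a.txt", "content": b"Invoice #12"}, pipeline)
```

Since the pipeline is just a list, adding a file-conversion step or dropping notification is a configuration change rather than a code change.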

Xlibris etext in your hand

Posted by Kelvin on 03 Jan 2005 | Tagged as: life

Just wanna blog-log this link http://www.fxpal.com/?p=xlibris about a project related to my project on improving Screen Reading. It looks pretty cool, and requires use of a pen tablet (which is actually a good idea).

Anyway, interesting to take a look at it.

Building Daisy

Posted by Kelvin on 25 Nov 2004 | Tagged as: life

These are some changes I had to make in order to compile Daisy from source. It's really not as simple as running maven from the root directory.

  1. Run maven from /lib to install jars to maven's local repo.
  2. Downloaded http://mirrors.combose.com/apache/avalon/avalon-meta/jars/avalon-meta-plugin-1.4.0.jar to MAVEN_HOME/plugins
  3. Downloaded http://mirrors.combose.com/apache/avalon/merlin/jars/merlin-plugin-3.3.0.jar to MAVEN_HOME/plugins
  4. Installed the Maven Torque plugin using maven plugin:download -DartifactId=maven-torque-plugin -DgroupId=torque -Dversion=3.1.1
  5. Renamed services/htmlcleaner/project.xml to project.xmld to remove it from build because javac compilation was failing.
  6. Emptied my USER_HOME/.maven/cache several times – a good strategy whenever things start going wrong with Maven plugins.

You should also ensure that maven.repo.remote is not somehow overridden in a build.properties in your USER_HOME. A good indication that this is happening is if merlin-unit-xxx.jar breaks the build, because this jar is found at http://www.dpml.net, which is declared in /project.properties.
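
A quick way to check for such an override is to scan the file for the key. The small Python sketch below does that; the helper names are made up, and it only handles simple `key=value` lines, not every form the Java properties format allows.

```python
from pathlib import Path


def find_override(properties_text, key="maven.repo.remote"):
    """Return the value set for `key`, or None if it is not set."""
    for line in properties_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and properties-file comments.
        if not stripped or stripped.startswith(("#", "!")):
            continue
        name, sep, value = stripped.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def check_user_home(home=None):
    """Look for an override in build.properties under the user's home."""
    path = Path(home or Path.home()) / "build.properties"
    if not path.is_file():
        return None
    return find_override(path.read_text(errors="replace"))


sample = "# local settings\nmaven.repo.remote = http://example.org/repo\n"
override = find_override(sample)
```

If `check_user_home()` returns anything other than None, that value is worth comparing against the repositories declared in /project.properties.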
