<?xml version="1.0" encoding="utf-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:series="http://unfoldingneurons.com/"
		>
<channel>
	<title>Comments on: Dom4j + XPath + TagSoup - Namespaces = sweet!</title>
	<atom:link href="http://www.supermind.org/blog/613/dom4j-xpath-tagsoup-namespaces-sweet/feed" rel="self" type="application/rss+xml" />
	<link>http://www.supermind.org/blog/613/dom4j-xpath-tagsoup-namespaces-sweet</link>
	<description>A blog on Lucene, Solr, Nutch, crawling and vertical search</description>
	<lastBuildDate>Thu, 26 Aug 2010 08:00:19 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=abc</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Kelvin</title>
		<link>http://www.supermind.org/blog/613/dom4j-xpath-tagsoup-namespaces-sweet/comment-page-1#comment-17104</link>
		<dc:creator>Kelvin</dc:creator>
		<pubDate>Wed, 07 Jul 2010 05:05:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.supermind.org/?p=613#comment-17104</guid>
		<description>@Ken - that&#039;s a great idea with SAXReader.setXMLFilter(). Probably much cleaner than the method I posted about.</description>
		<content:encoded><![CDATA[<p>@Ken - that&#8217;s a great idea with SAXReader.setXMLFilter(). Probably much cleaner than the method I posted about.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ken Krugler</title>
		<link>http://www.supermind.org/blog/613/dom4j-xpath-tagsoup-namespaces-sweet/comment-page-1#comment-17103</link>
		<dc:creator>Ken Krugler</dc:creator>
		<pubDate>Wed, 07 Jul 2010 00:58:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.supermind.org/?p=613#comment-17103</guid>
		<description>Hi Kelvin,

I&#039;ve solved this same issue two different ways in the past.

The easiest (brute-force) way is to call SAXReader.setXMLFilter() with a filter that strips off namespaces.

The other approach is to use a utility routine that rewrites the XPath expression with the required namespace prefix. I also wound up having to set the namespace context (XPath.setNamespaceContext(new SimpleNamespaceContext(map))) with a map from the prefix to the full namespace URI.

-- Ken</description>
		<content:encoded><![CDATA[<p>Hi Kelvin,</p>
<p>I&#8217;ve solved this same issue two different ways in the past.</p>
<p>The easiest (brute-force) way is to call SAXReader.setXMLFilter() with a filter that strips off namespaces.</p>
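<p>A minimal sketch of such a filter, using only the JDK's SAX classes so it runs without dom4j or TagSoup on the classpath. The class name NamespaceStrippingFilter and the helper seenElements are made up for illustration, and attributes are passed through unchanged here, which may or may not be enough for a given document.</p>

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: forwards every element with an empty namespace URI.
class NamespaceStrippingFilter extends XMLFilterImpl {

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        // Swallow prefix declarations so downstream handlers never see them.
    }

    @Override
    public void endPrefixMapping(String prefix) {
        // Matching no-op for the swallowed declaration.
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        // Drop the namespace URI and any prefix; keep only the local name.
        super.startElement("", localName, localName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        super.endElement("", localName, localName);
    }

    // Illustration helper: parses xml through the filter and records
    // "{namespaceURI}localName" for each element the handler receives.
    static List<String> seenElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();

        NamespaceStrippingFilter filter = new NamespaceStrippingFilter();
        filter.setParent(parser);

        List<String> seen = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                seen.add("{" + uri + "}" + localName);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<html xmlns='http://www.w3.org/1999/xhtml'>"
                   + "<body><p>hi</p></body></html>";
        // Prints [{}html, {}body, {}p]
        System.out.println(seenElements(xml));
    }
}
```

<p>With dom4j, the same filter instance would presumably be handed to SAXReader.setXMLFilter() before calling read(), after which unprefixed paths such as /html/body/p should match.</p>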
<p>The other approach is to use a utility routine that rewrites the XPath expression with the required namespace prefix. I also wound up having to set the namespace context (XPath.setNamespaceContext(new SimpleNamespaceContext(map))) with a map from the prefix to the full namespace URI.</p>
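<p>A rough sketch of that second approach, written against the JDK's javax.xml.xpath API rather than dom4j so that it is self-contained. The names PrefixedXPathDemo, addPrefix, MapNamespaceContext and evaluate are invented for this example, and the rewriting is deliberately naive: it only prefixes plain element-name steps and leaves predicates, axes, attributes and functions alone.</p>

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PrefixedXPathDemo {

    // Hypothetical helper: "/html/body/p" with prefix "h" becomes
    // "/h:html/h:body/h:p". Steps that are not bare names are left as-is.
    static String addPrefix(String path, String prefix) {
        String[] steps = path.split("/", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < steps.length; i++) {
            if (i > 0) {
                out.append('/');
            }
            if (steps[i].matches("[A-Za-z_][\\w.-]*")) {
                out.append(prefix).append(':');
            }
            out.append(steps[i]);
        }
        return out.toString();
    }

    // Map-backed namespace context, playing the role of the
    // SimpleNamespaceContext(map) mentioned in the comment.
    static final class MapNamespaceContext implements NamespaceContext {
        private final Map<String, String> prefixToUri;

        MapNamespaceContext(Map<String, String> prefixToUri) {
            this.prefixToUri = prefixToUri;
        }

        @Override
        public String getNamespaceURI(String prefix) {
            String uri = prefixToUri.get(prefix);
            return uri != null ? uri : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String uri) {
            Iterator<String> it = getPrefixes(uri);
            return it.hasNext() ? it.next() : null;
        }

        @Override
        public Iterator<String> getPrefixes(String uri) {
            List<String> found = new ArrayList<>();
            for (Map.Entry<String, String> e : prefixToUri.entrySet()) {
                if (e.getValue().equals(uri)) {
                    found.add(e.getKey());
                }
            }
            return found.iterator();
        }
    }

    // Parses xml namespace-aware and evaluates path as a string result.
    static String evaluate(String xml, String path,
                           Map<String, String> namespaces) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new MapNamespaceContext(namespaces));
        return xpath.evaluate(path, doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<html xmlns='http://www.w3.org/1999/xhtml'>"
                   + "<body><p>hi</p></body></html>";
        Map<String, String> ns =
                Collections.singletonMap("h", "http://www.w3.org/1999/xhtml");
        // The unprefixed path matches nothing, so this prints an empty line.
        System.out.println(evaluate(xml, "/html/body/p", ns));
        // The rewritten path /h:html/h:body/h:p prints "hi".
        System.out.println(evaluate(xml, addPrefix("/html/body/p", "h"), ns));
    }
}
```

<p>The prefix itself is arbitrary; what matters is that the map sends it to the same namespace URI the parser puts on the elements.</p>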
<p>-- Ken</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Extracting HTML content with XPath queries using dom4j and TagSoup &#171; Un Beagle y Yo</title>
		<link>http://www.supermind.org/blog/613/dom4j-xpath-tagsoup-namespaces-sweet/comment-page-1#comment-17021</link>
		<dc:creator>Extracting HTML content with XPath queries using dom4j and TagSoup &#171; Un Beagle y Yo</dc:creator>
		<pubDate>Mon, 17 May 2010 22:46:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.supermind.org/?p=613#comment-17021</guid>
		<description>[...] post to remove the html prefix from the [...]</description>
		<content:encoded><![CDATA[<div style="background-color:#E9F7F6;">
<p>[...] post to remove the html prefix from the [...]</p>
</div>
]]></content:encoded>
	</item>
	<item>
		<title>By: Parsing Real World HTML with XPath support - Java Forums</title>
		<link>http://www.supermind.org/blog/613/dom4j-xpath-tagsoup-namespaces-sweet/comment-page-1#comment-17014</link>
		<dc:creator>Parsing Real World HTML with XPath support - Java Forums</dc:creator>
		<pubDate>Wed, 12 May 2010 09:46:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.supermind.org/?p=613#comment-17014</guid>
		<description>[...] Using XPath on real-world HTML documents seems to work well except the following namespace problem: Dom4j + XPath + TagSoup &#8211; Namespaces = sweet! :: Kelvin Tan - Lucene Solr Nutch Consultant  It seems other parsers are available: Open Source HTML Parsers in Java  some of which support [...]</description>
		<content:encoded><![CDATA[<div style="background-color:#E9F7F6;">
<p>[...] Using XPath on real-world HTML documents seems to work well except the following namespace problem: Dom4j + XPath + TagSoup &#8211; Namespaces = sweet! :: Kelvin Tan - Lucene Solr Nutch Consultant  It seems other parsers are available: Open Source HTML Parsers in Java  some of which support [...]</p>
</div>
]]></content:encoded>
	</item>
	<item>
		<title>By: dis</title>
		<link>http://www.supermind.org/blog/613/dom4j-xpath-tagsoup-namespaces-sweet/comment-page-1#comment-17010</link>
		<dc:creator>dis</dc:creator>
		<pubDate>Sat, 08 May 2010 21:26:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.supermind.org/?p=613#comment-17010</guid>
		<description>Thanks for those parser parameters. They worked very well for me. Your post saved my life!</description>
		<content:encoded><![CDATA[<p>Thanks for those parser parameters. They worked very well for me. Your post saved my life!</p>
]]></content:encoded>
	</item>
</channel>
</rss>
