Misc changes from Sam.

Jacques Distler 2006-11-12 17:45:55 -06:00
commit 838269ed8f
84 changed files with 39985 additions and 276 deletions


@@ -1 +1,3 @@
 *.tmplc
+.DS_Store
+cache

INSTALL

@@ -1,167 +0,0 @@
Installing Planet
-----------------
You'll need at least Python 2.2 installed on your system; we recommend
Python 2.4, though, as there may be bugs in the earlier libraries.
Everything Pythonesque Planet needs to provide basic operation should be
included in the distribution. Additionally:
* Usage of XSLT requires either xsltproc or python-libxslt.
* The current interface to filters written in non-templating languages
(e.g., python) uses the subprocess module which was introduced in
Python 2.4.
* Usage of FOAF as a reading list requires librdf.
Instructions:
i.
First you'll need to extract the files into a folder somewhere.
I expect you've already done this, after all, you're reading this
file. You can place this wherever you like, ~/planet is a good
choice, but so's anywhere else you prefer.
ii.
This is very important: from within that directory, type the following
command:
python runtests.py
This should take anywhere from one to ten seconds to execute. No network
connection is required, and the script cleans up after itself. If the
script completes with an "OK", you are good to go. Otherwise stopping here
and inquiring on the mailing list is a good idea as it can save you lots of
frustration down the road.
iii.
Make a copy of one of the 'ini' files in the 'examples' subdirectory,
and put it wherever you like; I like to use the Planet's name (so
~/planet/debian), but it's really up to you.
iv.
Edit the config.ini file in this directory to taste; it's pretty
well documented so you shouldn't have any problems here. Pay
particular attention to the 'output_dir' option, which should be
readable by your web server. If the directory you specify in your
'cache_dir' exists, make sure that it is empty.
v.
Run it: python planet.py pathto/config.ini
You'll want to add this to cron; make sure you run it from the
right directory.
vi. (Optional)
Tell us about it! We'd love to link to you on planetplanet.org :-)
vii. (Optional)
Build your own themes, templates, or filters! And share!
Template files
--------------
The template files used are given as a whitespace separated list in the
'template_files' option in config.ini. The extension at the end of the
file name indicates what processor to use. Templates may be implemented
using htmltmpl, xslt, or any programming language.
The final extension is removed to form the name of the file placed in the
output directory.
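The extension-stripping rule can be sketched in a few lines of Python (an illustration only; the file names are hypothetical):

```python
import os

def output_name(template_file):
    # The final extension only selects the processor (htmltmpl, xslt, ...);
    # stripping it yields the name written to the output directory.
    return os.path.splitext(os.path.basename(template_file))[0]

print(output_name('examples/index.html.tmpl'))     # index.html
print(output_name('themes/common/atom.xml.xslt'))  # atom.xml
```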
HtmlTmpl files
--------------
Reading through the example templates is recommended; they're designed to
pretty much drop straight into your site with little modification
anyway.
Inside these template files, <TMPL_VAR xxx> is replaced with the content
of the 'xxx' variable. The variables available are:
name .... } the value of the equivalent options
link .... } from the [Planet] section of your
owner_name . } Planet's config.ini file
owner_email }
url .... link with the output filename appended
generator .. version of planet being used
date .... { your date format
date_iso ... current date and time in { ISO date format
date_822 ... { RFC822 date format
There are also two loops, 'Items' and 'Channels'. All of the lines of
the template and variable substitutions are available for each item or
channel. Loops are created using <TMPL_LOOP LoopName>...</TMPL_LOOP>
and may be used as many times as you wish.
The 'Channels' loop iterates all of the channels (feeds) defined in the
configuration file, within it the following variables are available:
name .... value of the 'name' option in config.ini, or title
title .... title retrieved from the channel's feed
tagline .... description retrieved from the channel's feed
link .... link for the human-readable content (from the feed)
url .... url of the channel's feed itself
Additionally the value of any other option specified in config.ini
for the feed, or in the [DEFAULT] section, is available as a
variable of the same name.
Depending on the feed, a huge variety of other
variables may be available; the best way to find out what you
have is to use the 'planet-cache' tool to examine your cache files.
The 'Items' loop iterates all of the blog entries from all of the channels,
you do not place it inside a 'Channels' loop. Within it, the following
variables are available:
id .... unique id for this entry (sometimes just the link)
link .... link to a human-readable version at the origin site
title .... title of the entry
summary .... a short "first page" summary
content .... the full content of the entry
date .... { your date format
date_iso ... date and time of the entry in { ISO date format
date_822 ... { RFC822 date format
If the entry is the first entry to take place on its date,
the 'new_date' variable is set to that date.
This allows you to break up the page by day.
If the entry is from a different channel than the previous entry,
or is the first entry from this channel on this day,
the 'new_channel' variable is set to the same value as the
'channel_url' variable. This allows you to collate multiple
entries from the same person under the same banner.
Additionally the value of any variable that would be defined
for the channel is available, with 'channel_' prepended to the
name (e.g. 'channel_name' and 'channel_link').
Depending on the feed, a huge variety of other
variables may be available; the best way to find out what you
have is to use the 'planet-cache' tool to examine your cache files.
There are also a couple of other special things you can do in a template.
- If you want HTML escaping applied to the value of a variable, use the
<TMPL_VAR xxx ESCAPE="HTML"> form.
- If you want URI escaping applied to the value of a variable, use the
<TMPL_VAR xxx ESCAPE="URI"> form.
- To only include a section of the template if the variable has a
non-empty value, you can use <TMPL_IF xxx>....</TMPL_IF>. e.g.
<TMPL_IF new_date>
<h1><TMPL_VAR new_date></h1>
</TMPL_IF>
You may place a <TMPL_ELSE> within this block to specify an
alternative, or may use <TMPL_UNLESS xxx>...</TMPL_UNLESS> to
perform the opposite.
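The two ESCAPE forms correspond to ordinary HTML and URI escaping. A rough Python illustration of the difference (this is the idea, not the htmltmpl implementation):

```python
import html
import urllib.parse

value = 'Tom & Jerry <live>'

# ESCAPE="HTML": make the value safe to drop into markup
print(html.escape(value))         # Tom &amp; Jerry &lt;live&gt;

# ESCAPE="URI": make the value safe to embed in a link
print(urllib.parse.quote(value))  # Tom%20%26%20Jerry%20%3Clive%3E
```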

README

@@ -9,11 +9,11 @@ also actively being maintained.
 It uses Mark Pilgrim's Universal Feed Parser to read from CDF, RDF, RSS and
 Atom feeds; Leonard Richardson's Beautiful Soup to correct markup issues;
-and Tomas Styblo's templating engine to output static files in any
-format you can dream up.
+and either Tomas Styblo's templating engine or Daniel Veillard's implementation
+of XSLT to output static files in any format you can dream up.
-To get started, check out the INSTALL file in this directory. If you have any
-questions or comments, please don't hesitate to use the planet mailing list:
+To get started, check out the documentation in the docs directory. If you have
+any questions or comments, please don't hesitate to use the planet mailing list:
 http://lists.planetplanet.org/mailman/listinfo/devel

THANKS

@@ -4,6 +4,13 @@ Elias Torres - FOAF OnlineAccounts
 Jacques Distler - Template patches
 Michael Koziarski - HTTP Auth fix
 Brian Ewins - Win32 / Portalocker
+Joe Gregorio - Invoke same version of Python for filters
+Harry Fuecks - Pipe characters in file names, filter bug
+Eric van der Vlist - Filters to add language, category information
+Chris Dolan - mkdir cache; default template_dirs; fix xsltproc
+David Sifry - rss 2.0 xslt template based on http://atom.geekhood.net/
+Morten Fredericksen - Support WordPress LinkManager OPML
+Harry Fuecks - default item date to feed date
 This codebase represents a radical refactoring of Planet 2.0, which lists
 the following contributors:

TODO

@ -1,11 +1,6 @@
TODO TODO
==== ====
* Enable per-feed adjustments
The goal is to better cope with feeds that don't have dates or ids or
consitently encode or escape things incorrectly.
* Expire feed history * Expire feed history
The feed cache doesn't currently expire old entries, so could get The feed cache doesn't currently expire old entries, so could get

docs/config.html Normal file

@@ -0,0 +1,140 @@
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"
"http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript" src="docs.js"></script>
<link rel="stylesheet" type="text/css" href="docs.css"/>
<title>Venus Configuration</title>
</head>
<body>
<h2>Configuration</h2>
<p>Configuration files are in <a href="http://docs.python.org/lib/module-ConfigParser.html">ConfigParser</a> format, which basically means the same
format as INI files, i.e., they consist of a series of
<code>[sections]</code>, in square brackets, with each section containing a
list of <code>name:value</code> pairs (or <code>name=value</code> pairs, if
you prefer).</p>
<p>You are welcome to place your entire configuration into one file.
Alternately, you may factor out the templating into a "theme", and
the list of subscriptions into one or more "reading lists".</p>
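For illustration, here is a minimal configuration in this format, parsed with Python's configparser (the Python 3 spelling of the ConfigParser module named above; the values are made up):

```python
import configparser

SAMPLE = """
[planet]
name: My Planet
link: http://example.org/
output_dir: /var/www/planet

[http://example.org/feed.atom]
name = Example Feed
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

# name:value and name=value are interchangeable
print(config.get('planet', 'name'))   # My Planet

# every section other than [planet] is typically a subscription
subscriptions = [s for s in config.sections() if s != 'planet']
print(subscriptions)                  # ['http://example.org/feed.atom']
```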
<h3 id="planet"><code>[planet]</code></h3>
<p>This is the only required section, which is a bit odd as none of the
parameters listed below are required. Even so, you really do want to
provide many of these, especially ones that identify your planet and
either (or both) of <code>template_files</code> and <code>theme</code>.</p>
<p>Below is a complete list of predefined planet configuration parameters,
including <del>ones not (yet) implemented by Venus</del> and <ins>ones that
are either new or implemented differently by Venus</ins>.</p>
<blockquote>
<dl class="compact code">
<dt>name</dt>
<dd>Your planet's name</dd>
<dt>link</dt>
<dd>Link to the main page</dd>
<dt>owner_name</dt>
<dd>Your name</dd>
<dt>owner_email</dt>
<dd>Your e-mail address</dd>
</dl>
<dl class="compact code">
<dt>cache_directory</dt>
<dd>Where cached feeds are stored</dd>
<dt>output_dir</dt>
<dd>Directory to place output files</dd>
</dl>
<dl class="compact code">
<dt><ins>output_theme</ins></dt>
<dd>Directory containing a <code>config.ini</code> file which is merged
with this one. This is typically used to specify templating and bill of
materials information.</dd>
<dt>template_files</dt>
<dd>Space-separated list of output template files</dd>
<dt><ins>template_directories</ins></dt>
<dd>Space-separated list of directories in which <code>template_files</code>
can be found</dd>
<dt><ins>bill_of_materials</ins></dt>
<dd>Space-separated list of files to be copied as is directly from the <code>template_directories</code> to the <code>output_dir</code></dd>
<dt><ins>filters</ins></dt>
<dd>Space-separated list of filters to apply to each entry</dd>
</dl>
<dl class="compact code">
<dt>items_per_page</dt>
<dd>How many items to put on each page. <ins>Whereas Planet 2.0 allows this to
be overridden on a per template basis, Venus currently takes the maximum value
for this across all templates.</ins></dd>
<dt><del>days_per_page</del></dt>
<dd>How many complete days of posts to put on each page. This is the absolute, hard limit (over the item limit)</dd>
<dt>date_format</dt>
<dd><a href="http://docs.python.org/lib/module-time.html#l2h-2816">strftime</a> format for the default 'date' template variable</dd>
<dt>new_date_format</dt>
<dd><a href="http://docs.python.org/lib/module-time.html#l2h-2816">strftime</a> format for the 'new_date' template variable <ins>only applies to htmltmpl templates</ins></dd>
<dt><del>encoding</del></dt>
<dd>Output encoding for the file, Python 2.3+ users can use the special "xml" value to output ASCII with XML character references</dd>
<dt><del>locale</del></dt>
<dd>Locale to use for (e.g.) strings in dates, default is taken from your system</dd>
<dt>activity_threshold</dt>
<dd>If non-zero, all feeds which have not been updated in the indicated
number of days will be marked as inactive</dd>
</dl>
<dl class="compact code">
<dt>log_level</dt>
<dd>One of <code>DEBUG</code>, <code>INFO</code>, <code>WARNING</code>, <code>ERROR</code> or <code>CRITICAL</code></dd>
<dt><ins>log_format</ins></dt>
<dd><a href="http://docs.python.org/lib/node422.html">format string</a> to
use for logging output. Note: this configuration value is processed
<a href="http://docs.python.org/lib/ConfigParser-objects.html">raw</a></dd>
<dt>feed_timeout</dt>
<dd>Number of seconds to wait for any given feed</dd>
<dt><del>new_feed_items</del></dt>
<dd>Number of items to take from new feeds</dd>
</dl>
</blockquote>
<h3 id="default"><code>[DEFAULT]</code></h3>
<p>Values placed in this section are used as default values for all sections.
While it is true that few values make sense in all sections, in most cases
unused parameters cause few problems.</p>
<h3 id="subscription"><code>[</code><em>subscription</em><code>]</code></h3>
<p>All sections other than <code>planet</code> and <code>DEFAULT</code> that
are not named in <code>[planet]</code>'s <code>filters</code> or
<code>template_files</code> parameters
are treated as subscriptions and typically take the form of a
<acronym title="Uniform Resource Identifier">URI</acronym>.</p>
<p>Parameters placed in this section are passed to templates. While
you are free to include as few or as many parameters as you like, most of
the predefined themes presume that at least <code>name</code> is defined.</p>
<p>The <code>content_type</code> parameter can be defined to indicate that
this subscription is a <em>reading list</em>, i.e., is an external list
of subscriptions. At the moment, two formats of reading lists are supported:
<code>opml</code> and <code>foaf</code>. In the future, support for formats
like <code>xoxo</code> could be added.</p>
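For illustration, an opml reading list is just an outline of feed URLs; the subscriptions it contributes can be extracted along these lines (sample data, not the Venus code itself):

```python
import xml.etree.ElementTree as ET

OPML = """<opml version="1.1">
  <body>
    <outline type="rss" xmlUrl="http://example.org/a.xml" text="Feed A"/>
    <outline type="rss" xmlUrl="http://example.org/b.xml" text="Feed B"/>
  </body>
</opml>"""

root = ET.fromstring(OPML)
# each outline element carrying an xmlUrl attribute is one subscription
subscriptions = [o.get('xmlUrl') for o in root.iter('outline') if o.get('xmlUrl')]
print(subscriptions)
```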
<p><a href="normalization.html#overrides">Normalization overrides</a> can
also be defined here.</p>
<h3 id="template"><code>[</code><em>template</em><code>]</code></h3>
<p>Sections which are listed in <code>[planet] template_files</code> are
processed as <a href="templates.html">templates</a>. With Planet 2.0,
it is possible to override parameters like <code>items_per_page</code>
on a per template basis, but at the current time Planet Venus doesn't
implement this.</p>
<h3 id="filter"><code>[</code><em>filter</em><code>]</code></h3>
<p>Sections which are listed in <code>[planet] filters</code> are
processed as <a href="filters.html">filters</a>.</p>
<p>Parameters which are listed in this section are passed to the filter
in a language specific manner. Given the way defaults work, filters
should be prepared to ignore parameters that they didn't expect.</p>
</body>
</html>

docs/docs.css Normal file

@@ -0,0 +1,104 @@
body {
background-color: #fff;
color: #333;
font-family: 'Lucida Grande', Verdana, Geneva, Lucida, Helvetica, sans-serif;
font-size: small;
margin: 40px;
padding: 0;
}
a:link, a:visited {
background-color: transparent;
color: #333;
text-decoration: none !important;
border-bottom: 1px dotted #333 !important;
}
a:hover {
background-color: transparent;
color: #934;
text-decoration: none !important;
border-bottom: 1px dotted #993344 !important;
}
pre, code {
background-color: #FFF;
color: #00F;
font-size: large
}
h1 {
margin: 8px 0 10px 20px;
padding: 0;
font-variant: small-caps;
letter-spacing: 0.1em;
font-family: "Book Antiqua", Georgia, Palatino, Times, "Times New Roman", serif;
}
h2 {
clear: both;
}
ul, ul.outer > li {
margin: 14px 0 10px 0;
}
.z {
float:left;
background: url(img/shadowAlpha.png) no-repeat bottom right !important;
margin: -15px 0 20px -15px !important;
}
.z .logo {
color: magenta;
}
.z p {
margin: 14px 0 10px 15px !important;
}
.z .sectionInner {
width: 730px;
background: none !important;
padding: 0 !important;
}
.z .sectionInner .sectionInner2 {
border: 1px solid #a9a9a9;
padding: 4px;
margin: -6px 6px 6px -6px !important;
}
ins {
background-color: #FFF;
color: #F0F;
text-decoration: none;
}
dl.compact {
margin-bottom: 1em;
margin-top: 1em;
}
dl.code > dt {
font-family: monospace;
font-size: large;
}
dl.compact > dt {
float: left;
margin-bottom: 0;
padding-right: 8px;
margin-top: 0;
list-style-type: none;
}
dl.compact > dd {
margin-bottom: 0;
margin-top: 0;
margin-left: 10em;
}
th, td {
font-size: small;
}

docs/docs.js Normal file

@@ -0,0 +1,54 @@
window.onload=function() {
var vindex = document.URL.lastIndexOf('venus/');
var len = 'venus/'.length;
if (vindex<0) { vindex = document.URL.lastIndexOf('planet/'); len = 'planet/'.length; }
var base = document.URL.substring(0,vindex+len);
var body = document.getElementsByTagName('body')[0];
var div = document.createElement('div');
div.setAttribute('class','z');
var h1 = document.createElement('h1');
var span = document.createElement('span');
span.appendChild(document.createTextNode('\u2640'));
span.setAttribute('class','logo');
h1.appendChild(span);
h1.appendChild(document.createTextNode(' Planet Venus'));
var inner2=document.createElement('div');
inner2.setAttribute('class','sectionInner2');
inner2.appendChild(h1);
var p = document.createElement('p');
p.appendChild(document.createTextNode("Planet Venus is an awesome \u2018river of news\u2019 feed reader. It downloads news feeds published by web sites and aggregates their content together into a single combined feed, latest news first."));
inner2.appendChild(p);
p = document.createElement('p');
var a = document.createElement('a');
a.setAttribute('href',base+'index.html');
a.appendChild(document.createTextNode('Download'));
p.appendChild(a);
p.appendChild(document.createTextNode(" \u00b7 "));
a = document.createElement('a');
a.setAttribute('href',base+'docs/index.html');
a.appendChild(document.createTextNode('Documentation'));
p.appendChild(a);
p.appendChild(document.createTextNode(" \u00b7 "));
a = document.createElement('a');
a.setAttribute('href',base+'tests/');
a.appendChild(document.createTextNode('Unit tests'));
p.appendChild(a);
p.appendChild(document.createTextNode(" \u00b7 "));
a = document.createElement('a');
a.setAttribute('href','http://lists.planetplanet.org/mailman/listinfo/devel');
a.appendChild(document.createTextNode('Mailing list'));
p.appendChild(a);
inner2.appendChild(p);
var inner1=document.createElement('div');
inner1.setAttribute('class','sectionInner');
inner1.setAttribute('id','inner1');
inner1.appendChild(inner2);
div.appendChild(inner1);
body.insertBefore(div, body.firstChild);
}

docs/filters.html Normal file

@@ -0,0 +1,71 @@
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"
"http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript" src="docs.js"></script>
<link rel="stylesheet" type="text/css" href="docs.css"/>
<title>Venus Filters</title>
</head>
<body>
<h2>Filters</h2>
<p>Filters are simple Unix pipes. Input comes in <code>stdin</code>,
parameters come from the config file, and output goes to <code>stdout</code>.
Anything written to <code>stderr</code> is logged as an ERROR message. If no
<code>stdout</code> is produced, the entry is not written to the cache or
processed further.</p>
<p>Input to a filter is an aggressively
<a href="normalization.html">normalized</a> entry. For
example, if a feed is RSS 1.0 with 10 items, the filter will be called ten
times, each with a single Atom 1.0 entry, with all textConstructs
expressed as XHTML, and everything encoded as UTF-8.</p>
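<p>A filter, then, is any program that reads one normalized Atom entry on
stdin and writes the (possibly modified) entry to stdout. A hypothetical
example, not one of the bundled filters, that upper-cases entry titles:</p>

```python
import sys
import xml.etree.ElementTree as ET

ATOM = 'http://www.w3.org/2005/Atom'

def filter_entry(xml_text):
    # one UTF-8 encoded Atom entry arrives per invocation
    entry = ET.fromstring(xml_text)
    title = entry.find('{%s}title' % ATOM)
    if title is not None and title.text:
        title.text = title.text.upper()
    # whatever reaches stdout replaces the entry; write nothing to drop it
    return ET.tostring(entry, encoding='unicode')

if __name__ == '__main__':
    sys.stdout.write(filter_entry(sys.stdin.read()))
```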
<p>You will find a small set of example filters in the <a
href="../filters">filters</a> directory. The <a
href="../filters/coral_cdn_filter.py">coral cdn filter</a> will change links
to images in the entry itself. The filters in the <a
href="../filters/stripAd/">stripAd</a> subdirectory will strip specific
types of advertisements that you may find in feeds.</p>
<p>The <a href="../filters/excerpt.py">excerpt</a> filter adds metadata (in
the form of a <code>planet:excerpt</code> element) to the feed itself. You
can see examples of how parameters are passed to this program in either
<a href="../tests/data/filter/excerpt-images.ini">excerpt-images</a> or
<a href="../examples/opml-top100.ini">opml-top100.ini</a>.
Alternately parameters may be passed
<abbr title="Uniform Resource Identifier">URI</abbr> style, for example:
<a href="../tests/data/filter/excerpt-images2.ini">excerpt-images2</a>.
</p>
<p>The <a href="../filters/xpath_sifter.py">xpath sifter</a> is a variation of
the above, including or excluding feeds based on the presence (or absence) of
data specified by <a href="http://www.w3.org/TR/xpath20/">xpath</a>
expressions. Again, parameters can be passed as
<a href="../tests/data/filter/xpath-sifter.ini">config options</a> or
<a href="../tests/data/filter/xpath-sifter2.ini">URI style</a>.
</p>
<h3>Notes</h3>
<ul>
<li>The file extension of the filter is significant. <code>.py</code> invokes
python. <code>.xslt</code> invokes XSLT. <code>.sed</code> and
<code>.tmpl</code> (a.k.a. htmltmpl) are also options. Other languages, like
perl or ruby or class/jar (java), aren't supported at the moment, but these
would be easy to add.</li>
<li>Any filters listed in the <code>[planet]</code> section of your config.ini
will be invoked on all feeds. Filters listed in individual
<code>[feed]</code> sections will only be invoked on those feeds.</li>
<li>Filters are simply invoked in the order they are listed in the
configuration file (think unix pipes). Planet-wide filters are executed before
feed-specific filters.</li>
<li>Templates written using htmltmpl currently only have access to a fixed set
of fields, whereas XSLT templates have access to everything.</li>
</ul>
</body>
</html>

docs/img/shadowAlpha.png Normal file (binary image, 3.3 KiB, not shown)

docs/index.html Normal file

@@ -0,0 +1,51 @@
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"
"http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript" src="docs.js"></script>
<link rel="stylesheet" type="text/css" href="docs.css"/>
<title>Venus Documentation</title>
</head>
<body>
<h2>Table of Contents</h2>
<ul class="outer">
<li><a href="installation.html">Getting started</a></li>
<li>Basic Features
<ul>
<li><a href="config.html">Configuration</a></li>
<li><a href="templates.html">Templates</a></li>
</ul>
</li>
<li>Advanced Features
<ul>
<li><a href="venus.svg">Architecture</a></li>
<li><a href="normalization.html">Normalization</a></li>
<li><a href="filters.html">Filters</a></li>
</ul>
</li>
<li>Other
<ul>
<li><a href="migration.html">Migration from Planet 2.0</a></li>
</ul>
</li>
<li>Reference
<ul>
<li><a href="http://www.planetplanet.org/">Planet</a></li>
<li><a href="http://feedparser.org/docs/">Universal Feed Parser</a></li>
<li><a href="http://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a></li>
<li><a href="http://htmltmpl.sourceforge.net/">htmltmpl</a></li>
<li><a href="http://www.w3.org/TR/xslt">XSLT</a></li>
<li><a href="http://www.gnu.org/software/sed/manual/html_mono/sed.html">sed</a></li>
</ul>
</li>
<li>Credits and License
<ul>
<li><a href="../AUTHORS">Authors</a></li>
<li><a href="../THANKS">Contributors</a></li>
<li><a href="../LICENCE">License</a></li>
</ul>
</li>
</ul>
</body>
</html>

docs/installation.html Normal file

@@ -0,0 +1,112 @@
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"
"http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript" src="docs.js"></script>
<link rel="stylesheet" type="text/css" href="docs.css"/>
<title>Venus Installation</title>
</head>
<body>
<h2>Installation</h2>
<p>Venus has been tested on Linux, Mac OS X, and Windows.</p>
<p>You'll need at least Python 2.2 installed on your system; we recommend
Python 2.4, though, as there may be bugs in the earlier libraries.</p>
<p>Everything Pythonesque Planet needs to provide basic operation should be
included in the distribution. Some optional features may require
additional libraries, for example:</p>
<ul>
<li>Usage of XSLT requires either
<a href="http://xmlsoft.org/XSLT/xsltproc2.html">xsltproc</a>
or <a href="http://xmlsoft.org/XSLT/python.html">python-libxslt</a>.</li>
<li>The current interface to filters written in non-templating languages
(e.g., python) uses the
<a href="http://docs.python.org/lib/module-subprocess.html">subprocess</a>
module which was introduced in Python 2.4.</li>
<li>Usage of FOAF as a reading list requires
<a href="http://librdf.org/">librdf</a>.</li>
</ul>
<h3>General Instructions</h3>
<p>
These instructions apply to any platform. Check the instructions
below for more specific instructions for your platform.
</p>
<ol>
<li><p>If you are reading this online, you will need to
<a href="../index.html">download</a> and extract the files into a folder somewhere.
You can place this wherever you like, <code>~/planet</code>
and <code>~/venus</code> are good
choices, but so's anywhere else you prefer.</p></li>
<li><p>This is very important: from within that directory, type the following
command:</p>
<blockquote><code>python runtests.py</code></blockquote>
<p>This should take anywhere from one to ten seconds to execute. No network
connection is required, and the script cleans up after itself. If the
script completes with an "OK", you are good to go. Otherwise stopping here
and inquiring on the
<a href="http://lists.planetplanet.org/mailman/listinfo/devel">mailing list</a>
is a good idea as it can save you lots of frustration down the road.</p></li>
<li><p>Make a copy of one of the <code>ini</code> files in the
<a href="../examples">examples</a> subdirectory,
and put it wherever you like; I like to use the Planet's name (so
<code>~/planet/debian</code>), but it's really up to you.</p></li>
<li><p>Edit the <code>config.ini</code> file in this directory to taste;
it's pretty well documented so you shouldn't have any problems here. Pay
particular attention to the <code>output_dir</code> option, which should be
readable by your web server. If the directory you specify in your
<code>cache_dir</code> exists, make sure that it is empty.</p></li>
<li><p>Run it: <code>python planet.py pathto/config.ini</code></p>
<p>You'll want to add this to cron; make sure you run it from the
right directory.</p></li>
<li><p>(Optional)</p>
<p>Tell us about it! We'd love to link to you on planetplanet.org :-)</p></li>
<li><p>(Optional)</p>
<p>Build your own themes, templates, or filters! And share!</p></li>
</ol>
<h3>Mac OS X and Fink Instructions</h3>
<p>
The <a href="http://fink.sourceforge.net/">Fink Project</a> packages
various open source software for MacOS. This makes it a little easier
to get started with projects like Planet Venus.
</p>
<p>
Note: in the following, we recommend explicitly
using <code>python2.4</code>. As of this writing, Fink is starting to
support <code>python2.5</code> but the XML libraries, for example, are
not yet ported to the newer python so Venus will be less featureful.
</p>
<ol>
<li><p>Install the XCode development tools from your Mac OS X install
disks</p></li>
<li><p><a href="http://fink.sourceforge.net/download/">Download</a>
and install Fink</p></li>
<li><p>Tell fink to install the Planet Venus prerequisites:<br />
<code>fink install python24 celementtree-py24 bzr-py24 libxslt-py24
libxml2-py24</code></p></li>
<li><p><a href="../index.html">Download</a> and extract the Venus files into a
folder somewhere</p></li>
<li><p>Run the tests: <code>python2.4 runtests.py</code><br /> This
will warn you that the RDF library is missing, but that's
OK.</p></li>
<li><p>Continue with the general steps above, starting with Step 3. You
may want to explicitly specify <code>python2.4</code>.</p></li>
</ol>
<h3>Ubuntu Linux (Edgy Eft) instructions</h3>
<p>Before starting, issue the following command:</p>
<ul>
<li><code>sudo apt-get install bzr python2.4-librdf</code></li>
</ul>
</body>
</html>

docs/migration.html Normal file

@@ -0,0 +1,42 @@
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"
"http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript" src="docs.js"></script>
<link rel="stylesheet" type="text/css" href="docs.css"/>
<title>Venus Migration</title>
</head>
<body>
<h2>Migration from Planet 2.0</h2>
<p>The intent is that existing Planet 2.0 users should be able to reuse
their existing <code>config.ini</code> and <code>.tmpl</code> files,
but the reality is that users will need to be aware of the following:</p>
<ul>
<li>You will need to start over with a new cache directory as the format
of the cache has changed dramatically.</li>
<li>Existing <code>.tmpl</code> and <code>.ini</code> files should work,
though some <a href="config.html">configuration</a> options (e.g.,
<code>days_per_page</code>) have not yet been implemented</li>
<li>No testing has been done on Python 2.1, and it is presumed not to work.</li>
<li>To take advantage of all features, you should install the optional
XML and RDF libraries described on
the <a href="installation.html">Installation</a> page.</li>
</ul>
<p>
Common changes to config.ini include:
</p>
<ul>
<li><p>Filename changes:</p>
<pre>
examples/fancy/index.html.tmpl => themes/classic_fancy/index.html.tmpl
examples/atom.xml.tmpl => themes/common/atom.xml.xslt
examples/rss20.xml.tmpl => themes/common/rss20.xml.tmpl
examples/rss10.xml.tmpl => themes/common/rss10.xml.tmpl
examples/opml.xml.tmpl => themes/common/opml.xml.xslt
examples/foafroll.xml.tmpl => themes/common/foafroll.xml.xslt
</pre></li>
</ul>
</body>
</html>

docs/normalization.html Normal file

@@ -0,0 +1,92 @@
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"
"http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript" src="docs.js"></script>
<link rel="stylesheet" type="text/css" href="docs.css"/>
<title>Venus Normalization</title>
</head>
<body>
<h2>Normalization</h2>
<p>Venus builds on, and extends, the <a
href="http://www.feedparser.org/">Universal Feed Parser</a> and <a
href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a> to
convert all feeds into Atom 1.0, with well formed XHTML, and encoded as UTF-8,
meaning that you don't have to worry about funky feeds, tag soup, or character
encoding.</p>
<h3>Encoding</h3>
<p>Input data in feeds may be encoded in a variety of formats, most commonly
ASCII, ISO-8859-1, Windows-1252, and UTF-8. Additionally, many feeds make use of
the wide range of
<a href="http://www.w3.org/TR/html401/sgml/entities.html">character entity
references</a> provided by HTML. Each is converted to UTF-8, an encoding
which is a proper superset of ASCII, supports the entire range of Unicode
characters, and is one of
<a href="http://www.w3.org/TR/2006/REC-xml-20060816/#charsets">only two</a>
encodings required to be supported by all conformant XML processors.</p>
<p>Encoding problems are one of the more common feed errors, and every
attempt is made to correct common errors, such as the inclusion of
the so-called
<a href="http://www.fourmilab.ch/webtools/demoroniser/">moronic</a> versions
of smart-quotes. In rare cases where individual characters cannot be
converted to valid UTF-8 or into
<a href="http://www.w3.org/TR/xml/#charsets">characters allowed in XML 1.0
documents</a>, such characters will be replaced with the Unicode
<a href="http://www.fileformat.info/info/unicode/char/fffd/index.htm">Replacement character</a>, with a title that describes the original character whenever possible.</p>
<p>In order to support the widest range of inputs, use of Python 2.3 or later,
as well as installation of the Python <code>iconvcodec</code> module, is
recommended.</p>
<h3>HTML</h3>
<p>A number of different normalizations of HTML are performed. For starters,
the HTML is
<a href="http://www.feedparser.org/docs/html-sanitization.html">sanitized</a>,
meaning that HTML tags and attributes that could introduce javascript or
other security risks are removed.</p>
<p>Then,
<a href="http://www.feedparser.org/docs/resolving-relative-links.html">relative
links are resolved</a> within the HTML. The same resolution is applied to
links elsewhere in the feed.</p>
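The resolution itself is essentially `urljoin` against the feed's base URI. A rough sketch follows; the attribute-scanning regex is only illustrative (a real implementation walks the parse tree), and the function name is an assumption:

```python
import re
from urllib.parse import urljoin  # urlparse.urljoin on Python 2

def resolve_links(fragment, base):
    """Rewrite href/src attribute values relative to a base URI.
    A regex is enough to show the idea on well-formed input."""
    return re.sub(r'((?:href|src)=")([^"]*)',
                  lambda m: m.group(1) + urljoin(base, m.group(2)),
                  fragment)
```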
<p>Finally, unmatched tags are closed. This is done with a
<a href="http://www.crummy.com/software/BeautifulSoup/documentation.html#Parsing%20HTML">knowledge of the semantics of HTML</a>. Additionally, a
<a href="http://golem.ph.utexas.edu/~distler/blog/archives/000165.html#sanitizespec">large
subset of MathML</a>, as well as a
<a href="http://www.w3.org/TR/SVGMobile/">tiny profile of SVG</a>
is also supported.</p>
<h3>Atom 1.0</h3>
<p>The Universal Feed Parser also
<a href="http://www.feedparser.org/docs/content-normalization.html">normalizes the content of feeds</a>. This involves a
<a href="http://www.feedparser.org/docs/reference.html">large number of elements</a>; the best place to start is to look at
<a href="http://www.feedparser.org/docs/annotated-examples.html">annotated examples</a>. Among other things a wide variety of
<a href="http://www.feedparser.org/docs/date-parsing.html">date formats</a>
are converted into
<a href="http://www.ietf.org/rfc/rfc3339.txt">RFC 3339</a> formatted dates.</p>
<p>If no <a href="http://www.feedparser.org/docs/reference-entry-id.html">ids</a> are found in entries, attempts are made to synthesize one using (in order):</p>
<ul>
<li><a href="http://www.feedparser.org/docs/reference-entry-link.html">link</a></li>
<li><a href="http://www.feedparser.org/docs/reference-entry-title.html">title</a></li>
<li><a href="http://www.feedparser.org/docs/reference-entry-summary.html">summary</a></li>
<li><a href="http://www.feedparser.org/docs/reference-entry-content.html">content</a></li>
</ul>
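That fallback order can be sketched as below. This is a hypothetical helper, not the project's actual id-synthesis code, and the `urn:planet:` prefix is invented for the example:

```python
import hashlib

def synthesize_id(entry):
    """Pick the first available of link, title, summary, content
    and hash it into a stable synthetic id (illustrative only)."""
    for key in ('link', 'title', 'summary', 'content'):
        value = entry.get(key)
        if value:
            # Hashing gives a stable id for the same input value.
            digest = hashlib.md5(value.encode('utf-8')).hexdigest()
            return 'urn:planet:' + digest
    return None
```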
<p>If no <a href="http://www.feedparser.org/docs/reference-feed-updated.html">updated</a>
dates are found in an entry, or if the dates found
are in the future, the current time is substituted.</p>
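In other words (a minimal sketch with an assumed helper name; `time.struct_time` values compare element-wise, so the future check is a plain comparison):

```python
import time

def effective_updated(updated_parsed, now=None):
    """Return the entry's updated date, substituting the current
    time when it is missing or lies in the future."""
    if now is None:
        now = time.gmtime()
    if updated_parsed is None or updated_parsed > now:
        return now
    return updated_parsed
```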
<h3 id="overrides">Overrides</h3>
<p>All of the above describes what Venus does automatically, either directly
or through its dependencies. There are a number of errors which can not
be corrected automatically, and for these, there are configuration parameters
that can be used to help.</p>
<ul>
<li><code>ignore_in_feed</code> allows you to list any number of elements
or attributes which are to be ignored in feeds. This is often handy in the
case of feeds where the <code>id</code>, <code>updated</code> or
<code>xml:lang</code> values can't be trusted.</li>
<li><code>title_type</code>, <code>summary_type</code>,
<code>content_type</code> allow you to override the
<a href="http://www.feedparser.org/docs/reference-entry-title_detail.html#reference.entry.title_detail.type"><code>type</code></a>
attributes on these elements.</li>
<li><code>name_type</code> does something similar for
<a href="http://www.feedparser.org/docs/reference-entry-author_detail.html#reference.entry.author_detail.name">author names</a></li>
</ul>
</body>
</html>

docs/templates.html

@ -0,0 +1,129 @@
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"
"http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript" src="docs.js"></script>
<link rel="stylesheet" type="text/css" href="docs.css"/>
<title>Venus Templates</title>
</head>
<body>
<h2>Templates</h2>
<p>Template names take the form
<em>name</em><code>.</code><em>ext</em><code>.</code><em>type</em>, where
<em>name</em><code>.</code><em>ext</em> identifies the name of the output file
to be created in the <code>output_directory</code>, and <em>type</em>
indicates which language processor to use for the template.</p>
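So, for example, a template named index.html.tmpl produces index.html using the htmltmpl processor. Splitting on the final dot is all it takes (illustrative helper, not Venus's actual code):

```python
def split_template_name(filename):
    """'index.html.tmpl' -> ('index.html', 'tmpl'): the output
    file name and the language-processor extension."""
    output, _, processor = filename.rpartition('.')
    return output, processor
```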
<p>As with <a href="filter.html">filters</a>, templates may be written
in a variety of languages and are based on the standard Unix pipe convention
of producing <code>stdout</code> from <code>stdin</code>, but in practice
two languages are used more than others:</p>
<h3>htmltmpl</h3>
<p>Many find <a href="http://htmltmpl.sourceforge.net/">htmltmpl</a>
easier to get started with as you can take a simple example of your
output file, sprinkle in a few <code>&lt;TMPL_VAR&gt;</code>s and
<code>&lt;TMPL_LOOP&gt;</code>s and you are done. Eventually, however,
you may find that your template involves <code>&lt;TMPL_IF&gt;</code>
blocks inside of attribute values, and you may find the result difficult
to read and create correctly.</p>
<p>It is also important to note that htmltmpl based templates do not
have access to the full set of information available in the feed, just
the following (rather substantial) subset:</p>
<blockquote>
<table border="1" cellpadding="5" cellspacing="0">
<tr><th>VAR</th><th>type</th><th>source</th></tr>
<tr><td>author</td><td>String</td><td><a href="http://feedparser.org/docs/reference-feed-author.html">author</a></td></tr>
<tr><td>author_name</td><td>String</td><td><a href="http://feedparser.org/docs/reference-feed-author_detail.html#reference.feed.author_detail.name">author_detail.name</a></td></tr>
<tr><td>generator</td><td>String</td><td><a href="http://feedparser.org/docs/reference-feed-generator.html">generator</a></td></tr>
<tr><td>id</td><td>String</td><td><a href="http://feedparser.org/docs/reference-feed-id.html">id</a></td></tr>
<tr><td>icon</td><td>String</td><td><a href="http://feedparser.org/docs/reference-feed-icon.html">icon</a></td></tr>
<tr><td>last_updated_822</td><td>Rfc822</td><td><a href="http://feedparser.org/docs/reference-feed-updated_parsed.html">updated_parsed</a></td></tr>
<tr><td>last_updated_iso</td><td>Rfc3399</td><td><a href="http://feedparser.org/docs/reference-feed-updated_parsed.html">updated_parsed</a></td></tr>
<tr><td>last_updated</td><td>PlanetDate</td><td><a href="http://feedparser.org/docs/reference-feed-updated_parsed.html">updated_parsed</a></td></tr>
<tr><td>link</td><td>String</td><td><a href="http://feedparser.org/docs/reference-feed-link.html">link</a></td></tr>
<tr><td>logo</td><td>String</td><td><a href="http://feedparser.org/docs/reference-feed-logo.html">logo</a></td></tr>
<tr><td>rights</td><td>String</td><td><a href="http://feedparser.org/docs/reference-feed-rights_detail.html#reference.feed.rights_detail.value">rights_detail.value</a></td></tr>
<tr><td>subtitle</td><td>String</td><td><a href="http://feedparser.org/docs/reference-feed-subtitle_detail.html#reference.feed.subtitle_detail.value">subtitle_detail.value</a></td></tr>
<tr><td>title</td><td>String</td><td><a href="http://feedparser.org/docs/reference-feed-title_detail.html#reference.feed.title_detail.value">title_detail.value</a></td></tr>
<tr><td>title_plain</td><td>Plain</td><td><a href="http://feedparser.org/docs/reference-feed-title_detail.html#reference.feed.title_detail.value">title_detail.value</a></td></tr>
<tr><td rowspan="2">url</td><td rowspan="2">String</td><td><a href="http://feedparser.org/docs/reference-feed-links.html#reference.feed.links.href">links[rel='self'].href</a></td></tr>
<tr><td><a href="http://feedparser.org/docs/reference-headers.html">headers['location']</a></td></tr>
</table>
</blockquote>
<p>Note: when multiple sources are listed, the last one wins.</p>
<p>In addition to these variables, Planet Venus makes available two
arrays, <code>Channels</code> and <code>Items</code>, with one entry
per subscription and per output entry respectively. The data values
within the <code>Channels</code> array exactly match the above list.
The data values within the <code>Items</code> array are as follows:</p>
<blockquote>
<table border="1" cellpadding="5" cellspacing="0">
<tr><th>VAR</th><th>type</th><th>source</th></tr>
<tr><td>author</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-author.html">author</a></td></tr>
<tr><td>author_email</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-author_detail.html#reference.entry.author_detail.email">author_detail.email</a></td></tr>
<tr><td>author_name</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-author_detail.html#reference.entry.author_detail.name">author_detail.name</a></td></tr>
<tr><td>author_uri</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-author_detail.html#reference.entry.author_detail.href">author_detail.href</a></td></tr>
<tr><td>content_language</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-content.html#reference.entry.content.language">content[0].language</a></td></tr>
<tr><td rowspan="2">content</td><td rowspan="2">String</td><td><a href="http://feedparser.org/docs/reference-entry-summary_detail.html#reference.entry.summary_detail.value">summary_detail.value</a></td></tr>
<tr><td><a href="http://feedparser.org/docs/reference-entry-content.html#reference.entry.content.value">content[0].value</a></td></tr>
<tr><td rowspan="2">date</td><td rowspan="2">PlanetDate</td><td><a href="http://feedparser.org/docs/reference-entry-published_parsed.html">published_parsed</a></td></tr>
<tr><td><a href="http://feedparser.org/docs/reference-entry-updated_parsed.html">updated_parsed</a></td></tr>
<tr><td rowspan="2">date_822</td><td rowspan="2">Rfc822</td><td><a href="http://feedparser.org/docs/reference-entry-published_parsed.html">published_parsed</a></td></tr>
<tr><td><a href="http://feedparser.org/docs/reference-entry-updated_parsed.html">updated_parsed</a></td></tr>
<tr><td rowspan="2">date_iso</td><td rowspan="2">Rfc3399</td><td><a href="http://feedparser.org/docs/reference-entry-published_parsed.html">published_parsed</a></td></tr>
<tr><td><a href="http://feedparser.org/docs/reference-entry-updated_parsed.html">updated_parsed</a></td></tr>
<tr><td><ins>enclosure_href</ins></td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-enclosures.html#reference.entry.enclosures.href">enclosures[0].href</a></td></tr>
<tr><td><ins>enclosure_length</ins></td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-enclosures.html#reference.entry.enclosures.length">enclosures[0].length</a></td></tr>
<tr><td><ins>enclosure_type</ins></td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-enclosures.html#reference.entry.enclosures.type">enclosures[0].type</a></td></tr>
<tr><td><ins>guid_isPermaLink</ins></td><td>String</td><td><a href="http://blogs.law.harvard.edu/tech/rss#ltguidgtSubelementOfLtitemgt">isPermaLink</a></td></tr>
<tr><td>id</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-id.html">id</a></td></tr>
<tr><td>link</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-links.html#reference.entry.links.href">links[rel='alternate'].href</a></td></tr>
<tr><td>new_channel</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-id.html">id</a></td></tr>
<tr><td rowspan="2">new_date</td><td rowspan="2">NewDate</td><td><a href="http://feedparser.org/docs/reference-entry-published_parsed.html">published_parsed</a></td></tr>
<tr><td><a href="http://feedparser.org/docs/reference-entry-updated_parsed.html">updated_parsed</a></td></tr>
<tr><td>rights</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-rights_detail.html#reference.entry.rights_detail.value">rights_detail.value</a></td></tr>
<tr><td>title_language</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-title_detail.html#reference.entry.title_detail.language">title_detail.language</a></td></tr>
<tr><td>title_plain</td><td>Plain</td><td><a href="http://feedparser.org/docs/reference-entry-title_detail.html#reference.entry.title_detail.value">title_detail.value</a></td></tr>
<tr><td>title</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-title_detail.html#reference.entry.title_detail.value">title_detail.value</a></td></tr>
<tr><td>summary_language</td><td>String</td><td><a href="http://feedparser.org/docs/reference-entry-summary_detail.html#reference.entry.summary_detail.language">summary_detail.language</a></td></tr>
<tr><td>updated</td><td>PlanetDate</td><td><a href="http://feedparser.org/docs/reference-entry-updated_parsed.html">updated_parsed</a></td></tr>
<tr><td>updated_822</td><td>Rfc822</td><td><a href="http://feedparser.org/docs/reference-entry-updated_parsed.html">updated_parsed</a></td></tr>
<tr><td>updated_iso</td><td>Rfc3399</td><td><a href="http://feedparser.org/docs/reference-entry-updated_parsed.html">updated_parsed</a></td></tr>
<tr><td>published</td><td>PlanetDate</td><td><a href="http://feedparser.org/docs/reference-entry-published_parsed.html">published_parsed</a></td></tr>
<tr><td>published_822</td><td>Rfc822</td><td><a href="http://feedparser.org/docs/reference-entry-published_parsed.html">published_parsed</a></td></tr>
<tr><td>published_iso</td><td>Rfc3399</td><td><a href="http://feedparser.org/docs/reference-entry-published_parsed.html">published_parsed</a></td></tr>
</table>
</blockquote>
<p>Note: variables above which start with
<code>new_</code> are only set if their values differ from the previous
Item.</p>
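A sketch of how such new_ flags can be computed while walking the output entries (hypothetical helper, shown with plain dicts rather than the template engine's own data structures):

```python
def mark_new(items, keys=('channel', 'date')):
    """Set new_<key> on an item only when its value differs from
    the previous item's, mirroring the behaviour described above."""
    prev = {}
    for item in items:
        for key in keys:
            if item.get(key) != prev.get(key):
                item['new_' + key] = item.get(key)
        prev = item
    return items
```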
<h3>xslt</h3>
<p><a href="http://www.w3.org/TR/xslt">XSLT</a> is a paradox: it actually
makes some simple things easier to do than htmltmpl, and certainly can
make more difficult things possible; but it is fair to say that many
find XSLT less approachable than htmltmpl.</p>
<p>But in any case, the XSLT support is easier to document as the
input is a <a href="normalization.html">highly normalized</a> feed,
with a few extension elements.</p>
<ul>
<li><code>atom:feed</code> will have the following child elements:
<ul>
<li>A <code>planet:source</code> element per subscription, with the same child elements as <a href="http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.source"><code>atom:source</code></a>, as well as
an additional child element in the planet namespace for each
<a href="config.html#subscription">configuration parameter</a> that applies to
this subscription.</li>
<li><a href="http://www.feedparser.org/docs/reference-version.html"><code>planet:format</code></a> indicating the format and version of the source feed.</li>
<li><a href="http://www.feedparser.org/docs/reference-bozo.html"><code>planet:bozo</code></a> which is either <code>true</code> or <code>false</code>.</li>
</ul>
</li>
<li><code>atom:updated</code> and <code>atom:published</code> will have
a <code>planet:format</code> attribute containing the referenced date
formatted according to the <code>[planet] date_format</code> specified
in the configuration</li>
</ul>
</body>
</html>


@ -0,0 +1,82 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY categoryTerm "WebSemantique">
]>
<!--
This transformation is released under the same licence as Python
see http://www.intertwingly.net/code/venus/LICENCE.
Author: Eric van der Vlist <vdv@dyomedea.com>
This transformation is meant to be used as a filter that determines whether
Atom entries are relevant to a specific topic and adds the corresponding
<category/> element when they are.
This is done by a simple keyword matching mechanism.
To customize this filter to your needs:
1) Replace WebSemantique by your own category name in the definition of
the categoryTerm entity above.
2) Review the "upper" and "lower" variables that are used to convert text
nodes to lower case and to replace common punctuation signs with spaces, and
check that they meet your needs.
3) Define your own list of keywords in <d:keyword/> elements. Note that
leading and trailing spaces are significant: "> rdf <" will match rdf
as an entire word, while ">rdf<" would match the substring "rdf" and
"> rdf<" would match words starting with rdf. Also note that the test is done
after conversion to lowercase.
To use it with venus, just add this filter to the list of filters, for instance:
filters= categories.xslt guess_language.py
-->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:atom="http://www.w3.org/2005/Atom" xmlns="http://www.w3.org/2005/Atom"
xmlns:d="http://ns.websemantique.org/data/" exclude-result-prefixes="d atom" version="1.0">
<xsl:variable name="upper"
>,.;AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZzÀàÁáÂâÃãÄäÅ寿ÇçÈèÉéÊêËëÌìÍíÎîÏïÐðÑñÒòÓóÔôÕõÖöØøÙùÚúÛûÜüÝýÞþ</xsl:variable>
<xsl:variable name="lower"
> aabbccddeeffgghhiijjkkllmmnnooppqqrrssttuuvvwwxxyyzzaaaaaaaaaaaaææcceeeeeeeeiiiiiiiiððnnooooooooooøøuuuuuuuuyyþþ</xsl:variable>
<d:keywords>
<d:keyword> wiki semantique </d:keyword>
<d:keyword> wikis semantiques </d:keyword>
<d:keyword> web semantique </d:keyword>
<d:keyword> websemantique </d:keyword>
<d:keyword> semantic web</d:keyword>
<d:keyword> semweb</d:keyword>
<d:keyword> rdf</d:keyword>
<d:keyword> owl </d:keyword>
<d:keyword> sparql </d:keyword>
<d:keyword> topic map</d:keyword>
<d:keyword> doap </d:keyword>
<d:keyword> foaf </d:keyword>
<d:keyword> sioc </d:keyword>
<d:keyword> ontology </d:keyword>
<d:keyword> ontologie</d:keyword>
<d:keyword> dublin core </d:keyword>
</d:keywords>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="atom:entry/atom:updated">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
<xsl:variable name="concatenatedText">
<xsl:for-each select="../atom:title|../atom:summary|../atom:content|../atom:category/@term">
<xsl:text> </xsl:text>
<xsl:value-of select="translate(., $upper, $lower)"/>
</xsl:for-each>
<xsl:text> </xsl:text>
</xsl:variable>
<xsl:if test="document('')/*/d:keywords/d:keyword[contains($concatenatedText, .)]">
<category term="WebSemantique"/>
</xsl:if>
</xsl:template>
<xsl:template match="atom:category[@term='&categoryTerm;']"/>
</xsl:stylesheet>


@ -0,0 +1,37 @@
This filter is released under the same licence as Python
see http://www.intertwingly.net/code/venus/LICENCE.
Author: Eric van der Vlist <vdv@dyomedea.com>
This filter guesses whether an Atom entry is written
in English or French. It should be trivial to choose between
two other languages, easy to extend to more than two languages
and useful to pass these languages as Venus configuration
parameters.
The code used to guess the language is the one that has been
described by Douglas Bagnall as the Python recipe titled
"Language detection using character trigrams"
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/326576.
To add support for a new language, this language must first be
"learned" using learn-language.py. This learning phase is nothing
more than saving a pickled version of the Trigram object for this
language.
To learn Finnish, you would execute:
$ ./learn-language.py http://gutenberg.net/dirs/1/0/4/9/10492/10492-8.txt fi.data
where http://gutenberg.net/dirs/1/0/4/9/10492/10492-8.txt is a text
representative of the Finnish language and "fi.data" is the name of the
data file for "fi" (ISO code for Finnish).
To install this filter, copy this directory under the Venus
filter directory and declare it in your filters list, for instance:
filters= categories.xslt guess-language/guess-language.py
NOTE: this filter depends on Amara
(http://uche.ogbuji.net/tech/4suite/amara/)

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -0,0 +1,58 @@
#!/usr/bin/env python
"""A filter to guess languages.
This filter guesses whether an Atom entry is written
in English or French. It should be trivial to choose between
two other languages, easy to extend to more than two languages
and useful to pass these languages as Venus configuration
parameters.
(See the README file for more details).
Requires Python 2.1, recommends 2.4.
"""
__authors__ = [ "Eric van der Vlist <vdv@dyomedea.com>"]
__license__ = "Python"
import amara
from sys import stdin, stdout
from trigram import Trigram
from xml.dom import XML_NAMESPACE as XML_NS
import cPickle
ATOM_NSS = {
u'atom': u'http://www.w3.org/2005/Atom',
u'xml': XML_NS
}
langs = {}
def tri(lang):
if not langs.has_key(lang):
f = open('filters/guess-language/%s.data' % lang, 'r')
t = cPickle.load(f)
f.close()
langs[lang] = t
return langs[lang]
def guess_language(entry):
text = u'';
for child in entry.xml_xpath(u'atom:title|atom:summary|atom:content'):
text = text + u' '+ child.__unicode__()
t = Trigram()
t.parseString(text)
if tri('fr') - t > tri('en') - t:
lang=u'en'
else:
lang=u'fr'
entry.xml_set_attribute((u'xml:lang', XML_NS), lang)
def main():
feed = amara.parse(stdin, prefixes=ATOM_NSS)
for entry in feed.xml_xpath(u'//atom:entry[not(@xml:lang)]'):
guess_language(entry)
feed.xml(stdout)
if __name__ == '__main__':
main()


@ -0,0 +1,25 @@
#!/usr/bin/env python
"""A filter to guess languages.
This utility saves a Trigram object on file.
(See the README file for more details).
Requires Python 2.1, recommends 2.4.
"""
__authors__ = [ "Eric van der Vlist <vdv@dyomedea.com>"]
__license__ = "Python"
from trigram import Trigram
from sys import argv
from cPickle import dump
def main():
tri = Trigram(argv[1])
out = open(argv[2], 'w')
dump(tri, out)
out.close()
if __name__ == '__main__':
main()


@ -0,0 +1,188 @@
#!/usr/bin/python
# -*- coding: UTF-8 -*-
"""
This class is based on the Python recipe titled
"Language detection using character trigrams"
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/326576
by Douglas Bagnall.
It has been (slightly) adapted by Eric van der Vlist to support
Unicode and accept a method to parse strings.
"""
__authors__ = [ "Douglas Bagnall", "Eric van der Vlist <vdv@dyomedea.com>"]
__license__ = "Python"
import random
from urllib import urlopen
class Trigram:
"""
From one or more text files, the frequency of three character
sequences is calculated. When treated as a vector, this information
can be compared to other trigrams, and the difference between them
seen as an angle. The cosine of this angle varies between 1 for
complete similarity, and 0 for utter difference. Since letter
combinations are characteristic to a language, this can be used to
determine the language of a body of text. For example:
>>> reference_en = Trigram('/path/to/reference/text/english')
>>> reference_de = Trigram('/path/to/reference/text/german')
>>> unknown = Trigram('url://pointing/to/unknown/text')
>>> unknown.similarity(reference_de)
0.4
>>> unknown.similarity(reference_en)
0.95
would indicate the unknown text is almost certainly English. As
syntax sugar, the minus sign is overloaded to return the difference
between texts, so the above objects would give you:
>>> unknown - reference_de
0.6
>>> reference_en - unknown # order doesn't matter.
0.05
As it stands, the Trigram ignores character set information, which
means you can only accurately compare within a single encoding
(iso-8859-1 in the examples). A more complete implementation might
convert to unicode first.
As an extra bonus, there is a method to make up nonsense words in the
style of the Trigram's text.
>>> reference_en.makeWords(30)
My withillonquiver and ald, by now wittlectionsurper, may sequia,
tory, I ad my notter. Marriusbabilly She lady for rachalle spen
hat knong al elf
Beware when using urls: HTML won't be parsed out.
Most methods chatter away to standard output, to let you know they're
still there.
"""
length = 0
def __init__(self, fn=None):
self.lut = {}
if fn is not None:
self.parseFile(fn)
def _parseAFragment(self, line, pair=' '):
for letter in line:
d = self.lut.setdefault(pair, {})
d[letter] = d.get(letter, 0) + 1
pair = pair[1] + letter
return pair
def parseString(self, string):
self._parseAFragment(string)
self.measure()
def parseFile(self, fn, encoding="iso-8859-1"):
pair = ' '
if '://' in fn:
#print "trying to fetch url, may take time..."
f = urlopen(fn)
else:
f = open(fn)
for z, line in enumerate(f):
#if not z % 1000:
# print "line %s" % z
# \n's are spurious in a prose context
pair = self._parseAFragment(line.strip().decode(encoding) + ' ')
f.close()
self.measure()
def measure(self):
"""calculates the scalar length of the trigram vector and
stores it in self.length."""
total = 0
for y in self.lut.values():
total += sum([ x * x for x in y.values() ])
self.length = total ** 0.5
def similarity(self, other):
"""returns a number between 0 and 1 indicating similarity.
1 means an identical ratio of trigrams;
0 means no trigrams in common.
"""
if not isinstance(other, Trigram):
raise TypeError("can't compare Trigram with non-Trigram")
lut1 = self.lut
lut2 = other.lut
total = 0
for k in lut1.keys():
if k in lut2:
a = lut1[k]
b = lut2[k]
for x in a:
if x in b:
total += a[x] * b[x]
return float(total) / (self.length * other.length)
def __sub__(self, other):
"""indicates difference between trigram sets; 1 is entirely
different, 0 is entirely the same."""
return 1 - self.similarity(other)
def makeWords(self, count):
"""returns a string of made-up words based on the known text."""
text = []
k = ' '
while count:
n = self.likely(k)
text.append(n)
k = k[1] + n
if n in ' \t':
count -= 1
return ''.join(text)
def likely(self, k):
"""Returns a character likely to follow the given string
two character string, or a space if nothing is found."""
if k not in self.lut:
return ' '
        # if you were using this a lot, caching would be a good idea.
letters = []
for k, v in self.lut[k].items():
letters.append(k * v)
letters = ''.join(letters)
return random.choice(letters)
def test():
en = Trigram('http://gutenberg.net/dirs/etext97/lsusn11.txt')
#NB fr and some others have English license text.
# no has english excerpts.
fr = Trigram('http://gutenberg.net/dirs/etext03/candi10.txt')
fi = Trigram('http://gutenberg.net/dirs/1/0/4/9/10492/10492-8.txt')
no = Trigram('http://gutenberg.net/dirs/1/2/8/4/12844/12844-8.txt')
se = Trigram('http://gutenberg.net/dirs/1/0/1/1/10117/10117-8.txt')
no2 = Trigram('http://gutenberg.net/dirs/1/3/0/4/13041/13041-8.txt')
en2 = Trigram('http://gutenberg.net/dirs/etext05/cfgsh10.txt')
fr2 = Trigram('http://gutenberg.net/dirs/1/3/7/0/13704/13704-8.txt')
print "calculating difference:"
print "en - fr is %s" % (en - fr)
print "fr - en is %s" % (fr - en)
print "en - en2 is %s" % (en - en2)
print "en - fr2 is %s" % (en - fr2)
print "fr - en2 is %s" % (fr - en2)
print "fr - fr2 is %s" % (fr - fr2)
print "fr2 - en2 is %s" % (fr2 - en2)
print "fi - fr is %s" % (fi - fr)
print "fi - en is %s" % (fi - en)
print "fi - se is %s" % (fi - se)
print "no - se is %s" % (no - se)
print "en - no is %s" % (en - no)
print "no - no2 is %s" % (no - no2)
print "se - no2 is %s" % (se - no2)
print "en - no2 is %s" % (en - no2)
print "fr - no2 is %s" % (fr - no2)
if __name__ == '__main__':
test()
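The heart of the class above, cosine similarity over trigram counts, fits in a few self-contained lines of modern Python, which may make the geometry easier to see (function names here are invented for the sketch):

```python
from collections import Counter

def trigrams(text):
    """Frequency of every three-character window in the text."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

def similarity(a, b):
    """Cosine of the angle between two trigram frequency vectors:
    1.0 for an identical ratio of trigrams, 0.0 for none in common."""
    ta, tb = trigrams(a), trigrams(b)
    dot = sum(ta[k] * tb[k] for k in ta)
    norm = (sum(v * v for v in ta.values())
            * sum(v * v for v in tb.values())) ** 0.5
    return dot / norm if norm else 0.0
```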


@@ -20,6 +20,7 @@ if __name__ == "__main__":
     config_file = "config.ini"
     offline = 0
     verbose = 0
+    only_if_new = 0

     for arg in sys.argv[1:]:
         if arg == "-h" or arg == "--help":
@@ -29,12 +30,15 @@ if __name__ == "__main__":
             print " -v, --verbose     DEBUG level logging during update"
             print " -o, --offline     Update the Planet from the cache only"
             print " -h, --help        Display this help message and exit"
+            print " -n, --only-if-new Only spider new feeds"
             print
             sys.exit(0)
         elif arg == "-v" or arg == "--verbose":
             verbose = 1
         elif arg == "-o" or arg == "--offline":
             offline = 1
+        elif arg == "-n" or arg == "--only-if-new":
+            only_if_new = 1
         elif arg.startswith("-"):
             print >>sys.stderr, "Unknown option:", arg
             sys.exit(1)
@@ -46,11 +50,11 @@ if __name__ == "__main__":
     if verbose:
         import planet
-        planet.getLogger('DEBUG')
+        planet.getLogger('DEBUG',config.log_format())

     if not offline:
         from planet import spider
-        spider.spiderPlanet()
+        spider.spiderPlanet(only_if_new=only_if_new)
     from planet import splice
     doc = splice.splice()


@@ -9,7 +9,7 @@ config.__init__()
 from ConfigParser import ConfigParser
 from urlparse import urljoin

-def getLogger(level):
+def getLogger(level, format):
     """ get a logger with the specified log level """
     global logger

     if logger: return logger
@@ -19,7 +19,7 @@ def getLogger(level):
     except:
         import compat_logging as logging

-    logging.basicConfig()
+    logging.basicConfig(format=format)
     logging.getLogger().setLevel(logging.getLevelName(level))
     logger = logging.getLogger("planet.runner")
     try:


@@ -1090,7 +1090,7 @@ Logger.manager = Manager(Logger.root)
 BASIC_FORMAT = "%(levelname)s:%(name)s:%(message)s"

-def basicConfig():
+def basicConfig(format=BASIC_FORMAT):
     """
     Do basic configuration for the logging system by creating a
     StreamHandler with a default Formatter and adding it to the
@@ -1098,7 +1098,7 @@ def basicConfig():
     """
     if len(root.handlers) == 0:
         hdlr = StreamHandler()
-        fmt = Formatter(BASIC_FORMAT)
+        fmt = Formatter(format)
         hdlr.setFormatter(fmt)
         root.addHandler(hdlr)


@ -32,7 +32,7 @@ from urlparse import urljoin
parser = ConfigParser() parser = ConfigParser()
planet_predefined_options = [] planet_predefined_options = ['filters']
def __init__(): def __init__():
"""define the struture of an ini file""" """define the struture of an ini file"""
@ -43,6 +43,8 @@ def __init__():
if section and parser.has_option(section, option): if section and parser.has_option(section, option):
return parser.get(section, option) return parser.get(section, option)
elif parser.has_option('Planet', option): elif parser.has_option('Planet', option):
if option == 'log_format':
return parser.get('Planet', option, raw=True)
return parser.get('Planet', option) return parser.get('Planet', option)
else: else:
return default return default
@@ -69,8 +71,8 @@ def __init__():
         planet_predefined_options.append(name)
     # define a list planet-level variable
-    def define_planet_list(name):
-        setattr(config, name, lambda : expand(get(None,name,'')))
+    def define_planet_list(name, default=''):
+        setattr(config, name, lambda : expand(get(None,name,default)))
         planet_predefined_options.append(name)
     # define a string template-level variable
@@ -88,6 +90,7 @@ def __init__():
     define_planet('link', '')
     define_planet('cache_directory', "cache")
     define_planet('log_level', "WARNING")
+    define_planet('log_format', "%(levelname)s:%(name)s:%(message)s")
     define_planet('feed_timeout', 20)
     define_planet('date_format', "%B %d, %Y %I:%M %p")
     define_planet('new_date_format', "%B %d, %Y")
@@ -100,7 +103,7 @@ def __init__():
     define_planet_list('template_files')
     define_planet_list('bill_of_materials')
-    define_planet_list('template_directories')
+    define_planet_list('template_directories', '.')
     define_planet_list('filter_directories')
     # template options
@@ -123,7 +126,7 @@ def load(config_file):
     import config, planet
     from planet import opml, foaf
-    log = planet.getLogger(config.log_level())
+    log = planet.getLogger(config.log_level(),config.log_format())
     # Theme support
     theme = config.output_theme()
@@ -146,10 +149,11 @@ def load(config_file):
         # complete search list for theme directories
         dirs += [os.path.join(theme_dir,dir) for dir in
-            config.template_directories()]
+            config.template_directories() if dir not in dirs]
         # merge configurations, allowing current one to override theme
         template_files = config.template_files()
+        parser.set('Planet','template_files','')
         parser.read(config_file)
     for file in config.bill_of_materials():
         if not file in bom: bom.append(file)
@@ -178,6 +182,12 @@ def load(config_file):
             opml.opml2config(data, cached_config)
         elif content_type(list).find('foaf')>=0:
             foaf.foaf2config(data, cached_config)
+        else:
+            from planet import shell
+            import StringIO
+            cached_config.readfp(StringIO.StringIO(shell.run(
+                content_type(list), data.getvalue(), mode="filter")))
         if cached_config.sections() in [[], [list]]:
             raise Exception
@@ -314,7 +324,7 @@ def reading_lists():
     for section in parser.sections():
         if parser.has_option(section, 'content_type'):
             type = parser.get(section, 'content_type')
-            if type.find('opml')>=0 or type.find('foaf')>=0:
+            if type.find('opml')>=0 or type.find('foaf')>=0 or type.find('.')>=0:
                 result.append(section)
     return result
@@ -328,7 +338,8 @@ def filters(section=None):
 def planet_options():
     """ dictionary of planet wide options"""
-    return dict(map(lambda opt: (opt, parser.get('Planet',opt)),
+    return dict(map(lambda opt: (opt,
+        parser.get('Planet', opt, raw=(opt=="log_format"))),
         parser.options('Planet')))
 def feed_options(section):

View File

@@ -11,7 +11,7 @@ Recommended: Python 2.3 or later
 Recommended: CJKCodecs and iconv_codec <http://cjkpython.i18n.org/>
 """
-__version__ = "4.2-pre-" + "$Revision: 1.142 $"[11:16] + "-cvs"
+__version__ = "4.2-pre-" + "$Revision: 1.144 $"[11:16] + "-cvs"
 __license__ = """Copyright (c) 2002-2006, Mark Pilgrim, All rights reserved.
 Redistribution and use in source and binary forms, with or without modification,
@@ -218,6 +218,9 @@ class FeedParserDict(UserDict):
     def __getitem__(self, key):
         if key == 'category':
             return UserDict.__getitem__(self, 'tags')[0]['term']
+        if key == 'enclosures':
+            norel = lambda link: FeedParserDict([(name,value) for (name,value) in link.items() if name!='rel'])
+            return [norel(link) for link in UserDict.__getitem__(self, 'links') if link['rel']=='enclosure']
         if key == 'categories':
             return [(tag['scheme'], tag['term']) for tag in UserDict.__getitem__(self, 'tags')]
         realkey = self.keymap.get(key, key)
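The new `enclosures` accessor no longer stores enclosures separately: it derives them on the fly from the `links` list, keeping every link whose `rel` is `enclosure` and dropping the `rel` key itself. A plain-dict sketch of the same transformation (the URLs are made-up examples):

```python
# Links as feedparser would collect them for one entry.
links = [
    {'rel': 'alternate', 'type': 'text/html',
     'href': 'http://example.com/post'},
    {'rel': 'enclosure', 'type': 'audio/mpeg',
     'href': 'http://example.com/a.mp3', 'length': '123'},
]

# Keep only enclosure links, minus the now-redundant 'rel' key.
norel = lambda link: {name: value for name, value in link.items()
                      if name != 'rel'}
enclosures = [norel(link) for link in links if link['rel'] == 'enclosure']
```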
@@ -1303,15 +1306,15 @@ class _FeedParserMixin:
             attrsD.setdefault('type', 'application/atom+xml')
         else:
             attrsD.setdefault('type', 'text/html')
-        context = self._getContext()
         attrsD = self._itsAnHrefDamnIt(attrsD)
         if attrsD.has_key('href'):
             attrsD['href'] = self.resolveURI(attrsD['href'])
-        if attrsD.get('rel')=='enclosure' and not context.get('id'):
-            context['id'] = attrsD.get('href')
         expectingText = self.infeed or self.inentry or self.insource
+        context = self._getContext()
         context.setdefault('links', [])
         context['links'].append(FeedParserDict(attrsD))
+        if attrsD['rel'] == 'enclosure':
+            self._start_enclosure(attrsD)
         if attrsD.has_key('href'):
             expectingText = 0
         if (attrsD.get('rel') == 'alternate') and (self.mapContentType(attrsD.get('type')) in self.html_types):
@@ -1357,6 +1360,7 @@ class _FeedParserMixin:
             self._start_content(attrsD)
         else:
             self.pushContent('description', attrsD, 'text/html', self.infeed or self.inentry or self.insource)
+    _start_dc_description = _start_description
     def _start_abstract(self, attrsD):
         self.pushContent('description', attrsD, 'text/plain', self.infeed or self.inentry or self.insource)
@@ -1368,6 +1372,7 @@ class _FeedParserMixin:
         value = self.popContent('description')
         self._summaryKey = None
     _end_abstract = _end_description
+    _end_dc_description = _end_description
     def _start_info(self, attrsD):
         self.pushContent('info', attrsD, 'text/plain', 1)
@@ -1427,7 +1432,8 @@ class _FeedParserMixin:
     def _start_enclosure(self, attrsD):
         attrsD = self._itsAnHrefDamnIt(attrsD)
         context = self._getContext()
-        context.setdefault('enclosures', []).append(FeedParserDict(attrsD))
+        attrsD['rel']='enclosure'
+        context.setdefault('links', []).append(FeedParserDict(attrsD))
         href = attrsD.get('href')
         if href and not context.get('id'):
             context['id'] = href

planet/idindex.py (new file, 97 lines)
View File

@@ -0,0 +1,97 @@
from glob import glob
import os, sys, dbhash

if __name__ == '__main__':
    rootdir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    sys.path.insert(0, rootdir)

from planet.spider import filename
from planet import config

def open():
    try:
        cache = config.cache_directory()
        index=os.path.join(cache,'index')
        if not os.path.exists(index): return None
        return dbhash.open(filename(index, 'id'),'w')
    except Exception, e:
        if e.__class__.__name__ == 'DBError': e = e.args[-1]
        from planet import logger as log
        log.error(str(e))

def destroy():
    from planet import logger as log
    cache = config.cache_directory()
    index=os.path.join(cache,'index')
    if not os.path.exists(index): return None
    idindex = filename(index, 'id')
    if os.path.exists(idindex): os.unlink(idindex)
    os.removedirs(index)
    log.info(idindex + " deleted")

def create():
    from planet import logger as log
    cache = config.cache_directory()
    index=os.path.join(cache,'index')
    if not os.path.exists(index): os.makedirs(index)
    index = dbhash.open(filename(index, 'id'),'c')

    try:
        import libxml2
    except:
        libxml2 = False
        from xml.dom import minidom

    for file in glob(cache+"/*"):
        if os.path.isdir(file):
            continue
        elif libxml2:
            try:
                doc = libxml2.parseFile(file)
                ctxt = doc.xpathNewContext()
                ctxt.xpathRegisterNs('atom','http://www.w3.org/2005/Atom')
                entry = ctxt.xpathEval('/atom:entry/atom:id')
                source = ctxt.xpathEval('/atom:entry/atom:source/atom:id')
                if entry and source:
                    index[filename('',entry[0].content)] = source[0].content
                doc.freeDoc()
            except:
                log.error(file)
        else:
            try:
                doc = minidom.parse(file)
                doc.normalize()
                ids = doc.getElementsByTagName('id')
                entry = [e for e in ids if e.parentNode.nodeName == 'entry']
                source = [e for e in ids if e.parentNode.nodeName == 'source']
                if entry and source:
                    index[filename('',entry[0].childNodes[0].nodeValue)] = \
                        source[0].childNodes[0].nodeValue
                doc.freeDoc()
            except:
                log.error(file)

    log.info(str(len(index.keys())) + " entries indexed")
    index.close()

    return open()

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print 'Usage: %s [-c|-d]' % sys.argv[0]
        sys.exit(1)

    config.load(sys.argv[1])

    if len(sys.argv) > 2 and sys.argv[2] == '-c':
        create()
    elif len(sys.argv) > 2 and sys.argv[2] == '-d':
        destroy()
    else:
        from planet import logger as log
        index = open()
        if index:
            log.info(str(len(index.keys())) + " entries indexed")
            index.close()
        else:
            log.info("no entries indexed")
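The minidom fallback branch in `create()` is the portable core of the indexer: pull the entry id and the parent feed's id out of one cached Atom entry. A Python 3 sketch of just that extraction, run on a made-up example document:

```python
from xml.dom import minidom

# A minimal cached Atom entry of the kind the spider writes out;
# the ids here are invented for illustration.
xml = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://example.com/blog/1</id>
  <source>
    <id>http://example.com/blog/</id>
  </source>
</entry>"""

doc = minidom.parseString(xml)
doc.normalize()
ids = doc.getElementsByTagName('id')

# Distinguish the entry's own id from the feed-level id by parent element,
# exactly as idindex.create() does.
entry = [e for e in ids if e.parentNode.nodeName == 'entry']
source = [e for e in ids if e.parentNode.nodeName == 'source']

entry_id = entry[0].childNodes[0].nodeValue
source_id = source[0].childNodes[0].nodeValue
```

The index then maps the cache filename of `entry_id` to `source_id`, so splice() can later tell which subscription an entry file belongs to.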

View File

@@ -48,6 +48,10 @@ class OpmlParser(ContentHandler,SGMLParser):
         # this is an entry in a subscription list, but some leave this
         # attribute off, and others have placed 'atom' in here
         if attrs.has_key('type'):
+            if attrs['type'] == 'link' and not attrs.has_key('url'):
+                # Auto-correct WordPress link manager OPML files
+                attrs = dict(attrs.items())
+                attrs['type'] = 'rss'
             if attrs['type'].lower() not in['rss','atom']: return
         # The feed itself is supposed to be in an attribute named 'xmlUrl'

View File

@@ -25,7 +25,11 @@ illegal_xml_chars = re.compile("[\x01-\x08\x0B\x0C\x0E-\x1F]")
 def createTextElement(parent, name, value):
     """ utility function to create a child element with the specified text"""
     if not value: return
-    if isinstance(value,str): value=value.decode('utf-8')
+    if isinstance(value,str):
+        try:
+            value=value.decode('utf-8')
+        except:
+            value=value.decode('iso-8859-1')
     xdoc = parent.ownerDocument
     xelement = xdoc.createElement(name)
     xelement.appendChild(xdoc.createTextNode(value))
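The new decode path tries UTF-8 first and falls back to ISO-8859-1, which accepts every possible byte sequence, so mislabelled Latin-1 feeds no longer crash `createTextElement`. A Python 3 sketch of the same policy on raw bytes:

```python
def decode_lenient(raw):
    # Mirror of the patched createTextElement logic: prefer UTF-8,
    # fall back to ISO-8859-1 (which can never fail to decode).
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        return raw.decode('iso-8859-1')

utf8_text = 'caf\xe9'.encode('utf-8')       # b'caf\xc3\xa9'
latin1_text = 'caf\xe9'.encode('iso-8859-1') # b'caf\xe9', invalid UTF-8
```

The trade-off is that genuinely broken UTF-8 is silently reinterpreted as Latin-1 rather than rejected, which is the right call for an aggregator.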
@@ -100,6 +104,8 @@ def links(xentry, entry):
         xlink.setAttribute('type', link.get('type'))
         if link.has_key('rel'):
             xlink.setAttribute('rel', link.get('rel',None))
+        if link.has_key('length'):
+            xlink.setAttribute('length', link.get('length'))
         xentry.appendChild(xlink)
 def date(xentry, name, parsed):
@@ -157,7 +163,7 @@ def content(xentry, name, detail, bozo):
         xcontent.setAttribute('type', 'html')
         xcontent.appendChild(xdoc.createTextNode(detail.value.decode('utf-8')))
-    if detail.language:
+    if detail.get("language"):
         xcontent.setAttribute('xml:lang', detail.language)
     xentry.appendChild(xcontent)
@@ -170,13 +176,13 @@ def source(xsource, source, bozo, format):
     createTextElement(xsource, 'icon', source.get('icon', None))
     createTextElement(xsource, 'logo', source.get('logo', None))
+    if not source.has_key('logo') and source.has_key('image'):
+        createTextElement(xsource, 'logo', source.image.get('href',None))
     for tag in source.get('tags',[]):
         category(xsource, tag)
-    author_detail = source.get('author_detail',{})
-    if not author_detail.has_key('name') and source.has_key('planet_name'):
-        author_detail['name'] = source['planet_name']
-    author(xsource, 'author', author_detail)
+    author(xsource, 'author', source.get('author_detail',{}))
     for contributor in source.get('contributors',[]):
         author(xsource, 'contributor', contributor)
@@ -204,6 +210,8 @@ def reconstitute(feed, entry):
     if entry.has_key('language'):
         xentry.setAttribute('xml:lang', entry.language)
+    elif feed.feed.has_key('language'):
+        xentry.setAttribute('xml:lang', feed.feed.language)
     id(xentry, entry)
     links(xentry, entry)
@@ -217,18 +225,46 @@ def reconstitute(feed, entry):
     content(xentry, 'content', entry.get('content',[None])[0], bozo)
     content(xentry, 'rights', entry.get('rights_detail',None), bozo)
-    date(xentry, 'updated', entry.get('updated_parsed',time.gmtime()))
+    date(xentry, 'updated', entry_updated(feed.feed, entry, time.gmtime()))
     date(xentry, 'published', entry.get('published_parsed',None))
     for tag in entry.get('tags',[]):
         category(xentry, tag)
-    author(xentry, 'author', entry.get('author_detail',None))
+    # known, simple text extensions
+    for ns,name in [('feedburner','origlink')]:
+        if entry.has_key('%s_%s' % (ns,name)) and \
+            feed.namespaces.has_key(ns):
+            xoriglink = createTextElement(xentry, '%s:%s' % (ns,name),
+                entry['%s_%s' % (ns,name)])
+            xoriglink.setAttribute('xmlns:%s' % ns, feed.namespaces[ns])
+    author_detail = entry.get('author_detail',{})
+    if author_detail and not author_detail.has_key('name') and \
+        feed.feed.has_key('planet_name'):
+        author_detail['name'] = feed.feed['planet_name']
+    author(xentry, 'author', author_detail)
     for contributor in entry.get('contributors',[]):
         author(xentry, 'contributor', contributor)
     xsource = xdoc.createElement('source')
-    source(xsource, entry.get('source') or feed.feed, bozo, feed.version)
+    src = entry.get('source') or feed.feed
+    src_author = src.get('author_detail',{})
+    if (not author_detail or not author_detail.has_key('name')) and \
+        not src_author.has_key('name') and feed.feed.has_key('planet_name'):
+        if src_author: src_author = src_author.__class__(src_author.copy())
+        src['author_detail'] = src_author
+        src_author['name'] = feed.feed['planet_name']
+    source(xsource, src, bozo, feed.version)
     xentry.appendChild(xsource)

     return xdoc

+def entry_updated(feed, entry, default = None):
+    chks = ((entry, 'updated_parsed'),
+            (entry, 'published_parsed'),
+            (feed, 'updated_parsed'),)
+    for node, field in chks:
+        if node.has_key(field) and node[field]:
+            return node[field]
+    return default
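The `entry_updated` helper replaces a bare `entry.get('updated_parsed', ...)` with a fallback chain: the entry's updated date, then its published date, then the feed-level updated date, then a caller-supplied default. A Python 3 sketch of the same lookup over plain dicts (the sample time tuples are invented):

```python
def entry_updated(feed, entry, default=None):
    # Walk the candidates in priority order and return the first
    # one that is present and truthy.
    chks = ((entry, 'updated_parsed'),
            (entry, 'published_parsed'),
            (feed, 'updated_parsed'))
    for node, field in chks:
        if node.get(field):
            return node[field]
    return default

feed = {'updated_parsed': (2006, 11, 12, 0, 0, 0, 6, 316, 0)}
entry = {'published_parsed': (2006, 11, 10, 0, 0, 0, 4, 314, 0)}
```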

View File

@@ -6,13 +6,21 @@ logged_modes = []
 def run(template_file, doc, mode='template'):
     """ select a template module based on file extension and execute it """
-    log = planet.getLogger(planet.config.log_level())
+    log = planet.getLogger(planet.config.log_level(),planet.config.log_format())
     if mode == 'template':
         dirs = planet.config.template_directories()
     else:
         dirs = planet.config.filter_directories()

+    # parse out "extra" options
+    if template_file.find('?') < 0:
+        extra_options = {}
+    else:
+        import cgi
+        template_file, extra_options = template_file.split('?',1)
+        extra_options = dict(cgi.parse_qsl(extra_options))

     # see if the template can be located
     for template_dir in dirs:
         template_resolved = os.path.join(template_dir, template_file)
@@ -43,6 +51,7 @@ def run(template_file, doc, mode='template'):
     # Execute the shell module
     options = planet.config.template_options(template_file)
+    options.update(extra_options)
     log.debug("Processing %s %s using %s", mode,
         os.path.realpath(template_resolved), module_name)
     if mode == 'filter':
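The "extra options" parsing above lets a filter or template be referenced as `name?key=value&key2=value2`, with the query string turned into per-invocation options that override the config-file ones. A Python 3 sketch (the diff uses the Python 2 `cgi.parse_qsl`; `urllib.parse.parse_qsl` is the modern equivalent, and the filter name here is a made-up example):

```python
from urllib.parse import parse_qsl

template_file = 'excerpt.py?width=500&omit=img'

# Split the query string off the filter name and decode it into a dict,
# mirroring the patched planet.shell.run().
if template_file.find('?') < 0:
    extra_options = {}
else:
    template_file, query = template_file.split('?', 1)
    extra_options = dict(parse_qsl(query))
```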

View File

@@ -97,6 +97,9 @@ Items = [
     ['date_822', Rfc822, 'updated_parsed'],
     ['date_iso', Rfc3399, 'published_parsed'],
     ['date_iso', Rfc3399, 'updated_parsed'],
+    ['enclosure_href', String, 'links', {'rel': 'enclosure'}, 'href'],
+    ['enclosure_length', String, 'links', {'rel': 'enclosure'}, 'length'],
+    ['enclosure_type', String, 'links', {'rel': 'enclosure'}, 'type'],
     ['id', String, 'id'],
     ['link', String, 'links', {'rel': 'alternate'}, 'href'],
     ['new_channel', String, 'id'],
@@ -190,6 +193,13 @@ def template_info(source):
     for entry in data.entries:
         output['Items'].append(tmpl_mapper(entry, Items))

+    # synthesize isPermaLink attribute
+    for item in output['Items']:
+        if item.get('id') == item.get('link'):
+            item['guid_isPermaLink']='true'
+        else:
+            item['guid_isPermaLink']='false'

     # feed level information
     output['generator'] = config.generator_uri()
     output['name'] = config.name()
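The synthesized `guid_isPermaLink` mirrors RSS semantics: `isPermaLink="true"` on a `guid` means the guid doubles as the item's URL, which holds exactly when the entry's id equals its alternate link. A small sketch over plain item dicts (the ids and URLs are invented):

```python
items = [
    # Atom id that is also the permalink.
    {'id': 'http://example.com/post/1', 'link': 'http://example.com/post/1'},
    # Tag URI id: a valid guid, but not a fetchable URL.
    {'id': 'tag:example.com,2006:post-2', 'link': 'http://example.com/post/2'},
]

# Same rule as the patched template_info(): permalink iff id == link.
for item in items:
    if item.get('id') == item.get('link'):
        item['guid_isPermaLink'] = 'true'
    else:
        item['guid_isPermaLink'] = 'false'
```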

View File

@@ -1,5 +1,19 @@
 import os

+def quote(string, apos):
+    """ quote a string so that it can be passed as a parameter """
+    if type(string) == unicode:
+        string=string.encode('utf-8')
+    if apos.startswith("\\"): string=string.replace('\\','\\\\')
+
+    if string.find("'")<0:
+        return "'" + string + "'"
+    elif string.find('"')<0:
+        return '"' + string + '"'
+    else:
+        # unclear how to quote strings with both types of quotes for libxslt
+        return "'" + string.replace("'",apos) + "'"
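The quoting rule is: wrap in single quotes if the value has none, otherwise in double quotes, and if both kinds appear, replace apostrophes with a caller-chosen substitute (a typographic apostrophe for libxslt, an escaped one for the shell). A Python 3 sketch of the same helper:

```python
def quote(string, apos):
    # Wrap the value in whichever quote character it does not contain;
    # if it contains both, substitute apostrophes with `apos`.
    if apos.startswith("\\"):
        string = string.replace('\\', '\\\\')
    if "'" not in string:
        return "'" + string + "'"
    elif '"' not in string:
        return '"' + string + '"'
    else:
        return "'" + string.replace("'", apos) + "'"
```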
 def run(script, doc, output_file=None, options={}):
     """ process an XSLT stylesheet """
@@ -12,6 +26,22 @@ def run(script, doc, output_file=None, options={}):
     except:
         # otherwise, use the command line interface
         dom = None

+    # do it
+    result = None
+    if dom:
+        styledoc = libxml2.parseFile(script)
+        style = libxslt.parseStylesheetDoc(styledoc)
+        for key in options.keys():
+            options[key] = quote(options[key], apos="\xe2\x80\x99")
+        output = style.applyStylesheet(dom, options)
+        if output_file:
+            style.saveResultToFilename(output_file, output, 0)
+        else:
+            result = str(output)
+        style.freeStylesheet()
+        output.freeDoc()
+    elif output_file:
         import warnings
         if hasattr(warnings, 'simplefilter'):
             warnings.simplefilter('ignore', RuntimeWarning)
@@ -20,16 +50,28 @@ def run(script, doc, output_file=None, options={}):
         file.write(doc)
         file.close()

-    # do it
-    if dom:
-        styledoc = libxml2.parseFile(script)
-        style = libxslt.parseStylesheetDoc(styledoc)
-        result = style.applyStylesheet(dom, None)
-        style.saveResultToFilename(output_file, result, 0)
-        style.freeStylesheet()
-        result.freeDoc()
-    else:
-        os.system('xsltproc %s %s > %s' % (script, docfile, output_file))
+        cmdopts = []
+        for key,value in options.items():
+            cmdopts += ['--stringparam', key, quote(value, apos=r"\'")]
+
+        os.system('xsltproc %s %s %s > %s' %
+            (' '.join(cmdopts), script, docfile, output_file))
+        os.unlink(docfile)
+    else:
+        import sys
+        from subprocess import Popen, PIPE
+        options = sum([['--stringparam', key, value]
+            for key,value in options.items()], [])
+        proc = Popen(['xsltproc'] + options + [script, '-'],
+            stdin=PIPE, stdout=PIPE, stderr=PIPE)
+        result, stderr = proc.communicate(doc)
+        if stderr:
+            import planet
+            planet.logger.error(stderr)

     if dom: dom.freeDoc()
-    if docfile: os.unlink(docfile)
+
+    return result

View File

@@ -11,10 +11,12 @@ import planet, config, feedparser, reconstitute, shell
 # Regular expressions to sanitise cache filenames
 re_url_scheme    = re.compile(r'^\w+:/*(\w+:|www\.)?')
-re_slash         = re.compile(r'[?/:]+')
+re_slash         = re.compile(r'[?/:|]+')
 re_initial_cruft = re.compile(r'^[,.]*')
 re_final_cruft   = re.compile(r'[,.]*$')

+index = True

 def filename(directory, filename):
     """Return a filename suitable for the cache.
@@ -29,6 +31,8 @@ def filename(directory, filename):
             filename=filename.encode('idna')
         except:
             pass
+    if isinstance(filename,unicode):
+        filename=filename.encode('utf-8')
     filename = re_url_scheme.sub("", filename)
     filename = re_slash.sub(",", filename)
     filename = re_initial_cruft.sub("", filename)
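The four regexes turn a feed URL into a safe cache basename: strip the scheme (and a leading `www.`), collapse `?`, `/`, `:`, and now `|` runs into commas, then trim leading and trailing commas and dots. A Python 3 sketch of that pipeline (the helper name `cache_basename` and the sample URLs are made up for illustration):

```python
import re

# Same patterns as spider.py, including the new '|' in re_slash.
re_url_scheme    = re.compile(r'^\w+:/*(\w+:|www\.)?')
re_slash         = re.compile(r'[?/:|]+')
re_initial_cruft = re.compile(r'^[,.]*')
re_final_cruft   = re.compile(r'[,.]*$')

def cache_basename(url):
    # Apply the substitutions in the same order as filename().
    name = re_url_scheme.sub('', url)
    name = re_slash.sub(',', name)
    name = re_initial_cruft.sub('', name)
    name = re_final_cruft.sub('', name)
    return name
```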
@@ -59,10 +63,16 @@ def scrub(feed, data):

     # some data is not trustworthy
     for tag in config.ignore_in_feed(feed).split():
+        if tag.find('lang')>=0: tag='language'
+        if data.feed.has_key(tag): del data.feed[tag]
         for entry in data.entries:
             if entry.has_key(tag): del entry[tag]
             if entry.has_key(tag + "_detail"): del entry[tag + "_detail"]
             if entry.has_key(tag + "_parsed"): del entry[tag + "_parsed"]
+            for key in entry.keys():
+                if not key.endswith('_detail'): continue
+                for detail in entry[key].copy():
+                    if detail == tag: del entry[key][detail]

     # adjust title types
     if config.title_type(feed):
@@ -107,15 +117,22 @@ def scrub(feed, data):
                 source.author_detail['name'] = \
                     str(stripHtml(source.author_detail.name))

-def spiderFeed(feed):
+def spiderFeed(feed, only_if_new=0):
     """ Spider (fetch) a single feed """
     log = planet.logger

     # read cached feed info
     sources = config.cache_sources_directory()
+    if not os.path.exists(sources):
+        os.makedirs(sources, 0700)
     feed_source = filename(sources, feed)
     feed_info = feedparser.parse(feed_source)
-    if feed_info.feed.get('planet_http_status',None) == '410': return
+    if feed_info.feed and only_if_new:
+        log.info("Feed %s already in cache", feed)
+        return
+    if feed_info.feed.get('planet_http_status',None) == '410':
+        log.info("Feed %s gone", feed)
+        return

     # read feed itself
     modified = None
@@ -142,6 +159,10 @@ def spiderFeed(feed):
     # process based on the HTTP status code
     if data.status == 200 and data.has_key("url"):
         data.feed['planet_http_location'] = data.url
+        if feed == data.url:
+            log.info("Updating feed %s", feed)
+        else:
+            log.info("Updating feed %s @ %s", feed, data.url)
     elif data.status == 301 and data.has_key("entries") and len(data.entries)>0:
         log.warning("Feed has moved from <%s> to <%s>", feed, data.url)
         data.feed['planet_http_location'] = data.url
@@ -171,6 +192,7 @@ def spiderFeed(feed):
     if not data.version and feed_info.version:
         data.feed = feed_info.feed
         data.bozo = feed_info.feed.get('planet_bozo','true') == 'true'
+        data.version = feed_info.feed.get('planet_format')
     data.feed['planet_http_status'] = str(data.status)

     # capture etag and last-modified information
@@ -184,18 +206,28 @@ def spiderFeed(feed):
             data.feed['planet_http_last_modified'])

     # capture feed and data from the planet configuration file
-    if not data.feed.has_key('links'): data.feed['links'] = list()
-    for link in data.feed.links:
-        if link.rel == 'self': break
-    else:
-        data.feed.links.append(feedparser.FeedParserDict(
-            {'rel':'self', 'type':'application/atom+xml', 'href':feed}))
+    if data.version:
+        if not data.feed.has_key('links'): data.feed['links'] = list()
+        feedtype = 'application/atom+xml'
+        if data.version.startswith('rss'): feedtype = 'application/rss+xml'
+        if data.version in ['rss090','rss10']: feedtype = 'application/rdf+xml'
+        for link in data.feed.links:
+            if link.rel == 'self':
+                link['type'] = feedtype
+                break
+        else:
+            data.feed.links.append(feedparser.FeedParserDict(
+                {'rel':'self', 'type':feedtype, 'href':feed}))
     for name, value in config.feed_options(feed).items():
         data.feed['planet_'+name] = value

     # perform user configured scrub operations on the data
     scrub(feed, data)

+    from planet import idindex
+    global index
+    if index != None: index = idindex.open()

     # write each entry to the cache
     cache = config.cache_directory()
     for entry in data.entries:
@@ -211,16 +243,20 @@ def spiderFeed(feed):
         mtime = None
         if not entry.has_key('updated_parsed'):
             if entry.has_key('published_parsed'):
-                entry['updated_parsed'] = entry.published_parsed
-        if entry.has_key('updated_parsed'):
-            mtime = calendar.timegm(entry.updated_parsed)
-            if mtime > time.time(): mtime = None
+                entry['updated_parsed'] = entry['published_parsed']
+        if not entry.has_key('updated_parsed'):
+            try:
+                mtime = calendar.timegm(entry.updated_parsed)
+            except:
+                pass
         if not mtime:
             try:
                 mtime = os.stat(cache_file).st_mtime
             except:
-                mtime = time.time()
-        entry['updated_parsed'] = time.gmtime(mtime)
+                if data.feed.has_key('updated_parsed'):
+                    mtime = calendar.timegm(data.feed.updated_parsed)
+        if not mtime or mtime > time.time(): mtime = time.time()
+        entry['updated_parsed'] = time.gmtime(mtime)

         # apply any filters
         xdoc = reconstitute.reconstitute(data, entry)
@@ -228,12 +264,22 @@ def spiderFeed(feed):
         xdoc.unlink()
         for filter in config.filters(feed):
             output = shell.run(filter, output, mode="filter")
-            if not output: return
+            if not output: break
+        if not output: continue

         # write out and timestamp the results
         write(output, cache_file)
         os.utime(cache_file, (mtime, mtime))

+        # optionally index
+        if index != None:
+            feedid = data.feed.get('id', data.feed.get('link',None))
+            if feedid:
+                if type(feedid) == unicode: feedid = feedid.encode('utf-8')
+                index[filename('', entry.id)] = feedid
+
+    if index: index.close()

     # identify inactive feeds
     if config.activity_threshold(feed):
         updated = [entry.updated_parsed for entry in data.entries
@@ -254,6 +300,8 @@ def spiderFeed(feed):
     # report channel level errors
     if data.status == 226:
         if data.feed.has_key('planet_message'): del data.feed['planet_message']
+        if feed_info.feed.has_key('planet_updated'):
+            data.feed['planet_updated'] = feed_info.feed['planet_updated']
     elif data.status == 403:
         data.feed['planet_message'] = "403: forbidden"
     elif data.status == 404:
@@ -275,14 +323,17 @@ def spiderFeed(feed):
     write(xdoc.toxml('utf-8'), filename(sources, feed))
     xdoc.unlink()

-def spiderPlanet():
+def spiderPlanet(only_if_new = False):
     """ Spider (fetch) an entire planet """
-    log = planet.getLogger(config.log_level())
+    log = planet.getLogger(config.log_level(),config.log_format())
     planet.setTimeout(config.feed_timeout())

+    global index
+    index = True

     for feed in config.subscriptions():
         try:
-            spiderFeed(feed)
+            spiderFeed(feed, only_if_new=only_if_new)
         except Exception,e:
             import sys, traceback
             type, value, tb = sys.exc_info()

View File

@@ -4,11 +4,12 @@ from xml.dom import minidom
import planet, config, feedparser, reconstitute, shell
from reconstitute import createTextElement, date
from spider import filename
+from planet import idindex
def splice():
    """ Splice together a planet from a cache of entries """
    import planet
-    log = planet.getLogger(config.log_level())
+    log = planet.getLogger(config.log_level(),config.log_format())
    log.info("Loading cached data")
    cache = config.cache_directory()
@@ -62,9 +63,15 @@ def splice():
    reconstitute.source(xdoc.documentElement, data.feed, None, None)
    feed.appendChild(xdoc.documentElement)
+    index = idindex.open()
    # insert entry information
    items = 0
    for mtime,file in dir:
+        if index:
+            base = file.split('/')[-1]
+            if index.has_key(base) and index[base] not in sub_ids: continue
        try:
            entry=minidom.parse(file)
@@ -83,12 +90,14 @@ def splice():
        except:
            log.error("Error parsing %s", file)
+    if index: index.close()
    return doc
def apply(doc):
    output_dir = config.output_dir()
    if not os.path.exists(output_dir): os.makedirs(output_dir)
-    log = planet.getLogger(config.log_level())
+    log = planet.getLogger(config.log_level(),config.log_format())
    # Go-go-gadget-template
    for template_file in config.template_files():
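The new splice-time check above skips cached entries whose source feed has dropped out of the subscription list. A minimal standalone sketch of that skip logic, with a plain dict standing in for the on-disk id index (the filenames and feed ids here are illustrative, not from the project):

```python
# index maps a cache filename to the id of the feed it came from;
# sub_ids holds the ids of the currently subscribed feeds.
index = {"entry1.xml": "feed-a", "entry2.xml": "feed-b"}
sub_ids = {"feed-b"}

kept = []
for file in ["cache/entry1.xml", "cache/entry2.xml", "cache/entry3.xml"]:
    base = file.split('/')[-1]
    # skip entries whose source feed is known but no longer subscribed
    if base in index and index[base] not in sub_ids:
        continue
    kept.append(file)

print(kept)  # entry1.xml dropped: feed-a is not in sub_ids
```

Entries absent from the index (entry3.xml here) are kept, matching the diff's behavior of only filtering entries the index knows about.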


@@ -23,7 +23,7 @@ modules = map(fullmodname, glob.glob(os.path.join('tests', 'test_*.py')))
# enable warnings
import planet
-planet.getLogger("WARNING")
+planet.getLogger("WARNING",None)
# load all of the tests into a suite
try:
@@ -33,5 +33,11 @@ except Exception, exception:
    for module in modules: __import__(module)
    raise
+verbosity = 1
+if "-q" in sys.argv or '--quiet' in sys.argv:
+    verbosity = 0
+if "-v" in sys.argv or '--verbose' in sys.argv:
+    verbosity = 2
# run test suite
-unittest.TextTestRunner().run(suite)
+unittest.TextTestRunner(verbosity=verbosity).run(suite)
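The flag handling added above maps `-q`/`--quiet` and `-v`/`--verbose` onto unittest's verbosity levels (0 silent, 1 dots, 2 per-test lines). A self-contained sketch of the same pattern, with a trivial test case standing in for the project's suite:

```python
import sys
import unittest

class Smoke(unittest.TestCase):
    def test_ok(self):
        self.assertEqual(1 + 1, 2)

# default verbosity is 1; -q/--quiet drops to 0, -v/--verbose raises to 2
verbosity = 1
if "-q" in sys.argv or "--quiet" in sys.argv:
    verbosity = 0
if "-v" in sys.argv or "--verbose" in sys.argv:
    verbosity = 2

suite = unittest.TestLoader().loadTestsFromTestCase(Smoke)
result = unittest.TextTestRunner(verbosity=verbosity).run(suite)
```

Because the flags are checked against `sys.argv` directly, they can coexist with unittest's own argument handling without an option parser.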


@@ -18,9 +18,10 @@ os.chdir(sys.path[0])
# copy spider output to splice input
import planet
from planet import spider, config
-planet.getLogger('CRITICAL')
-spider.spiderPlanet('tests/data/spider/config.ini')
+planet.getLogger('CRITICAL',None)
+config.load('tests/data/spider/config.ini')
+spider.spiderPlanet()
if os.path.exists('tests/data/splice/cache'):
    shutil.rmtree('tests/data/splice/cache')
shutil.move('tests/work/spider/cache', 'tests/data/splice/cache')
@@ -31,7 +32,7 @@ dest1.write(source.read().replace('/work/spider/', '/data/splice/'))
dest1.close()
source.seek(0)
-dest2=open('tests/data/apply/config.ini', 'w')
+dest2=open('tests/work/apply_config.ini', 'w')
dest2.write(source.read().replace('[Planet]', '''[Planet]
output_theme = asf
output_dir = tests/work/apply'''))
@@ -41,12 +42,13 @@ source.close()
# copy splice output to apply input
from planet import splice
file=open('tests/data/apply/feed.xml', 'w')
-data=splice.splice('tests/data/splice/config.ini').toxml('utf-8')
+config.load('tests/data/splice/config.ini')
+data=splice.splice().toxml('utf-8')
file.write(data)
file.close()
# copy apply output to config/reading-list input
-config.load('tests/data/apply/config.ini')
+config.load('tests/work/apply_config.ini')
splice.apply(data)
shutil.move('tests/work/apply/opml.xml', 'tests/data/config')

File diff suppressed because one or more lines are too long


@@ -1,8 +1,8 @@
<?xml version="1.0"?>
-<opml xmlns="http://www.w3.org/1999/xhtml" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/" version="1.1">
+<opml version="1.1">
<head>
<title>test planet</title>
-<dateModified>August 25, 2006 01:41 PM</dateModified>
+<dateModified>October 14, 2006 01:02 PM</dateModified>
<ownerName>Anonymous Coward</ownerName>
<ownerEmail></ownerEmail>
</head>


@@ -0,0 +1,2 @@
[Planet]
filters = excerpt.py?omit=img


@@ -0,0 +1,11 @@
<!--
Description: link relationship
Expect: Items[0]['enclosure_href'] == 'http://example.com/music.mp3'
-->
<feed xmlns="http://www.w3.org/2005/Atom">
<entry>
<link rel="enclosure" href="http://example.com/music.mp3"/>
</entry>
</feed>


@@ -0,0 +1,11 @@
<!--
Description: link relationship
Expect: Items[0]['enclosure_length'] == '100'
-->
<feed xmlns="http://www.w3.org/2005/Atom">
<entry>
<link rel="enclosure" length="100"/>
</entry>
</feed>


@@ -0,0 +1,11 @@
<!--
Description: link relationship
Expect: Items[0]['enclosure_type'] == 'audio/mpeg'
-->
<feed xmlns="http://www.w3.org/2005/Atom">
<entry>
<link rel="enclosure" type="audio/mpeg"/>
</entry>
</feed>


@@ -0,0 +1,7 @@
[Planet]
filters = translate.xslt
filter_directories = tests/data/filter
[translate.xslt]
in = aeiou
out = AEIOU


@@ -0,0 +1,20 @@
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:param name="in"/>
<xsl:param name="out"/>
<!-- translate $in characters to $out in attribute values -->
<xsl:template match="@*">
<xsl:attribute name="{name()}">
<xsl:value-of select="translate(.,$in,$out)"/>
</xsl:attribute>
</xsl:template>
<!-- pass through everything else -->
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
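The stylesheet above applies XPath's `translate($in, $out)` to every attribute value, with `in` and `out` supplied from the `[translate.xslt]` config section. For equal-length `in`/`out` strings, Python's `str.translate` performs the same character-for-character mapping; a sketch (not the project's code, and it omits XSLT `translate`'s deletion behavior when `in` is longer than `out`):

```python
# Mirror the [translate.xslt] section of the config above.
IN, OUT = "aeiou", "AEIOU"
table = str.maketrans(IN, OUT)

def translate(value):
    """Map each character of IN to the corresponding character of OUT."""
    return value.translate(table)

print(translate("its just data"))  # -> "Its jUst dAtA"
```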


@@ -0,0 +1,2 @@
[Planet]
filters = xpath_sifter.py?require=//atom%3Acategory%5B%40term%3D%27two%27%5D
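The `require` argument is URL-encoded because filter parameters ride in a query string after the script name; decoded, it is the XPath expression `//atom:category[@term='two']`. A sketch of splitting such a filter spec into script name and decoded parameters (a hypothetical helper using modern `urllib.parse`; the project itself targets Python 2):

```python
from urllib.parse import parse_qs

def parse_filter(spec):
    """Split 'script.py?name=value' into the script name and its
    percent-decoded parameters (parse_qs returns lists of values)."""
    name, _, query = spec.partition('?')
    return name, parse_qs(query)

name, params = parse_filter(
    "xpath_sifter.py?require=//atom%3Acategory%5B%40term%3D%27two%27%5D")
print(name)               # xpath_sifter.py
print(params["require"])  # ["//atom:category[@term='two']"]
```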


@@ -0,0 +1,40 @@
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:planet="http://planet.intertwingly.net/"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns="http://www.w3.org/1999/xhtml">
<!-- indent atom and planet elements -->
<xsl:template match="atom:*|planet:*">
<!-- double space before atom:entries and planet:source -->
<xsl:if test="self::atom:entry | self::planet:source">
<xsl:text>&#10;</xsl:text>
</xsl:if>
<!-- indent start tag -->
<xsl:text>&#10;</xsl:text>
<xsl:for-each select="ancestor::*">
<xsl:text> </xsl:text>
</xsl:for-each>
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
<!-- indent end tag if there are element children -->
<xsl:if test="*">
<xsl:text>&#10;</xsl:text>
<xsl:for-each select="ancestor::*">
<xsl:text> </xsl:text>
</xsl:for-each>
</xsl:if>
</xsl:copy>
</xsl:template>
<!-- pass through everything else -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>


@@ -0,0 +1,13 @@
<!--
Description: enclosure
Expect: links[0].rel == 'enclosure' and id == 'http://example.com/1'
-->
<rss>
<channel>
<item>
<enclosure href="http://example.com/1"/>
</item>
</channel>
</rss>


@@ -0,0 +1,12 @@
<!--
Description: feedburner origlink relationship
Expect: feedburner_origlink == 'http://example.com/1'
-->
<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
<entry>
<feedburner:origlink>http://example.com/1</feedburner:origlink>
</entry>
</feed>


@@ -0,0 +1,11 @@
<!--
Description: link relationship
Expect: links[0].length == '4000000'
-->
<feed xmlns="http://www.w3.org/2005/Atom">
<entry>
<link rel="enclosure" href="http://example.com/music.mp3" length="4000000"/>
</entry>
</feed>


@@ -0,0 +1,14 @@
<!--
Description: if item pubdate is missing, use the channel-level date
Expect: updated_parsed == (2006, 6, 21, 13, 16, 41, 2, 172, 0)
-->
<rss version="0.91">
<channel>
<pubDate>Wed, 21 Jun 2006 14:16:41 +0100</pubDate>
<item/>
</channel>
</rss>


@@ -0,0 +1,12 @@
<!--
Description: logo
Expect: source.logo == 'http://example.com/logo.jpg'
-->
<rss version="2.0">
<channel>
<image><url>http://example.com/logo.jpg</url></image>
<item/>
</channel>
</rss>


@@ -0,0 +1,14 @@
<!--
Description: link relationship
Expect: title_detail.language == 'en'
-->
<rss version="2.0">
<channel>
<language>en</language>
<item>
<title>foo</title>
</item>
</channel>
</rss>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>http://example.com/3</id><link href="http://example.com/3" rel="alternate" type="text/html"/><title>Earth</title><summary>the Blue Planet</summary><updated planet:format="January 03, 2006 12:00 AM">2006-01-03T00:00:00Z</updated><source><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss" rel="alternate" type="text/html"/><link href="tests/data/spider/testfeed3.rss" rel="self" type="application/atom+xml"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><planet:name>three</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>http://example.com/3</id><link href="http://example.com/3" rel="alternate" type="text/html"/><title>Earth</title><summary>the Blue Planet</summary><updated planet:format="January 03, 2006 12:00 AM">2006-01-03T00:00:00Z</updated><source><id>http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss</id><author><name>three</name></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss" rel="alternate" type="text/html"/><link href="tests/data/spider/testfeed3.rss" rel="self" type="application/atom+xml"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="October 14, 2006 01:02 PM">2006-10-14T13:02:18Z</updated><planet:format>rss20</planet:format><planet:name>three</planet:name><planet:bozo>true</planet:bozo><planet:http_status>200</planet:http_status></source></entry>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>http://example.com/4</id><link href="http://example.com/4" rel="alternate" type="text/html"/><title>Mars</title><summary>the Red Planet</summary><updated planet:format="August 25, 2006 01:41 PM">2006-08-25T13:41:22Z</updated><source><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss" rel="alternate" type="text/html"/><link href="tests/data/spider/testfeed3.rss" rel="self" type="application/atom+xml"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><planet:name>three</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>http://example.com/4</id><link href="http://example.com/4" rel="alternate" type="text/html"/><title>Mars</title><summary>the Red Planet</summary><updated planet:format="October 14, 2006 01:02 PM">2006-10-14T13:02:18Z</updated><source><id>http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss</id><author><name>three</name></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss" rel="alternate" type="text/html"/><link href="tests/data/spider/testfeed3.rss" rel="self" type="application/atom+xml"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="October 14, 2006 01:02 PM">2006-10-14T13:02:18Z</updated><planet:format>rss20</planet:format><planet:name>three</planet:name><planet:bozo>true</planet:bozo><planet:http_status>200</planet:http_status></source></entry>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed1/1</id><link href="http://example.com/1" rel="alternate" type="text/html"/><title>Mercury</title><content>Messenger of the Roman Gods</content><updated planet:format="January 01, 2006 12:00 AM">2006-01-01T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed1</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed1a.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:name>one</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed1/1</id><link href="http://example.com/1" rel="alternate" type="text/html"/><title>Mercury</title><content>Messenger of the Roman Gods</content><updated planet:format="January 01, 2006 12:00 AM">2006-01-01T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed1</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed1a.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:bozo>false</planet:bozo><planet:format>atom10</planet:format><planet:name>one</planet:name><planet:http_status>200</planet:http_status></source></entry>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed1/2</id><link href="http://example.com/2" rel="alternate" type="text/html"/><title>Venus</title><content>the Jewel of the Sky</content><updated planet:format="February 02, 2006 12:00 AM">2006-02-02T00:00:00Z</updated><published planet:format="January 02, 2006 12:00 AM">2006-01-02T00:00:00Z</published><source><id>tag:planet.intertwingly.net,2006:testfeed1</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed1a.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:name>one</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed1/2</id><link href="http://example.com/2" rel="alternate" type="text/html"/><title>Venus</title><content>the Jewel of the Sky</content><updated planet:format="February 02, 2006 12:00 AM">2006-02-02T00:00:00Z</updated><published planet:format="January 02, 2006 12:00 AM">2006-01-02T00:00:00Z</published><source><id>tag:planet.intertwingly.net,2006:testfeed1</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed1a.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:bozo>false</planet:bozo><planet:format>atom10</planet:format><planet:name>one</planet:name><planet:http_status>200</planet:http_status></source></entry>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed1/3</id><link href="http://example.com/3" rel="alternate" type="text/html"/><title>Earth</title><content>the Blue Planet</content><updated planet:format="January 03, 2006 12:00 AM">2006-01-03T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed1</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed1a.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:name>one</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed1/3</id><link href="http://example.com/3" rel="alternate" type="text/html"/><title>Earth</title><content>the Blue Planet</content><updated planet:format="January 03, 2006 12:00 AM">2006-01-03T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed1</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed1a.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:bozo>false</planet:bozo><planet:format>atom10</planet:format><planet:name>one</planet:name><planet:http_status>200</planet:http_status></source></entry>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed1/4</id><link href="http://example.com/4" rel="alternate" type="text/html"/><title>Mars</title><content>the Red Planet</content><updated planet:format="January 04, 2006 12:00 AM">2006-01-04T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed1</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed1a.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:name>one</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed1/4</id><link href="http://example.com/4" rel="alternate" type="text/html"/><title>Mars</title><content>the Red Planet</content><updated planet:format="January 04, 2006 12:00 AM">2006-01-04T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed1</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed1a.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:bozo>false</planet:bozo><planet:format>atom10</planet:format><planet:name>one</planet:name><planet:http_status>200</planet:http_status></source></entry>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed2/1</id><link href="http://example.com/1" rel="alternate" type="text/html"/><title xml:lang="en-us">Mercury</title><content xml:lang="en-us">Messenger of the Roman Gods</content><updated planet:format="January 01, 2006 12:00 AM">2006-01-01T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed2</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed2.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:name>two</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed2/1</id><link href="http://example.com/1" rel="alternate" type="text/html"/><title xml:lang="en-us">Mercury</title><content xml:lang="en-us">Messenger of the Roman Gods</content><updated planet:format="January 01, 2006 12:00 AM">2006-01-01T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed2</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed2.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:bozo>false</planet:bozo><planet:format>atom10</planet:format><planet:name>two</planet:name><planet:http_status>200</planet:http_status></source></entry>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed2/2</id><link href="http://example.com/2" rel="alternate" type="text/html"/><title xml:lang="en-us">Venus</title><content xml:lang="en-us">the Morning Star</content><updated planet:format="January 02, 2006 12:00 AM">2006-01-02T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed2</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed2.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:name>two</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed2/2</id><link href="http://example.com/2" rel="alternate" type="text/html"/><title xml:lang="en-us">Venus</title><content xml:lang="en-us">the Morning Star</content><updated planet:format="January 02, 2006 12:00 AM">2006-01-02T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed2</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed2.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:bozo>false</planet:bozo><planet:format>atom10</planet:format><planet:name>two</planet:name><planet:http_status>200</planet:http_status></source></entry>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed2/3</id><link href="http://example.com/3" rel="alternate" type="text/html"/><title>Earth</title><content xml:lang="en-us">the Blue Planet</content><updated planet:format="January 03, 2006 12:00 AM">2006-01-03T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed2</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed2.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:name>two</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed2/3</id><link href="http://example.com/3" rel="alternate" type="text/html"/><title>Earth</title><content xml:lang="en-us">the Blue Planet</content><updated planet:format="January 03, 2006 12:00 AM">2006-01-03T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed2</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed2.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:bozo>false</planet:bozo><planet:format>atom10</planet:format><planet:name>two</planet:name><planet:http_status>200</planet:http_status></source></entry>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed2/4</id><link href="http://example.com/4" rel="alternate" type="text/html"/><title>Mars</title><content>the Red Planet</content><updated planet:format="January 04, 2006 12:00 AM">2006-01-04T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed2</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed2.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:name>two</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed2/4</id><link href="http://example.com/4" rel="alternate" type="text/html"/><title>Mars</title><content>the Red Planet</content><updated planet:format="January 04, 2006 12:00 AM">2006-01-04T00:00:00Z</updated><source><id>tag:planet.intertwingly.net,2006:testfeed2</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed2.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:bozo>false</planet:bozo><planet:format>atom10</planet:format><planet:name>two</planet:name><planet:http_status>200</planet:http_status></source></entry>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed3/1</id><link href="http://example.com/1" rel="alternate" type="text/html"/><title>Mercury</title><summary>Messenger of the Roman Gods</summary><updated planet:format="January 01, 2006 12:00 AM">2006-01-01T00:00:00Z</updated><source><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss" rel="alternate" type="text/html"/><link href="tests/data/spider/testfeed3.rss" rel="self" type="application/atom+xml"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><planet:name>three</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed3/1</id><link href="http://example.com/1" rel="alternate" type="text/html"/><title>Mercury</title><summary>Messenger of the Roman Gods</summary><updated planet:format="January 01, 2006 12:00 AM">2006-01-01T00:00:00Z</updated><source><id>http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss</id><author><name>three</name></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss" rel="alternate" type="text/html"/><link href="tests/data/spider/testfeed3.rss" rel="self" type="application/atom+xml"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="October 14, 2006 01:02 PM">2006-10-14T13:02:18Z</updated><planet:format>rss20</planet:format><planet:name>three</planet:name><planet:bozo>true</planet:bozo><planet:http_status>200</planet:http_status></source></entry>


@@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?>
-<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed3/2</id><link href="http://example.com/2" rel="alternate" type="text/html"/><title>Venus</title><summary>the Morning Star</summary><updated planet:format="August 25, 2006 01:41 PM">2006-08-25T13:41:22Z</updated><source><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss" rel="alternate" type="text/html"/><link href="tests/data/spider/testfeed3.rss" rel="self" type="application/atom+xml"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><planet:name>three</planet:name><planet:http_status>200</planet:http_status></source></entry>
+<entry xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed3/2</id><link href="http://example.com/2" rel="alternate" type="text/html"/><title>Venus</title><summary>the Morning Star</summary><updated planet:format="October 14, 2006 01:02 PM">2006-10-14T13:02:18Z</updated><source><id>http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss</id><author><name>three</name></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss" rel="alternate" type="text/html"/><link href="tests/data/spider/testfeed3.rss" rel="self" type="application/atom+xml"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="October 14, 2006 01:02 PM">2006-10-14T13:02:18Z</updated><planet:format>rss20</planet:format><planet:name>three</planet:name><planet:bozo>true</planet:bozo><planet:http_status>200</planet:http_status></source></entry>

View File

@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><link href="tests/data/spider/testfeed0.atom" rel="self" type="application/atom+xml"/><planet:name>not found</planet:name><planet:http_status>500</planet:http_status></feed> <feed xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><author><name>not found</name></author><link href="tests/data/spider/testfeed0.atom" rel="self" type="application/atom+xml"/><updated planet:format="October 14, 2006 01:02 PM">2006-10-14T13:02:18Z</updated><planet:message>internal server error</planet:message><planet:bozo>true</planet:bozo><planet:http_status>500</planet:http_status><planet:name>not found</planet:name></feed>

View File

@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed1</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed1a.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:name>one</planet:name><planet:http_status>200</planet:http_status></feed> <feed xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed1</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed1a.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:bozo>false</planet:bozo><planet:format>atom10</planet:format><planet:name>one</planet:name><planet:http_status>200</planet:http_status></feed>

View File

@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed2</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed2.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:name>two</planet:name><planet:http_status>200</planet:http_status></feed> <feed xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>tag:planet.intertwingly.net,2006:testfeed2</id><author><name>Sam Ruby</name><email>rubys@intertwingly.net</email><uri>http://www.intertwingly.net/blog/</uri></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed2.atom" rel="self" type="application/atom+xml"/><link href="http://www.intertwingly.net/blog/" rel="alternate" type="text/html"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="June 17, 2006 12:15 AM">2006-06-17T00:15:18Z</updated><planet:bozo>false</planet:bozo><planet:format>atom10</planet:format><planet:name>two</planet:name><planet:http_status>200</planet:http_status></feed>

View File

@ -1,2 +1,2 @@
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss" rel="alternate" type="text/html"/><link href="tests/data/spider/testfeed3.rss" rel="self" type="application/atom+xml"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><planet:name>three</planet:name><planet:http_status>200</planet:http_status></feed> <feed xmlns="http://www.w3.org/2005/Atom" xmlns:planet="http://planet.intertwingly.net/"><id>http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss</id><author><name>three</name></author><link href="http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss" rel="alternate" type="text/html"/><link href="tests/data/spider/testfeed3.rss" rel="self" type="application/atom+xml"/><subtitle>Its just data</subtitle><title>Sam Ruby</title><updated planet:format="October 14, 2006 01:02 PM">2006-10-14T13:02:18Z</updated><planet:format>rss20</planet:format><planet:name>three</planet:name><planet:bozo>true</planet:bozo><planet:http_status>200</planet:http_status></feed>

87
tests/reconstitute.py Normal file
View File

@ -0,0 +1,87 @@
#!/usr/bin/env python
import os, sys, ConfigParser, shutil, glob
venus_base = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0,venus_base)

if __name__ == "__main__":

    hide_planet_ns = True

    while len(sys.argv) > 1:
        if sys.argv[1] == '-v' or sys.argv[1] == '--verbose':
            import planet
            planet.getLogger('DEBUG',None)
            del sys.argv[1]
        elif sys.argv[1] == '-p' or sys.argv[1] == '--planet':
            hide_planet_ns = False
            del sys.argv[1]
        else:
            break

    parser = ConfigParser.ConfigParser()
    parser.add_section('Planet')
    parser.add_section(sys.argv[1])

    work = reduce(os.path.join, ['tests','work','reconsititute'], venus_base)
    output = os.path.join(work, 'output')
    filters = os.path.join(venus_base,'filters')

    parser.set('Planet','cache_directory',work)
    parser.set('Planet','output_dir',output)
    parser.set('Planet','filter_directories',filters)
    if hide_planet_ns:
        parser.set('Planet','template_files','themes/common/atom.xml.xslt')
    else:
        parser.set('Planet','template_files','tests/data/reconstitute.xslt')

    for name, value in zip(sys.argv[2::2],sys.argv[3::2]):
        parser.set(sys.argv[1], name.lstrip('-'), value)

    from planet import config
    config.parser = parser

    from planet import spider
    spider.spiderPlanet(only_if_new=False)

    from planet import feedparser
    for source in glob.glob(os.path.join(work, 'sources/*')):
        feed = feedparser.parse(source).feed
        if feed.has_key('title'):
            config.parser.set('Planet','name',feed.title_detail.value)
        if feed.has_key('link'):
            config.parser.set('Planet','link',feed.link)
        if feed.has_key('author_detail'):
            if feed.author_detail.has_key('name'):
                config.parser.set('Planet','owner_name',feed.author_detail.name)
            if feed.author_detail.has_key('email'):
                config.parser.set('Planet','owner_email',feed.author_detail.email)

    from planet import splice
    doc = splice.splice()

    sources = doc.getElementsByTagName('planet:source')
    if hide_planet_ns and len(sources) == 1:
        source = sources[0]
        feed = source.parentNode

        child = feed.firstChild
        while child:
            next = child.nextSibling
            if child.nodeName not in ['planet:source','entry']:
                feed.removeChild(child)
            child = next

        while source.hasChildNodes():
            child = source.firstChild
            source.removeChild(child)
            feed.insertBefore(child, source)

    for source in doc.getElementsByTagName('source'):
        source.parentNode.removeChild(source)

    splice.apply(doc.toxml('utf-8'))

    if hide_planet_ns:
        atom = open(os.path.join(output,'atom.xml')).read()
    else:
        atom = open(os.path.join(output,'reconstitute')).read()

    shutil.rmtree(work)
    os.removedirs(os.path.dirname(work))

    print atom
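The `hide_planet_ns` branch above prunes the feed down to its entries and then hoists the children of the lone `planet:source` element up into the feed itself. That DOM move can be sketched in isolation with `xml.dom.minidom`; the one-element feed below is a made-up fixture, not Venus test data:

```python
import xml.dom.minidom

doc = xml.dom.minidom.parseString(
    '<feed><entry/><source><title>T</title><id>I</id></source></feed>')
source = doc.getElementsByTagName('source')[0]
feed = source.parentNode

# Move each child of <source> up into <feed>, keeping document order
# by inserting it immediately before <source> itself.
while source.hasChildNodes():
    child = source.firstChild
    source.removeChild(child)
    feed.insertBefore(child, source)
feed.removeChild(source)

print(doc.toxml())
```

Each hoisted child lands immediately before `<source>`, so relative order is preserved; only after the element is emptied is `<source>` itself dropped from the feed.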

28
tests/test_filter_xslt.py Normal file
View File

@ -0,0 +1,28 @@
#!/usr/bin/env python
import unittest, xml.dom.minidom
from planet import shell, config, logger

class XsltFilterTests(unittest.TestCase):

    def test_xslt_filter(self):
        config.load('tests/data/filter/translate.ini')
        testfile = 'tests/data/filter/category-one.xml'

        input = open(testfile).read()
        output = shell.run(config.filters()[0], input, mode="filter")
        dom = xml.dom.minidom.parseString(output)
        catterm = dom.getElementsByTagName('category')[0].getAttribute('term')
        self.assertEqual('OnE', catterm)

try:
    import libxslt
except:
    try:
        from subprocess import Popen, PIPE
        xsltproc = Popen(['xsltproc','--version'], stdout=PIPE, stderr=PIPE)
        xsltproc.communicate()
        if xsltproc.returncode != 0: raise ImportError
    except:
        logger.warn("libxslt is not available => can't test xslt filters")
        del XsltFilterTests.test_xslt_filter
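The fallback probe above (try `libxslt`, else look for an `xsltproc` binary on the path) is a reusable pattern for skipping tests when a tool is missing. A standalone sketch, with the hypothetical helper name `command_available`:

```python
from subprocess import Popen, PIPE

def command_available(argv):
    # Run the command; a missing executable raises OSError, and a
    # nonzero exit status also counts as "not available".
    try:
        proc = Popen(argv, stdout=PIPE, stderr=PIPE)
        proc.communicate()
        return proc.returncode == 0
    except OSError:
        return False

have_xsltproc = command_available(['xsltproc', '--version'])
```

Catching `OSError` (rather than a bare `except`) keeps genuine errors visible while still treating an absent binary as a skip condition.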

View File

@ -14,10 +14,16 @@ class FilterTests(unittest.TestCase):
imgsrc = dom.getElementsByTagName('img')[0].getAttribute('src') imgsrc = dom.getElementsByTagName('img')[0].getAttribute('src')
self.assertEqual('http://example.com.nyud.net:8080/foo.png', imgsrc) self.assertEqual('http://example.com.nyud.net:8080/foo.png', imgsrc)
def test_excerpt_images(self): def test_excerpt_images1(self):
testfile = 'tests/data/filter/excerpt-images.xml'
config.load('tests/data/filter/excerpt-images.ini') config.load('tests/data/filter/excerpt-images.ini')
self.verify_images()
def test_excerpt_images2(self):
config.load('tests/data/filter/excerpt-images2.ini')
self.verify_images()
def verify_images(self):
testfile = 'tests/data/filter/excerpt-images.xml'
output = open(testfile).read() output = open(testfile).read()
for filter in config.filters(): for filter in config.filters():
output = shell.run(filter, output, mode="filter") output = shell.run(filter, output, mode="filter")
@ -58,8 +64,15 @@ class FilterTests(unittest.TestCase):
self.assertEqual(u'before--after', self.assertEqual(u'before--after',
excerpt.firstChild.firstChild.nodeValue) excerpt.firstChild.firstChild.nodeValue)
def test_xpath_filter(self): def test_xpath_filter1(self):
config.load('tests/data/filter/xpath-sifter.ini') config.load('tests/data/filter/xpath-sifter.ini')
self.verify_xpath()
def test_xpath_filter2(self):
config.load('tests/data/filter/xpath-sifter2.ini')
self.verify_xpath()
def verify_xpath(self):
testfile = 'tests/data/filter/category-one.xml' testfile = 'tests/data/filter/category-one.xml'
output = open(testfile).read() output = open(testfile).read()
@ -89,9 +102,10 @@ try:
import libxml2 import libxml2
except: except:
logger.warn("libxml2 is not available => can't test xpath_sifter") logger.warn("libxml2 is not available => can't test xpath_sifter")
del FilterTests.test_xpath_filter del FilterTests.test_xpath_filter1
del FilterTests.test_xpath_filter2
except ImportError: except ImportError:
logger.warn("Popen is not available => can't test filters") logger.warn("Popen is not available => can't test standard filters")
for method in dir(FilterTests): for method in dir(FilterTests):
if method.startswith('test_'): delattr(FilterTests,method) if method.startswith('test_'): delattr(FilterTests,method)

74
tests/test_idindex.py Normal file
View File

@ -0,0 +1,74 @@
#!/usr/bin/env python
import unittest
from planet import idindex, config, logger

class idIndexTest(unittest.TestCase):

    def setUp(self):
        # silence errors
        import planet
        planet.logger = None
        planet.getLogger('CRITICAL',None)

    def tearDown(self):
        idindex.destroy()

    def test_unicode(self):
        from planet.spider import filename
        index = idindex.create()
        iri = 'http://www.\xe8\xa9\xb9\xe5\xa7\x86\xe6\x96\xaf.com/'
        index[filename('', iri)] = 'data'
        index[filename('', iri.decode('utf-8'))] = 'data'
        index[filename('', u'1234')] = 'data'
        index.close()

    def test_index_spider(self):
        import test_spider
        config.load(test_spider.configfile)

        index = idindex.create()
        self.assertEqual(0, len(index))
        index.close()

        from planet.spider import spiderPlanet
        try:
            spiderPlanet()

            index = idindex.open()
            self.assertEqual(12, len(index))
            self.assertEqual('tag:planet.intertwingly.net,2006:testfeed1',
                index['planet.intertwingly.net,2006,testfeed1,1'])
            self.assertEqual('http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss',
                index['planet.intertwingly.net,2006,testfeed3,1'])
            index.close()
        finally:
            import os, shutil
            shutil.rmtree(test_spider.workdir)
            os.removedirs(os.path.split(test_spider.workdir)[0])

    def test_index_splice(self):
        import test_splice
        config.load(test_splice.configfile)
        index = idindex.create()

        self.assertEqual(12, len(index))
        self.assertEqual('tag:planet.intertwingly.net,2006:testfeed1',
            index['planet.intertwingly.net,2006,testfeed1,1'])
        self.assertEqual('http://intertwingly.net/code/venus/tests/data/spider/testfeed3.rss',
            index['planet.intertwingly.net,2006,testfeed3,1'])

        for key in index.keys():
            value = index[key]
            if value.find('testfeed2')>0: index[key] = value.swapcase()
        index.close()

        from planet.splice import splice
        doc = splice()
        self.assertEqual(8,len(doc.getElementsByTagName('entry')))
        self.assertEqual(4,len(doc.getElementsByTagName('planet:source')))
        self.assertEqual(12,len(doc.getElementsByTagName('planet:name')))

try:
    import dbhash  # the guard must actually import the module for ImportError to fire
except ImportError:
    logger.warn("dbhash is not available => can't test id index")
    for method in dir(idIndexTest):
        if method.startswith('test_'): delattr(idIndexTest,method)

View File

@ -76,6 +76,14 @@ class OpmlTest(unittest.TestCase):
text="sample feed"/>''', self.config) text="sample feed"/>''', self.config)
self.assertFalse(self.config.has_section("http://example.com/feed.xml")) self.assertFalse(self.config.has_section("http://example.com/feed.xml"))
def test_WordPress_link_manager(self):
# http://www.wasab.dk/morten/blog/archives/2006/10/22/wp-venus
opml2config('''<outline type="link"
xmlUrl="http://example.com/feed.xml"
text="sample feed"/>''', self.config)
self.assertEqual('sample feed',
self.config.get("http://example.com/feed.xml", 'name'))
# #
# xmlUrl # xmlUrl
# #

View File

@ -7,7 +7,7 @@ from planet import feedparser, config
feed = ''' feed = '''
<feed xmlns='http://www.w3.org/2005/Atom'> <feed xmlns='http://www.w3.org/2005/Atom'>
<author><name>F&amp;ouml;o</name></author> <author><name>F&amp;ouml;o</name></author>
<entry> <entry xml:lang="en">
<id>ignoreme</id> <id>ignoreme</id>
<author><name>F&amp;ouml;o</name></author> <author><name>F&amp;ouml;o</name></author>
<updated>2000-01-01T00:00:00Z</updated> <updated>2000-01-01T00:00:00Z</updated>
@ -23,7 +23,7 @@ feed = '''
configData = ''' configData = '''
[testfeed] [testfeed]
ignore_in_feed = id updated ignore_in_feed = id updated xml:lang
name_type = html name_type = html
title_type = html title_type = html
summary_type = html summary_type = html
@ -40,12 +40,14 @@ class ScrubTest(unittest.TestCase):
self.assertTrue(data.entries[0].has_key('id')) self.assertTrue(data.entries[0].has_key('id'))
self.assertTrue(data.entries[0].has_key('updated')) self.assertTrue(data.entries[0].has_key('updated'))
self.assertTrue(data.entries[0].has_key('updated_parsed')) self.assertTrue(data.entries[0].has_key('updated_parsed'))
self.assertTrue(data.entries[0].summary_detail.has_key('language'))
scrub('testfeed', data) scrub('testfeed', data)
self.assertFalse(data.entries[0].has_key('id')) self.assertFalse(data.entries[0].has_key('id'))
self.assertFalse(data.entries[0].has_key('updated')) self.assertFalse(data.entries[0].has_key('updated'))
self.assertFalse(data.entries[0].has_key('updated_parsed')) self.assertFalse(data.entries[0].has_key('updated_parsed'))
self.assertFalse(data.entries[0].summary_detail.has_key('language'))
self.assertEqual('F\xc3\xb6o', data.feed.author_detail.name) self.assertEqual('F\xc3\xb6o', data.feed.author_detail.name)
self.assertEqual('F\xc3\xb6o', data.entries[0].author_detail.name) self.assertEqual('F\xc3\xb6o', data.entries[0].author_detail.name)

View File

@ -13,7 +13,7 @@ class SpiderTest(unittest.TestCase):
def setUp(self): def setUp(self):
# silence errors # silence errors
planet.logger = None planet.logger = None
planet.getLogger('CRITICAL') planet.getLogger('CRITICAL',None)
try: try:
os.makedirs(workdir) os.makedirs(workdir)
@ -58,6 +58,8 @@ class SpiderTest(unittest.TestCase):
# verify that the file timestamps match atom:updated # verify that the file timestamps match atom:updated
data = feedparser.parse(files[2]) data = feedparser.parse(files[2])
self.assertEqual(['application/atom+xml'], [link.type
for link in data.entries[0].source.links if link.rel=='self'])
self.assertEqual('one', data.entries[0].source.planet_name) self.assertEqual('one', data.entries[0].source.planet_name)
self.assertEqual(os.stat(files[2]).st_mtime, self.assertEqual(os.stat(files[2]).st_mtime,
calendar.timegm(data.entries[0].updated_parsed)) calendar.timegm(data.entries[0].updated_parsed))
@ -82,5 +84,7 @@ class SpiderTest(unittest.TestCase):
data = feedparser.parse(workdir + data = feedparser.parse(workdir +
'/planet.intertwingly.net,2006,testfeed3,1') '/planet.intertwingly.net,2006,testfeed3,1')
self.assertEqual(['application/rss+xml'], [link.type
for link in data.entries[0].source.links if link.rel=='self'])
self.assertEqual('three', data.entries[0].source.author_detail.name) self.assertEqual('three', data.entries[0].source.author_detail.name)

View File

@ -4,7 +4,7 @@ import unittest
from planet import config from planet import config
from os.path import split from os.path import split
class ConfigTest(unittest.TestCase): class ThemesTest(unittest.TestCase):
def setUp(self): def setUp(self):
config.load('tests/data/config/themed.ini') config.load('tests/data/config/themed.ini')
@ -17,7 +17,8 @@ class ConfigTest(unittest.TestCase):
# administrivia # administrivia
def test_template(self): def test_template(self):
self.assertTrue('index.html.xslt' in config.template_files()) self.assertEqual(1, len([1 for file in config.template_files()
if file == 'index.html.xslt']))
def test_feeds(self): def test_feeds(self):
feeds = config.subscriptions() feeds = config.subscriptions()

View File

@ -7,6 +7,7 @@ template_files:
foafroll.xml.xslt foafroll.xml.xslt
index.html.xslt index.html.xslt
opml.xml.xslt opml.xml.xslt
validate.html.xslt
template_directories: template_directories:
../common ../common

View File

@ -56,6 +56,7 @@
</xsl:choose> </xsl:choose>
<img src="images/feed-icon-10x10.png" alt="(feed)"/> <img src="images/feed-icon-10x10.png" alt="(feed)"/>
</a> </a>
<xsl:text> </xsl:text>
<!-- name --> <!-- name -->
<a href="{atom:link[@rel='alternate']/@href}"> <a href="{atom:link[@rel='alternate']/@href}">
@ -153,7 +154,9 @@
<img src="{atom:source/atom:icon}" class="icon"/> <img src="{atom:source/atom:icon}" class="icon"/>
</xsl:if> </xsl:if>
<a href="{atom:source/atom:link[@rel='alternate']/@href}"> <a href="{atom:source/atom:link[@rel='alternate']/@href}">
<xsl:attribute name="title" select="{atom:source/atom:title}"/> <xsl:attribute name="title">
<xsl:value-of select="atom:source/atom:title"/>
</xsl:attribute>
<xsl:value-of select="atom:source/planet:name"/> <xsl:value-of select="atom:source/planet:name"/>
</a> </a>
<xsl:if test="string-length(atom:title) &gt; 0"> <xsl:if test="string-length(atom:title) &gt; 0">
@ -236,6 +239,9 @@
<!-- Feedburner detritus --> <!-- Feedburner detritus -->
<xsl:template match="xhtml:div[@class='feedflare']"/> <xsl:template match="xhtml:div[@class='feedflare']"/>
<!-- Strip site meter -->
<xsl:template match="xhtml:div[comment()[. = ' Site Meter ']]"/>
<!-- pass through everything else --> <!-- pass through everything else -->
<xsl:template match="@*|node()"> <xsl:template match="@*|node()">
<xsl:copy> <xsl:copy>

View File

@ -14,14 +14,18 @@
<xsl:template match="atom:link[@rel='service.post']"/> <xsl:template match="atom:link[@rel='service.post']"/>
<xsl:template match="atom:link[@rel='service.feed']"/> <xsl:template match="atom:link[@rel='service.feed']"/>
<!-- Feedburner detritus --> <!-- Feedburner detritus -->
<xsl:template match="xhtml:div[@class='feedflare']"/> <xsl:template match="xhtml:div[@class='feedflare']"/>
<!-- Strip site meter -->
<xsl:template match="xhtml:div[comment()[. = ' Site Meter ']]"/>
<!-- add Google/LiveJournal-esque noindex directive --> <!-- add Google/LiveJournal-esque noindex directive -->
<xsl:template match="atom:feed"> <xsl:template match="atom:feed">
<xsl:copy> <xsl:copy>
<xsl:attribute name="indexing:index">no</xsl:attribute> <xsl:attribute name="indexing:index">no</xsl:attribute>
<xsl:apply-templates select="@*|node()"/> <xsl:apply-templates select="@*|node()"/>
<xsl:text>&#10;</xsl:text>
</xsl:copy> </xsl:copy>
</xsl:template> </xsl:template>

View File

@ -10,7 +10,7 @@
<TMPL_LOOP Items> <TMPL_LOOP Items>
<item> <item>
<title><TMPL_VAR channel_name ESCAPE="HTML"><TMPL_IF title>: <TMPL_VAR title_plain ESCAPE="HTML"></TMPL_IF></title> <title><TMPL_VAR channel_name ESCAPE="HTML"><TMPL_IF title>: <TMPL_VAR title_plain ESCAPE="HTML"></TMPL_IF></title>
<guid><TMPL_VAR id ESCAPE="HTML"></guid> <guid isPermaLink="<TMPL_VAR guid_isPermaLink>"><TMPL_VAR id ESCAPE="HTML"></guid>
<link><TMPL_VAR link ESCAPE="HTML"></link> <link><TMPL_VAR link ESCAPE="HTML"></link>
<TMPL_IF content> <TMPL_IF content>
<description><TMPL_VAR content ESCAPE="HTML"></description> <description><TMPL_VAR content ESCAPE="HTML"></description>
@ -23,6 +23,9 @@
<author><TMPL_VAR author_email></author> <author><TMPL_VAR author_email></author>
</TMPL_IF> </TMPL_IF>
</TMPL_IF> </TMPL_IF>
<TMPL_IF enclosure_href>
<enclosure url="<TMPL_VAR enclosure_href ESCAPE="HTML">" length="<TMPL_VAR enclosure_length>" type="<TMPL_VAR enclosure_type>"/>
</TMPL_IF>
</item> </item>
</TMPL_LOOP> </TMPL_LOOP>

View File

@ -0,0 +1,146 @@
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
                xmlns:atom="http://www.w3.org/2005/Atom"
                xmlns:xhtml="http://www.w3.org/1999/xhtml"
                xmlns:planet="http://planet.intertwingly.net/"
                xmlns="http://www.w3.org/1999/xhtml">

  <xsl:template match="atom:feed">
    <html xmlns="http://www.w3.org/1999/xhtml">

      <!-- head -->
      <xsl:text>&#10;&#10;</xsl:text>
      <head>
        <title><xsl:value-of select="atom:title"/></title>
        <meta name="robots" content="noindex,nofollow" />
        <meta name="generator" content="{atom:generator}" />
        <link rel="shortcut icon" href="/favicon.ico" />
        <style type="text/css">
          img{border:0}
          a{text-decoration:none}
          a:hover{text-decoration:underline}
          .message{border-bottom:1px dashed red} a.message:hover{cursor: help;text-decoration: none}
          dl{margin:0}
          dt{float:left;width:9em}
          dt:after{content:':'}
        </style>
      </head>

      <!-- body -->
      <xsl:text>&#10;&#10;</xsl:text>
      <body>
        <table border="1" cellpadding="3" cellspacing="0">
          <thead>
            <tr>
              <th></th>
              <th>Name</th>
              <th>Format</th>
              <xsl:if test="//planet:ignore_in_feed | //planet:filters |
                            //planet:*[contains(local-name(),'_type')]">
                <th>Notes</th>
              </xsl:if>
            </tr>
          </thead>
          <xsl:apply-templates select="planet:source">
            <xsl:sort select="planet:name"/>
          </xsl:apply-templates>
          <xsl:text>&#10;</xsl:text>
        </table>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="planet:source">
    <xsl:variable name="validome_format">
      <xsl:choose>
        <xsl:when test="planet:format = 'rss090'">rss_0_90</xsl:when>
        <xsl:when test="planet:format = 'rss091n'">rss_0_91</xsl:when>
        <xsl:when test="planet:format = 'rss091u'">rss_0_91</xsl:when>
        <xsl:when test="planet:format = 'rss10'">rss_1_0</xsl:when>
        <xsl:when test="planet:format = 'rss092'">rss_0_90</xsl:when>
        <xsl:when test="planet:format = 'rss093'"></xsl:when>
        <xsl:when test="planet:format = 'rss094'">rss_0_90</xsl:when>
        <xsl:when test="planet:format = 'rss20'">rss_2_0</xsl:when>
        <xsl:when test="planet:format = 'rss'">rss_2_0</xsl:when>
        <xsl:when test="planet:format = 'atom01'"></xsl:when>
        <xsl:when test="planet:format = 'atom02'"></xsl:when>
        <xsl:when test="planet:format = 'atom03'">atom_0_3</xsl:when>
        <xsl:when test="planet:format = 'atom10'">atom_1_0</xsl:when>
        <xsl:when test="planet:format = 'atom'">atom_1_0</xsl:when>
        <xsl:when test="planet:format = 'cdf'"></xsl:when>
        <xsl:when test="planet:format = 'hotrss'"></xsl:when>
      </xsl:choose>
    </xsl:variable>

    <xsl:text>&#10;</xsl:text>
    <tr>
      <xsl:if test="planet:bozo='true'">
        <xsl:attribute name="bgcolor">#FCC</xsl:attribute>
      </xsl:if>
      <td>
        <a title="feed validator">
          <xsl:attribute name="href">
            <xsl:text>http://feedvalidator.org/check?url=</xsl:text>
            <xsl:choose>
              <xsl:when test="planet:http_location">
                <xsl:value-of select="planet:http_location"/>
              </xsl:when>
              <xsl:when test="atom:link[@rel='self']/@href">
                <xsl:value-of select="atom:link[@rel='self']/@href"/>
              </xsl:when>
            </xsl:choose>
          </xsl:attribute>
          <img src="http://feedvalidator.org/favicon.ico" hspace='2' vspace='1'/>
        </a>
        <a title="validome">
          <xsl:attribute name="href">
            <xsl:text>http://www.validome.org/rss-atom/validate?</xsl:text>
            <xsl:text>viewSourceCode=1&amp;version=</xsl:text>
            <xsl:value-of select="$validome_format"/>
            <xsl:text>&amp;url=</xsl:text>
            <xsl:choose>
              <xsl:when test="planet:http_location">
                <xsl:value-of select="planet:http_location"/>
              </xsl:when>
              <xsl:when test="atom:link[@rel='self']/@href">
                <xsl:value-of select="atom:link[@rel='self']/@href"/>
              </xsl:when>
            </xsl:choose>
          </xsl:attribute>
          <img src="http://validome.org/favicon.ico" hspace='2' vspace='1'/>
        </a>
      </td>
      <td>
        <a href="{atom:link[@rel='alternate']/@href}">
          <xsl:choose>
            <xsl:when test="planet:message">
              <xsl:attribute name="class">message</xsl:attribute>
              <xsl:attribute name="title">
                <xsl:value-of select="planet:message"/>
              </xsl:attribute>
            </xsl:when>
            <xsl:when test="atom:title">
              <xsl:attribute name="title">
                <xsl:value-of select="atom:title"/>
              </xsl:attribute>
            </xsl:when>
          </xsl:choose>
          <xsl:value-of select="planet:name"/>
        </a>
      </td>
      <td><xsl:value-of select="planet:format"/></td>
      <xsl:if test="planet:ignore_in_feed | planet:filters |
                    planet:*[contains(local-name(),'_type')]">
        <td>
          <dl>
            <xsl:for-each select="planet:ignore_in_feed | planet:filters |
                                  planet:*[contains(local-name(),'_type')]">
              <xsl:sort select="local-name()"/>
              <dt><xsl:value-of select="local-name()"/></dt>
              <dd><xsl:value-of select="."/></dd>
            </xsl:for-each>
          </dl>
        </td>
      </xsl:if>
    </tr>
  </xsl:template>
</xsl:stylesheet>

View File

@ -9,6 +9,7 @@ template_files:
index.html.xslt index.html.xslt
mobile.html.xslt mobile.html.xslt
opml.xml.xslt opml.xml.xslt
validate.html.xslt
template_directories: template_directories:
../asf ../asf