Mega merge from Sam.
commit 56ee34a7f0

THANKS (4 lines changed)
@@ -9,8 +9,10 @@ Harry Fuecks - Pipe characters in file names, filter bug
Eric van der Vlist - Filters to add language, category information
Chris Dolan - mkdir cache; default template_dirs; fix xsltproc
David Sifry - rss 2.0 xslt template based on http://atom.geekhood.net/
Morten Fredericksen - Support WordPress LinkManager OPML
Morten Frederiksen - Support WordPress LinkManager OPML
Harry Fuecks - default item date to feed date
Antonio Cavedoni - Django templates
Morten Frederiksen - expungeCache

This codebase represents a radical refactoring of Planet 2.0, which lists
the following contributors:

TODO (7 lines changed)
@@ -1,13 +1,6 @@
TODO
====

* Expire feed history

The feed cache doesn't currently expire old entries, so could get
large quite rapidly. We should probably have a config setting for
the cache expiry; the trouble is some channels might need a longer
or shorter one than others.

* Allow display normalisation to specified timezone

Some Planet admins would like their feed to be displayed in the local

@@ -61,8 +61,13 @@ material information.</dd>
can be found</dd>
<dt><ins>bill_of_materials</ins></dt>
<dd>Space-separated list of files to be copied as is directly from the <code>template_directories</code> to the <code>output_dir</code></dd>
<dt>filter</dt>
<dd>Regular expression that must be found in the textual portion of the entry</dd>
<dt>exclude</dt>
<dd>Regular expression that must <b>not</b> be found in the textual portion of the entry</dd>
<dt><ins>filters</ins></dt>
<dd>Space-separated list of filters to apply to each entry</dd>
<dd>Space-separated list of <a href="filters.html">filters</a> to apply to
each entry</dd>

</dl>
<dl class="compact code">
@@ -96,8 +101,8 @@ use for logging output. Note: this configuration value is processed
<a href="http://docs.python.org/lib/ConfigParser-objects.html">raw</a></dd>
<dt>feed_timeout</dt>
<dd>Number of seconds to wait for any given feed</dd>
<dt><del>new_feed_items</del></dt>
<dd>Number of items to take from new feeds</dd>
<dt>new_feed_items</dt>
<dd>Maximum number of items to include in the output from any one feed</dd>
<dt><ins>spider_threads</ins></dt>
<dd>The number of threads to use when spidering. When set to 0, the default,
no threads are used and spidering follows the traditional algorithm.</dd>
@@ -106,6 +111,10 @@ no threads are used and spidering follows the traditional algorithm.</dd>
directory to be used for an additional HTTP cache to front end the Venus
cache. If specified as a relative path, it is evaluated relative to the
<code>cache_directory</code>.</dd>
<dt><ins>cache_keep_entries</ins></dt>
<dd>Used by <code>expunge</code> to determine how many entries should be
kept for each source when expunging old entries from the cache directory.
This may be overridden on a per-subscription feed basis.</dd>
</dl>
<p>Additional options can be found in
<a href="normalization.html#overrides">normalization level overrides</a>.</p>

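The interplay between planet-wide defaults and per-feed overrides described above can be sketched with Python 3's configparser (the option names come from this document; the feed URL and the values are invented for illustration):

```python
# A sample Venus-style config.ini exercising the options documented
# above. Per-feed sections may shadow [Planet] defaults such as
# cache_keep_entries.
from configparser import RawConfigParser

SAMPLE = """
[Planet]
output_dir = output
spider_threads = 2
new_feed_items = 4
cache_keep_entries = 10
filters = regexp_sifter.py?require=python

[http://example.com/feed.xml]
name = Example feed
cache_keep_entries = 50
"""

config = RawConfigParser()
config.read_string(SAMPLE)

# The per-feed value wins over the [Planet] default for this feed.
keep = config.getint('http://example.com/feed.xml', 'cache_keep_entries')
print(keep)  # 50
```

Venus itself targets Python 2 and reads these values in `raw` mode, as noted above; the sketch only illustrates the override behaviour.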
docs/contributing.html (new file, 67 lines)
@@ -0,0 +1,67 @@
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"
"http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript" src="docs.js"></script>
<link rel="stylesheet" type="text/css" href="docs.css"/>
<title>Contributing</title>
</head>
<body>
<h2>Contributing</h2>
<p>If you make changes to Venus, you have no obligation to share them.
And unlike systems based on <code>CVS</code> or <code>subversion</code>,
there is no notion of “committers” — everybody is
a peer.</p>
<p>If you do choose to share your changes, the steps outlined below may
increase the chances of your code being picked up.</p>

<h3>Documentation and Tests</h3>
<p>For best results, include both documentation and tests in your
contribution.</p>
<p>Documentation can be found in the <code>docs</code> directory. It is
straight XHTML.</p>
<p>Test cases can be found in the
<a href="http://localhost/~rubys/venus/tests/">tests</a> directory, and
make use of the
<a href="http://docs.python.org/lib/module-unittest.html">Python unit testing framework</a>. To run them, simply enter:</p>
<blockquote><pre>python runtests.py</pre></blockquote>

<h3>Bzr</h3>
<p>If you have done a <a href="index.html">bzr get</a>, you have already set up
a repository. The only additional step you might need is to introduce
yourself to <a href="http://bazaar-vcs.org/">bzr</a>. Type in the following,
after replacing the <b>bold text</b> with your information:</p>

<blockquote><pre>bzr whoami '<b>Your Name</b> <<b>youremail</b>@<b>example.com</b>>'</pre></blockquote>

<p>Then, simply make the changes you like. When you are done, type:</p>

<blockquote><pre>bzr st</pre></blockquote>

<p>This will tell you which files you have modified, and which ones you may
have added. If you add files and you want them to be included, simply do a:</p>

<blockquote><pre>bzr add file1 file2...</pre></blockquote>

<p>You can also do a <code>bzr diff</code> to see if there are any changes
which you made that you don't want included. I can't tell you how many
debug print statements I have caught this way.</p>

<p>Next, type:</p>

<blockquote><pre>bzr commit</pre></blockquote>

<p>This will allow you to enter a comment describing your change. If your
repository is already on your web server, simply let others know where they
can find it. If not, you can simply ftp or scp the files to your web server
— no additional software needs to be installed on that machine.</p>

<h3>Telling others</h3>
<p>Once you have a change worth sharing, post a message on the
<a href="http://lists.planetplanet.org/mailman/listinfo/devel">mailing list</a>.</p>
<p>Also, consider setting up a <a href="http://bzr.mfd-consult.dk/bzr-feed/">bzr-feed</a> for your repository, so people who wish to do so can automatically
be notified of every change.</p>
<p>There is now even a nascent <a href="http://planet.intertwingly.net/venus/">planet</a> which combines these feeds of changes. You can <a href="http://planet.intertwingly.net/venus/atom.xml">subscribe</a> to it too.</p>
</body>
</html>

@@ -13,7 +13,7 @@
parameters come from the config file, and output goes to <code>stdout</code>.
Anything written to <code>stderr</code> is logged as an ERROR message. If no
<code>stdout</code> is produced, the entry is not written to the cache or
processed further.</p>
processed further; in fact, if the entry had previously been written to the cache, it will be removed.</p>

<p>Input to a filter is an aggressively
<a href="normalization.html">normalized</a> entry. For
@@ -46,9 +46,26 @@ expressions. Again, parameters can be passed as
<a href="../tests/data/filter/xpath-sifter2.ini">URI style</a>.
</p>

<p>The <a href="../filters/regexp_sifter.py">regexp sifter</a> operates just
like the xpath sifter, except it uses
<a href="http://docs.python.org/lib/re-syntax.html">regular expressions</a>
instead of XPath expressions.</p>

<h3>Notes</h3>

<ul>
<li>Filters are executed when a feed is fetched, and the results are placed
into the cache. Changing a configuration file alone is not sufficient to
change the contents of the cache — typically that only occurs after
a feed is modified.</li>

<li>Filters are simply invoked in the order they are listed in the
configuration file (think unix pipes). Planet-wide filters are executed before
feed-specific filters.</li>

<li>Any filters listed in the <code>[planet]</code> section of your config.ini
will be invoked on all feeds. Filters listed in individual
<code>[feed]</code> sections will only be invoked on those feeds.</li>

<li>The file extension of the filter is significant. <code>.py</code> invokes
python. <code>.xslt</code> invokes XSLT. <code>.sed</code> and
@@ -56,14 +73,6 @@ python. <code>.xslt</code> invokes XSLT. <code>.sed</code> and
perl or ruby or class/jar (java), aren't supported at the moment, but these
would be easy to add.</li>

<li>Any filters listed in the <code>[planet]</code> section of your config.ini
will be invoked on all feeds. Filters listed in individual
<code>[feed]</code> sections will only be invoked on those feeds.</li>

<li>Filters are simply invoked in the order they are listed in the
configuration file (think unix pipes). Planet-wide filters are executed before
feed-specific filters.</li>

<li>Templates written using htmltmpl currently only have access to a fixed set
of fields, whereas XSLT templates have access to everything.</li>
</ul>

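A filter, as described above, is just a program that reads the normalized entry on stdin and writes the result to stdout, with no stdout meaning the entry is dropped. A minimal Python sketch (the entry text and the draft marker are invented for illustration; this is not one of Venus' shipped filters):

```python
import sys

def sift(entry):
    """Illustrative transformation only: strip a hypothetical draft marker."""
    return entry.replace('<!-- draft -->', '')

# In a real filter the entry arrives on stdin; producing no stdout
# drops the entry from the cache, per the documentation above.
entry = '<entry><title>Hi<!-- draft --></title></entry>'
out = sift(entry)
if out.strip():
    sys.stdout.write(out)  # writes '<entry><title>Hi</title></entry>'
```

Because filters compose like unix pipes, several such programs can be chained by listing them in order in the config file.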
@@ -27,6 +27,7 @@
<li>Other
<ul>
<li><a href="migration.html">Migration from Planet 2.0</a></li>
<li><a href="contributing.html">Contributing</a></li>
</ul>
</li>
<li>Reference
@@ -38,6 +39,7 @@
<li><a href="http://bitworking.org/projects/httplib2/">httplib2</a></li>
<li><a href="http://www.w3.org/TR/xslt">XSLT</a></li>
<li><a href="http://www.gnu.org/software/sed/manual/html_mono/sed.html">sed</a></li>
<li><a href="http://www.djangoproject.com/documentation/templates/">Django templates</a></li>
</ul>
</li>
<li>Credits and License

@@ -107,6 +107,15 @@ not yet ported to the newer python so Venus will be less featureful.

<blockquote><pre>sudo apt-get install bzr python2.4-librdf</pre></blockquote>

<h3 id="windows">Windows instructions</h3>

<p>
htmltmpl templates (and Django too, since it currently piggybacks on
the htmltmpl implementation) on Windows require
the <a href="http://sourceforge.net/projects/pywin32/">pywin32</a>
module.
</p>

<h3 id="python22">Python 2.2 instructions</h3>

<p>If you are running Python 2.2, you may also need to install <a href="http://pyxml.sourceforge.net/">pyxml</a>. If the

@@ -101,6 +101,48 @@ The data values within the <code>Items</code> array are as follows:</p>
<code>new_</code> are only set if their values differ from the previous
Item.</p>

<h3>django</h3>

<p>
If you have the <a href="http://www.djangoproject.com/">Django</a>
framework installed,
<a href="http://www.djangoproject.com/documentation/templates/"
>Django templates</a> are automatically available to Venus
projects. You will have to save them with a <code>.html.dj</code>
extension in your themes. The variable set is the same as the one
from htmltmpl, above. In the Django template context you'll have
access to <code>Channels</code> and <code>Items</code> and you'll be
able to iterate through them.
</p>

<p>
You also have access to the <code>Config</code> dictionary, which contains
the Venus configuration variables from your <code>.ini</code> file.
</p>

<p>
If you lose your way and want to introspect all the variables in the
context, there's the useful <code>{% debug %}</code> template tag.
</p>

<p>
In <code>themes/django/</code> you'll find a sample Venus theme
that uses Django templates and might be a starting point for
your own custom themes.
</p>

<p>
All the standard Django template tags and filters are supposed to
work, with the notable exception of the <code>date</code> filter on
the updated and published dates of an item (it works on the main
<code>{{ date }}</code> variable).
</p>

<p>
Please note that Django, and therefore Venus' Django support,
requires at least Python 2.3.
</p>

<h3>xslt</h3>
<p><a href="http://www.w3.org/TR/xslt">XSLT</a> is a paradox: it actually
makes some simple things easier to do than htmltmpl, and certainly can

expunge.py (new file, 17 lines)
@@ -0,0 +1,17 @@
#!/usr/bin/env python
"""
Main program to run just the expunge portion of planet
"""

import os.path
import sys
from planet import expunge, config

if __name__ == '__main__':

    if len(sys.argv) == 2 and os.path.isfile(sys.argv[1]):
        config.load(sys.argv[1])
        expunge.expungeCache()
    else:
        print "Usage:"
        print "  python %s config.ini" % sys.argv[0]

filters/detitle.xslt (new file, 25 lines)
@@ -0,0 +1,25 @@
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns="http://www.w3.org/1999/xhtml">

  <!-- only retain titles that don't duplicate summary or content -->
  <xsl:template match="atom:title">
    <xsl:if test="string-length(.) &lt; 30 or
                  ( substring(.,1,string-length(.)-3) !=
                    substring(../atom:content,1,string-length(.)-3) and
                    substring(.,1,string-length(.)-3) !=
                    substring(../atom:summary,1,string-length(.)-3) )">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

  <!-- pass through everything else -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

filters/h1title.xslt (new file, 30 lines)
@@ -0,0 +1,30 @@
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">

  <!-- Replace title with value of h1, if present -->
  <xsl:template match="atom:title">
    <xsl:apply-templates select="@*"/>
    <xsl:copy>
      <xsl:choose>
        <xsl:when test="count(//xhtml:h1) = 1">
          <xsl:value-of select="normalize-space(//xhtml:h1)"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="node()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:copy>
  </xsl:template>

  <!-- Remove all h1s -->
  <xsl:template match="xhtml:h1"/>

  <!-- pass through everything else -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

filters/regexp_sifter.py (new file, 44 lines)
@@ -0,0 +1,44 @@
import sys, re

# parse options
options = dict(zip(sys.argv[1::2],sys.argv[2::2]))

# read entry
doc = data = sys.stdin.read()

# Apply a sequence of patterns which turn a normalized Atom entry into
# a stream of text, after removal of non-human metadata.
for pattern,replacement in [
    (re.compile('<id>.*?</id>'),' '),
    (re.compile('<url>.*?</url>'),' '),
    (re.compile('<source>.*?</source>'),' '),
    (re.compile('<updated.*?</updated>'),' '),
    (re.compile('<published.*?</published>'),' '),
    (re.compile('<link .*?>'),' '),
    (re.compile('''<[^>]* alt=['"]([^'"]*)['"].*?>'''),r' \1 '),
    (re.compile('''<[^>]* title=['"]([^'"]*)['"].*?>'''),r' \1 '),
    (re.compile('''<[^>]* label=['"]([^'"]*)['"].*?>'''),r' \1 '),
    (re.compile('''<[^>]* term=['"]([^'"]*)['"].*?>'''),r' \1 '),
    (re.compile('<.*?>'),' '),
    (re.compile('\s+'),' '),
    (re.compile('&gt;'),'>'),
    (re.compile('&lt;'),'<'),
    (re.compile('&apos;'),"'"),
    (re.compile('&quot;'),'"'),
    (re.compile('&amp;'),'&'),
    (re.compile('\s+'),' ')
    ]:
    data=pattern.sub(replacement,data)

# process requirements
if options.has_key('--require'):
    for regexp in options['--require'].split('\n'):
        if regexp and not re.search(regexp,data): sys.exit(1)

# process exclusions
if options.has_key('--exclude'):
    for regexp in options['--exclude'].split('\n'):
        if regexp and re.search(regexp,data): sys.exit(1)

# if we get this far, the entry is to be included
print doc

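The `dict(zip(sys.argv[1::2], sys.argv[2::2]))` line above pairs each option flag with the value that follows it on the command line. The idiom in isolation (argument values invented):

```python
# Pair alternating command-line tokens into an option dict, as the
# regexp sifter above does with sys.argv.
argv = ['regexp_sifter.py', '--require', 'python', '--exclude', 'draft']

# argv[1::2] yields the flags, argv[2::2] the values that follow them.
options = dict(zip(argv[1::2], argv[2::2]))
print(options)  # {'--require': 'python', '--exclude': 'draft'}
```

This works because the filter invocation always supplies flags and values in strict alternation; a flag without a value would silently be dropped by `zip`.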
@@ -21,6 +21,7 @@ if __name__ == "__main__":
    offline = 0
    verbose = 0
    only_if_new = 0
    expunge = 0

    for arg in sys.argv[1:]:
        if arg == "-h" or arg == "--help":
@@ -31,6 +32,7 @@ if __name__ == "__main__":
            print " -o, --offline      Update the Planet from the cache only"
            print " -h, --help         Display this help message and exit"
            print " -n, --only-if-new  Only spider new feeds"
            print " -x, --expunge      Expunge old entries from cache"
            print
            sys.exit(0)
        elif arg == "-v" or arg == "--verbose":
@@ -39,6 +41,8 @@ if __name__ == "__main__":
            offline = 1
        elif arg == "-n" or arg == "--only-if-new":
            only_if_new = 1
        elif arg == "-x" or arg == "--expunge":
            expunge = 1
        elif arg.startswith("-"):
            print >>sys.stderr, "Unknown option:", arg
            sys.exit(1)
@@ -62,3 +66,7 @@ if __name__ == "__main__":
    from planet import splice
    doc = splice.splice()
    splice.apply(doc.toxml('utf-8'))

    if expunge:
        from planet import expunge
        expunge.expungeCache()

@@ -26,7 +26,7 @@ Todo:
  * error handling (example: no planet section)
"""

import os, sys, re
import os, sys, re, urllib
from ConfigParser import ConfigParser
from urlparse import urljoin

@@ -106,7 +106,9 @@ def __init__():
    define_planet('output_dir', 'output')
    define_planet('spider_threads', 0)

    define_planet_int('new_feed_items', 0)
    define_planet_int('feed_timeout', 20)
    define_planet_int('cache_keep_entries', 10)

    define_planet_list('template_files')
    define_planet_list('bill_of_materials')
@@ -126,6 +128,8 @@ def __init__():
    define_tmpl('content_type', '')
    define_tmpl('future_dates', 'keep')
    define_tmpl('xml_base', '')
    define_tmpl('filter', None)
    define_tmpl('exclude', None)

def load(config_file):
    """ initialize and load a configuration"""
@@ -330,7 +334,7 @@ def feedtype():

def subscriptions():
    """ list the feed subscriptions """
    return filter(lambda feed: feed!='Planet' and
    return __builtins__['filter'](lambda feed: feed!='Planet' and
        feed not in template_files()+filters()+reading_lists(),
        parser.sections())

@@ -350,6 +354,12 @@ def filters(section=None):
        filters += parser.get('Planet', 'filters').split()
    if section and parser.has_option(section, 'filters'):
        filters += parser.get(section, 'filters').split()
    if filter(section):
        filters.append('regexp_sifter.py?require=' +
            urllib.quote(filter(section)))
    if exclude(section):
        filters.append('regexp_sifter.py?exclude=' +
            urllib.quote(exclude(section)))
    return filters

def planet_options():

planet/expunge.py (new file, 68 lines)
@@ -0,0 +1,68 @@
""" Expunge old entries from a cache of entries """
import glob, os, planet, config, feedparser
from xml.dom import minidom
from spider import filename

def expungeCache():
    """ Expunge old entries from a cache of entries """
    import planet
    log = planet.getLogger(config.log_level(),config.log_format())

    log.info("Determining feed subscriptions")
    entry_count = {}
    sources = config.cache_sources_directory()
    for sub in config.subscriptions():
        data=feedparser.parse(filename(sources,sub))
        if not data.feed.has_key('id'): continue
        if config.feed_options(sub).has_key('cache_keep_entries'):
            entry_count[data.feed.id] = int(config.feed_options(sub)['cache_keep_entries'])
        else:
            entry_count[data.feed.id] = config.cache_keep_entries()

    log.info("Listing cached entries")
    cache = config.cache_directory()
    dir=[(os.stat(file).st_mtime,file) for file in glob.glob(cache+"/*")
        if not os.path.isdir(file)]
    dir.sort()
    dir.reverse()

    for mtime,file in dir:

        try:
            entry=minidom.parse(file)
            # determine source of entry
            entry.normalize()
            sources = entry.getElementsByTagName('source')
            if not sources:
                # no source determined, do not delete
                log.debug("No source found for %s", file)
                continue
            ids = sources[0].getElementsByTagName('id')
            if not ids:
                # feed id not found, do not delete
                log.debug("No source feed id found for %s", file)
                continue
            if ids[0].childNodes[0].nodeValue in entry_count:
                # subscribed to feed, update entry count
                entry_count[ids[0].childNodes[0].nodeValue] = entry_count[
                    ids[0].childNodes[0].nodeValue] - 1
                if entry_count[ids[0].childNodes[0].nodeValue] >= 0:
                    # maximum not reached, do not delete
                    log.debug("Maximum not reached for %s from %s",
                        file, ids[0].childNodes[0].nodeValue)
                    continue
                else:
                    # maximum reached
                    log.debug("Removing %s, maximum reached for %s",
                        file, ids[0].childNodes[0].nodeValue)
            else:
                # not subscribed
                log.debug("Removing %s, not subscribed to %s",
                    file, ids[0].childNodes[0].nodeValue)
            # remove old entry
            os.unlink(file)

        except:
            log.error("Error parsing %s", file)

# end of expungeCache()

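The heart of expungeCache is a per-feed countdown while walking cache entries newest-first. A compact Python sketch of that decision logic (feed ids and budgets invented; the real code also handles entries whose source cannot be determined):

```python
# Each subscribed feed id starts with its cache_keep_entries budget;
# walking entries newest-first, an entry is kept while its feed's
# budget is positive, and entries from unsubscribed feeds are expunged.
entry_count = {'feed:a': 2}

def keep(feed_id):
    """Return True if the entry should stay in the cache."""
    if feed_id not in entry_count:
        return False               # not subscribed: always expunge
    entry_count[feed_id] -= 1
    return entry_count[feed_id] >= 0

decisions = [keep('feed:a'), keep('feed:a'), keep('feed:a'), keep('feed:b')]
print(decisions)  # [True, True, False, False]
```

Sorting newest-first before counting is what guarantees the retained entries are the most recent ones.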
@@ -9,26 +9,7 @@ Example usage:

import html5lib
f = open("my_document.html")
p = html5lib.HTMLParser()
tree = p.parse(f)

By default the returned treeformat is a custom "simpletree", similar
to a DOM tree; each element has attributes childNodes and parent
holding the children and parent respectively, a name attribute
holding the Element name, a data attribute holding the element data
(for text and comment nodes) and an attributes dictionary holding the
element's attributes (for Element nodes).

To get output in ElementTree format:

import html5lib
from html5lib.treebuilders import etree
p = html5lib.HTMLParser(tree=etree.TreeBuilder)
elementtree = p.parse(f)

Note: Because HTML documents support various features not in the
default ElementTree (e.g. doctypes), we supply our own simple
serializer: html5lib.treebuilders.etree.tostring. At present this does not
have the encoding support offered by the elementtree serializer.

tree = p.parse(f)
"""
from html5parser import HTMLParser
from liberalxmlparser import XMLParser, XHTMLParser

@@ -112,7 +112,8 @@ spaceCharacters = frozenset((
    u"\n",
    u"\u000B",
    u"\u000C",
    u" "
    u" ",
    u"\r"
))

tableInsertModeElements = frozenset((
@@ -124,6 +125,7 @@ tableInsertModeElements = frozenset((
))

asciiLowercase = frozenset(string.ascii_lowercase)
asciiUppercase = frozenset(string.ascii_uppercase)
asciiLetters = frozenset(string.ascii_letters)
digits = frozenset(string.digits)
hexDigits = frozenset(string.hexdigits)
@@ -454,3 +456,222 @@ entities = {
    "zwj": u"\u200D",
    "zwnj": u"\u200C"
}

encodings = frozenset((
    "ansi_x3.4-1968",
    "iso-ir-6",
    "ansi_x3.4-1986",
    "iso_646.irv:1991",
    "ascii",
    "iso646-us",
    "us-ascii",
    "us",
    "ibm367",
    "cp367",
    "csascii",
    "ks_c_5601-1987",
    "korean",
    "iso-2022-kr",
    "csiso2022kr",
    "euc-kr",
    "iso-2022-jp",
    "csiso2022jp",
    "iso-2022-jp-2",
    "iso-ir-58",
    "chinese",
    "csiso58gb231280",
    "iso_8859-1:1987",
    "iso-ir-100",
    "iso_8859-1",
    "iso-8859-1",
    "latin1",
    "l1",
    "ibm819",
    "cp819",
    "csisolatin1",
    "iso_8859-2:1987",
    "iso-ir-101",
    "iso_8859-2",
    "iso-8859-2",
    "latin2",
    "l2",
    "csisolatin2",
    "iso_8859-3:1988",
    "iso-ir-109",
    "iso_8859-3",
    "iso-8859-3",
    "latin3",
    "l3",
    "csisolatin3",
    "iso_8859-4:1988",
    "iso-ir-110",
    "iso_8859-4",
    "iso-8859-4",
    "latin4",
    "l4",
    "csisolatin4",
    "iso_8859-6:1987",
    "iso-ir-127",
    "iso_8859-6",
    "iso-8859-6",
    "ecma-114",
    "asmo-708",
    "arabic",
    "csisolatinarabic",
    "iso_8859-7:1987",
    "iso-ir-126",
    "iso_8859-7",
    "iso-8859-7",
    "elot_928",
    "ecma-118",
    "greek",
    "greek8",
    "csisolatingreek",
    "iso_8859-8:1988",
    "iso-ir-138",
    "iso_8859-8",
    "iso-8859-8",
    "hebrew",
    "csisolatinhebrew",
    "iso_8859-5:1988",
    "iso-ir-144",
    "iso_8859-5",
    "iso-8859-5",
    "cyrillic",
    "csisolatincyrillic",
    "iso_8859-9:1989",
    "iso-ir-148",
    "iso_8859-9",
    "iso-8859-9",
    "latin5",
    "l5",
    "csisolatin5",
    "iso-8859-10",
    "iso-ir-157",
    "l6",
    "iso_8859-10:1992",
    "csisolatin6",
    "latin6",
    "hp-roman8",
    "roman8",
    "r8",
    "ibm037",
    "cp037",
    "csibm037",
    "ibm424",
    "cp424",
    "csibm424",
    "ibm437",
    "cp437",
    "437",
    "cspc8codepage437",
    "ibm500",
    "cp500",
    "csibm500",
    "ibm775",
    "cp775",
    "cspc775baltic",
    "ibm850",
    "cp850",
    "850",
    "cspc850multilingual",
    "ibm852",
    "cp852",
    "852",
    "cspcp852",
    "ibm855",
    "cp855",
    "855",
    "csibm855",
    "ibm857",
    "cp857",
    "857",
    "csibm857",
    "ibm860",
    "cp860",
    "860",
    "csibm860",
    "ibm861",
    "cp861",
    "861",
    "cp-is",
    "csibm861",
    "ibm862",
    "cp862",
    "862",
    "cspc862latinhebrew",
    "ibm863",
    "cp863",
    "863",
    "csibm863",
    "ibm864",
    "cp864",
    "csibm864",
    "ibm865",
    "cp865",
    "865",
    "csibm865",
    "ibm866",
    "cp866",
    "866",
    "csibm866",
    "ibm869",
    "cp869",
    "869",
    "cp-gr",
    "csibm869",
    "ibm1026",
    "cp1026",
    "csibm1026",
    "koi8-r",
    "cskoi8r",
    "koi8-u",
    "big5-hkscs",
    "ptcp154",
    "csptcp154",
    "pt154",
    "cp154",
    "utf-7",
    "utf-16be",
    "utf-16le",
    "utf-16",
    "utf-8",
    "iso-8859-13",
    "iso-8859-14",
    "iso-ir-199",
    "iso_8859-14:1998",
    "iso_8859-14",
    "latin8",
    "iso-celtic",
    "l8",
    "iso-8859-15",
    "iso_8859-15",
    "iso-8859-16",
    "iso-ir-226",
    "iso_8859-16:2001",
    "iso_8859-16",
    "latin10",
    "l10",
    "gbk",
    "cp936",
    "ms936",
    "gb18030",
    "shift_jis",
    "ms_kanji",
    "csshiftjis",
    "euc-jp",
    "gb2312",
    "big5",
    "csbig5",
    "windows-1250",
    "windows-1251",
    "windows-1252",
    "windows-1253",
    "windows-1254",
    "windows-1255",
    "windows-1256",
    "windows-1257",
    "windows-1258",
    "tis-620",
    "hz-gb-2312",
))

@@ -840,7 +840,8 @@ class InBodyPhase(Phase):
        self.tree.insertElement(name, attributes)

    def endTagP(self, name):
        self.tree.generateImpliedEndTags("p")
        if self.tree.elementInScope("p"):
            self.tree.generateImpliedEndTags("p")
        if self.tree.openElements[-1].name != "p":
            self.parser.parseError("Unexpected end tag (p).")
        while self.tree.elementInScope("p"):
@@ -1150,7 +1151,8 @@ class InTablePhase(Phase):
            self.parser.phase.processStartTag(name, attributes)

    def startTagTable(self, name, attributes):
        self.parser.parseError()
        self.parser.parseError(_(u"Unexpected start tag (table) in table "
            u"phase. Implies end tag (table)."))
        self.parser.phase.processEndTag("table")
        if not self.parser.innerHTML:
            self.parser.phase.processStartTag(name, attributes)
@@ -1168,14 +1170,16 @@ class InTablePhase(Phase):
        if self.tree.elementInScope("table", True):
            self.tree.generateImpliedEndTags()
            if self.tree.openElements[-1].name != "table":
                self.parser.parseError()
                self.parser.parseError(_(u"Unexpected end tag (table). "
                    u"Expected end tag (" + self.tree.openElements[-1].name +\
                    u")."))
            while self.tree.openElements[-1].name != "table":
                self.tree.openElements.pop()
            self.tree.openElements.pop()
            self.parser.resetInsertionMode()
        else:
            self.parser.parseError()
            # innerHTML case
            self.parser.parseError()

    def endTagIgnore(self, name):
        self.parser.parseError(_("Unexpected end tag (" + name +\
@@ -1787,7 +1791,7 @@ class TrailingEndPhase(Phase):
        pass

    def processComment(self, data):
        self.parser.insertCommenr(data, self.tree.document)
        self.tree.insertComment(data, self.tree.document)

    def processSpaceCharacters(self, data):
        self.parser.lastPhase.processSpaceCharacters(data)

@@ -1,7 +1,10 @@
import codecs
import re
import types

from constants import EOF
from constants import EOF, spaceCharacters, asciiLetters, asciiUppercase
from constants import encodings
from utils import MethodDispatcher

class HTMLInputStream(object):
    """Provides a unicode stream of characters to the HTMLTokenizer.
@@ -11,7 +14,7 @@ class HTMLInputStream(object):

    """

    def __init__(self, source, encoding=None):
    def __init__(self, source, encoding=None, chardet=True):
        """Initialises the HTMLInputStream.

        HTMLInputStream(source, [encoding]) -> Normalized stream from source
@@ -28,33 +31,30 @@ class HTMLInputStream(object):
        # List of where new lines occur
        self.newLines = []

        # Encoding Information
        self.charEncoding = encoding

        # Raw Stream
        self.rawStream = self.openStream(source)

        # Try to detect the encoding of the stream by looking for a BOM
        detectedEncoding = self.detectEncoding()

        # If an encoding was specified or detected from the BOM don't allow
        # the encoding to be changed futher into the stream
        if self.charEncoding or detectedEncoding:
            self.allowEncodingOverride = False
        else:
            self.allowEncodingOverride = True

        # If an encoding wasn't specified, use the encoding detected from the
        # BOM, if present, otherwise use the default encoding
        if not self.charEncoding:
            self.charEncoding = detectedEncoding or "cp1252"
        # Encoding Information
        #Number of bytes to use when looking for a meta element with
        #encoding information
        self.numBytesMeta = 512
        #Encoding to use if no other information can be found
        self.defaultEncoding = "windows-1252"

        #Autodetect encoding if no other information can be found?
        self.chardet = chardet

        #Detect encoding iff no explicit "transport level" encoding is supplied
        if encoding is None or not isValidEncoding(encoding):
            encoding = self.detectEncoding()
        self.charEncoding = encoding

        # Read bytes from stream decoding them into Unicode
        uString = self.rawStream.read().decode(self.charEncoding, 'replace')

        # Normalize new lines and null characters
        uString = re.sub('\r\n?', '\n', uString)
        uString = re.sub('\x00', '\xFFFD', uString)
        uString = re.sub('\x00', u'\uFFFD', uString)

        # Convert the unicode string into a list to be used as the data stream
        self.dataStream = uString
@@ -80,9 +80,39 @@ class HTMLInputStream(object):
        return stream

    def detectEncoding(self):
        # Attempts to detect the character encoding of the stream. If
        # an encoding can be determined from the BOM return the name of the
        # encoding otherwise return None

        #First look for a BOM
        #This will also read past the BOM if present
        encoding = self.detectBOM()
        #If there is no BOM need to look for meta elements with encoding
        #information
        if encoding is None:
            encoding = self.detectEncodingMeta()
        #Guess with chardet, if avaliable
        if encoding is None and self.chardet:
            try:
                import chardet
                buffer = self.rawStream.read()
                encoding = chardet.detect(buffer)['encoding']
                self.rawStream = self.openStream(buffer)
            except ImportError:
                pass
        # If all else fails use the default encoding
        if encoding is None:
            encoding = self.defaultEncoding

        #Substitute for equivalent encodings:
        encodingSub = {"iso-8859-1":"windows-1252"}

        if encoding.lower() in encodingSub:
            encoding = encodingSub[encoding.lower()]

        return encoding

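The cascade in `detectEncoding` above (BOM, then meta prescan, then optional chardet, then the `windows-1252` default) can be sketched standalone. This is a simplified illustration under stated assumptions, not the html5lib code; `sniff_encoding` and its fallback handling are hypothetical, and the meta-prescan step is omitted.

```python
import codecs

def sniff_encoding(data, default="windows-1252"):
    """Guess the encoding of raw bytes: BOM first, chardet if present,
    then a default, mirroring the detection order in the diff above."""
    # Step 1: a BOM wins outright
    for bom, name in ((codecs.BOM_UTF8, "utf-8"),
                      (codecs.BOM_UTF16_LE, "utf-16-le"),
                      (codecs.BOM_UTF16_BE, "utf-16-be")):
        if data.startswith(bom):
            return name
    # Step 2: chardet is optional; fall through quietly when absent
    try:
        import chardet
        guess = chardet.detect(data)["encoding"]
        if guess:
            return guess
    except ImportError:
        pass
    # Step 3: default, with the same iso-8859-1 alias substitution
    return {"iso-8859-1": "windows-1252"}.get(default.lower(), default)
```

As in the diff, the chardet import lives inside the function so the dependency stays optional.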
    def detectBOM(self):
        """Attempts to detect at BOM at the start of the stream. If
        an encoding can be determined from the BOM return the name of the
        encoding otherwise return None"""
        bomDict = {
            codecs.BOM_UTF8: 'utf-8',
            codecs.BOM_UTF16_LE: 'utf-16-le', codecs.BOM_UTF16_BE: 'utf-16-be',
@@ -103,24 +133,19 @@ class HTMLInputStream(object):
            encoding = bomDict.get(string)  # UTF-32
            seek = 4

        #AT - move this to the caller?
        # Set the read position past the BOM if one was found, otherwise
        # set it to the start of the stream
        self.rawStream.seek(encoding and seek or 0)

        return encoding

    def declareEncoding(self, encoding):
    def detectEncodingMeta(self):
        """Report the encoding declared by the meta element

        If the encoding is currently only guessed, then this
        will read subsequent characters in that encoding.

        If the encoding is not compatible with the guessed encoding
        and non-US-ASCII characters have been seen, return True indicating
        parsing will have to begin again.

        """
        pass
        parser = EncodingParser(self.rawStream.read(self.numBytesMeta))
        self.rawStream.seek(0)
        return parser.getEncoding()

    def determineNewLines(self):
        # Looks through the stream to find where new lines occur so
@@ -188,15 +213,277 @@ class HTMLInputStream(object):
            self.queue.insert(0, charStack.pop())
        return "".join(charStack)

if __name__ == "__main__":
    stream = HTMLInputStream("../tests/utf-8-bom.html")

    c = stream.char()
    while c:
        line, col = stream.position()
        if c == u"\n":
            print "Line %s, Column %s: Line Feed" % (line, col)
        else:
            print "Line %s, Column %s: %s" % (line, col, c.encode('utf-8'))
        c = stream.char()
    print "EOF"

class EncodingBytes(str):
    """String-like object with an assosiated position and various extra methods
    If the position is ever greater than the string length then an exception is
    raised"""
    def __init__(self, value):
        str.__init__(self, value)
        self._position = -1

    def __iter__(self):
        return self

    def next(self):
        self._position += 1
        rv = self[self.position]
        return rv

    def setPosition(self, position):
        if self._position >= len(self):
            raise StopIteration
        self._position = position

    def getPosition(self):
        if self._position >= len(self):
            raise StopIteration
        if self._position >= 0:
            return self._position
        else:
            return None

    position = property(getPosition, setPosition)

    def getCurrentByte(self):
        return self[self.position]

    currentByte = property(getCurrentByte)

    def skip(self, chars=spaceCharacters):
        """Skip past a list of characters"""
        while self.currentByte in chars:
            self.position += 1

    def matchBytes(self, bytes, lower=False):
        """Look for a sequence of bytes at the start of a string. If the bytes
        are found return True and advance the position to the byte after the
        match. Otherwise return False and leave the position alone"""
        data = self[self.position:self.position+len(bytes)]
        if lower:
            data = data.lower()
        rv = data.startswith(bytes)
        if rv == True:
            self.position += len(bytes)
        return rv

    def jumpTo(self, bytes):
        """Look for the next sequence of bytes matching a given sequence. If
        a match is found advance the position to the last byte of the match"""
        newPosition = self[self.position:].find(bytes)
        if newPosition > -1:
            self._position += (newPosition + len(bytes)-1)
            return True
        else:
            raise StopIteration

    def findNext(self, byteList):
        """Move the pointer so it points to the next byte in a set of possible
        bytes"""
        while (self.currentByte not in byteList):
            self.position += 1

class EncodingParser(object):
    """Mini parser for detecting character encoding from meta elements"""

    def __init__(self, data):
        """string - the data to work on for encoding detection"""
        self.data = EncodingBytes(data)
        self.encoding = None

    def getEncoding(self):
        methodDispatch = (
            ("<!--", self.handleComment),
            ("<meta", self.handleMeta),
            ("</", self.handlePossibleEndTag),
            ("<!", self.handleOther),
            ("<?", self.handleOther),
            ("<", self.handlePossibleStartTag))
        for byte in self.data:
            keepParsing = True
            for key, method in methodDispatch:
                if self.data.matchBytes(key, lower=True):
                    try:
                        keepParsing = method()
                        break
                    except StopIteration:
                        keepParsing = False
                        break
            if not keepParsing:
                break
        if self.encoding is not None:
            self.encoding = self.encoding.strip()
        return self.encoding

    def handleComment(self):
        """Skip over comments"""
        return self.data.jumpTo("-->")

    def handleMeta(self):
        if self.data.currentByte not in spaceCharacters:
            #if we have <meta not followed by a space so just keep going
            return True
        #We have a valid meta element we want to search for attributes
        while True:
            #Try to find the next attribute after the current position
            attr = self.getAttribute()
            if attr is None:
                return True
            else:
                if attr[0] == "charset":
                    tentativeEncoding = attr[1]
                    if isValidEncoding(tentativeEncoding):
                        self.encoding = tentativeEncoding
                        return False
                elif attr[0] == "content":
                    contentParser = ContentAttrParser(EncodingBytes(attr[1]))
                    tentativeEncoding = contentParser.parse()
                    if isValidEncoding(tentativeEncoding):
                        self.encoding = tentativeEncoding
                        return False

    def handlePossibleStartTag(self):
        return self.handlePossibleTag(False)

    def handlePossibleEndTag(self):
        self.data.position += 1
        return self.handlePossibleTag(True)

    def handlePossibleTag(self, endTag):
        if self.data.currentByte not in asciiLetters:
            #If the next byte is not an ascii letter either ignore this
            #fragment (possible start tag case) or treat it according to
            #handleOther
            if endTag:
                self.data.position -= 1
                self.handleOther()
            return True

        self.data.findNext(list(spaceCharacters) + ["<", ">"])
        if self.data.currentByte == "<":
            #return to the first step in the overall "two step" algorithm
            #reprocessing the < byte
            self.data.position -= 1
        else:
            #Read all attributes
            attr = self.getAttribute()
            while attr is not None:
                attr = self.getAttribute()
        return True

    def handleOther(self):
        return self.data.jumpTo(">")

    def getAttribute(self):
        """Return a name,value pair for the next attribute in the stream,
        if one is found, or None"""
        self.data.skip(list(spaceCharacters)+["/"])
        if self.data.currentByte == "<":
            self.data.position -= 1
            return None
        elif self.data.currentByte == ">":
            return None
        attrName = []
        attrValue = []
        spaceFound = False
        #Step 5 attribute name
        while True:
            if self.data.currentByte == "=" and attrName:
                break
            elif self.data.currentByte in spaceCharacters:
                spaceFound = True
                break
            elif self.data.currentByte in ("/", "<", ">"):
                return "".join(attrName), ""
            elif self.data.currentByte in asciiUppercase:
                attrName.extend(self.data.currentByte.lower())
            else:
                attrName.extend(self.data.currentByte)
            #Step 6
            self.data.position += 1
        #Step 7
        if spaceFound:
            self.data.skip()
        #Step 8
        if self.data.currentByte != "=":
            self.data.position -= 1
            return "".join(attrName), ""
        #XXX need to advance position in both spaces and value case
        #Step 9
        self.data.position += 1
        #Step 10
        self.data.skip()
        #Step 11
        if self.data.currentByte in ("'", '"'):
            #11.1
            quoteChar = self.data.currentByte
            while True:
                self.data.position += 1
                #11.3
                if self.data.currentByte == quoteChar:
                    self.data.position += 1
                    return "".join(attrName), "".join(attrValue)
                #11.4
                elif self.data.currentByte in asciiUppercase:
                    attrValue.extend(self.data.currentByte.lower())
                #11.5
                else:
                    attrValue.extend(self.data.currentByte)
        elif self.data.currentByte in (">", '<'):
            return "".join(attrName), ""
        elif self.data.currentByte in asciiUppercase:
            attrValue.extend(self.data.currentByte.lower())
        else:
            attrValue.extend(self.data.currentByte)
        while True:
            self.data.position += 1
            if self.data.currentByte in (
                list(spaceCharacters) + [">", '<']):
                return "".join(attrName), "".join(attrValue)
            elif self.data.currentByte in asciiUppercase:
                attrValue.extend(self.data.currentByte.lower())
            else:
                attrValue.extend(self.data.currentByte)


class ContentAttrParser(object):
    def __init__(self, data):
        self.data = data
    def parse(self):
        try:
            #Skip to the first ";"
            self.data.jumpTo(";")
            self.data.position += 1
            self.data.skip()
            #Check if the attr name is charset
            #otherwise return
            self.data.jumpTo("charset")
            self.data.position += 1
            self.data.skip()
            if not self.data.currentByte == "=":
                #If there is no = sign keep looking for attrs
                return None
            self.data.position += 1
            self.data.skip()
            #Look for an encoding between matching quote marks
            if self.data.currentByte in ('"', "'"):
                quoteMark = self.data.currentByte
                self.data.position += 1
                oldPosition = self.data.position
                self.data.jumpTo(quoteMark)
                return self.data[oldPosition:self.data.position]
            else:
                #Unquoted value
                oldPosition = self.data.position
                try:
                    self.data.findNext(spaceCharacters)
                    return self.data[oldPosition:self.data.position]
                except StopIteration:
                    #Return the whole remaining value
                    return self.data[oldPosition:]
        except StopIteration:
            return None

def isValidEncoding(encoding):
    """Determine if a string is a supported encoding"""
    return (encoding is not None and type(encoding) == types.StringType and
            encoding.lower().strip() in encodings)

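`ContentAttrParser` above pulls the charset token out of a `content` attribute value such as `text/html; charset=utf-8`. The same extraction can be sketched with a regular expression; this is a simplified stand-in for illustration, not the byte-walking parser in the diff, and `charset_from_content` is a hypothetical name.

```python
import re

def charset_from_content(value):
    """Extract the charset from a content-type style attribute value.
    Handles charset=utf-8, charset="utf-8" and charset='utf-8'."""
    m = re.search(r"""charset\s*=\s*("([^"]*)"|'([^']*)'|([^\s;]+))""",
                  value, re.IGNORECASE)
    if not m:
        return None
    # group 2 = double-quoted, group 3 = single-quoted, group 4 = bare token
    return m.group(2) or m.group(3) or m.group(4)
```

Unlike the regex, the diff's parser also has to cope with truncated input, which is why it signals failure through `StopIteration` from `jumpTo`.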
@@ -111,10 +111,6 @@ class XmlElementPhase(html5parser.Phase):
    def endTagOther(self, name):
        for node in self.tree.openElements[::-1]:
            if node.name == name:
                self.tree.generateImpliedEndTags()
                if self.tree.openElements[-1].name != name:
                    self.parser.parseError(_("Unexpected end tag " + name +\
                        "."))
                while self.tree.openElements.pop() != node:
                    pass
                break

@@ -303,9 +303,8 @@ class TreeBuilder(object):
            if (name in frozenset(("dd", "dt", "li", "p", "td", "th", "tr"))
                and name != exclude):
                self.openElements.pop()
                # XXX Until someone has broven that the above breaks stuff I think
                # we should keep it in.
                # self.processEndTag(name)
                # XXX This is not entirely what the specification says. We should
                # investigate it more closely.
                self.generateImpliedEndTags(exclude)

    def getDocument(self):

@@ -1,7 +1,10 @@
try:
    from xml.etree import ElementTree
except ImportError:
    from elementtree import ElementTree
    try:
        from elementtree import ElementTree
    except:
        pass

import _base

@@ -158,14 +158,21 @@ def content(xentry, name, detail, bozo):
    for div in body.childNodes:
        if div.nodeType != Node.ELEMENT_NODE: continue
        if div.nodeName != 'div': continue
        div.normalize()
        if len(div.childNodes) == 1 and \
            div.firstChild.nodeType == Node.TEXT_NODE:
            data = div.firstChild
        else:
            data = div
        xcontent.setAttribute('type', 'xhtml')
        break
        try:
            div.normalize()
            if len(div.childNodes) == 1 and \
                div.firstChild.nodeType == Node.TEXT_NODE:
                data = div.firstChild
            else:
                data = div
            xcontent.setAttribute('type', 'xhtml')
            break
        except:
            # in extremely nested cases, the Python runtime decides
            # that normalize() must be in an infinite loop; mark
            # the content as escaped html and proceed on...
            xcontent.setAttribute('type', 'html')
            data = xdoc.createTextNode(detail.value.decode('utf-8'))

    if data: xcontent.appendChild(data)

@@ -99,7 +99,7 @@ def scrub(feed_uri, data):
    # resolve relative URIs and sanitize
    for entry in data.entries + [data.feed]:
        for key in entry.keys():
            if key == 'content':
            if key == 'content' and not entry.has_key('content_detail'):
                node = entry.content[0]
            elif key.endswith('_detail'):
                node = entry[key]

48
planet/shell/dj.py
Normal file
@@ -0,0 +1,48 @@
import os.path
import urlparse
import datetime

import tmpl
from planet import config

def DjangoPlanetDate(value):
    return datetime.datetime(*value[:6])

# remap PlanetDate to be a datetime, so Django template authors can use
# the "date" filter on these values
tmpl.PlanetDate = DjangoPlanetDate

def run(script, doc, output_file=None, options={}):
    """process a Django template file"""

    # this is needed to use the Django template system as standalone
    # I need to re-import the settings at every call because I have to
    # set the TEMPLATE_DIRS variable programmatically
    from django.conf import settings
    try:
        settings.configure(
            DEBUG=True, TEMPLATE_DEBUG=True,
            TEMPLATE_DIRS=(os.path.dirname(script),)
        )
    except EnvironmentError:
        pass
    from django.template import Context
    from django.template.loader import get_template

    # set up the Django context by using the default htmltmpl
    # datatype converters
    context = Context()
    context.update(tmpl.template_info(doc))
    context['Config'] = config.planet_options()
    t = get_template(script)

    if output_file:
        reluri = os.path.splitext(os.path.basename(output_file))[0]
        context['url'] = urlparse.urljoin(config.link(), reluri)
        f = open(output_file, 'w')
        f.write(t.render(context))
        f.close()
    else:
        # @@this is useful for testing purposes, but does it
        # belong here?
        return t.render(context)
@@ -194,7 +194,9 @@ def writeCache(feed_uri, feed_info, data):
        for filter in config.filters(feed_uri):
            output = shell.run(filter, output, mode="filter")
            if not output: break
        if not output: continue
        if not output:
            if os.path.exists(cache_file): os.remove(cache_file)
            continue

        # write out and timestamp the results
        write(output, cache_file)

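The `writeCache` hunk above changes the behaviour when a filter chain swallows an entry: instead of just skipping it, any previously cached copy is also removed. The control flow can be sketched in isolation; `apply_filters` and its plain-file handling are hypothetical stand-ins for the real `shell.run` pipeline.

```python
import os

def apply_filters(output, filters, cache_file):
    """Run each filter over the entry; an empty result at any stage
    kills the entry AND deletes its stale cache file."""
    for f in filters:
        output = f(output)
        if not output:
            break
    if not output:
        # drop the stale cache entry instead of leaving the old copy behind
        if os.path.exists(cache_file):
            os.remove(cache_file)
        return None
    with open(cache_file, "w") as fp:
        fp.write(output)
    return output
```

The point of the change is the deletion branch: without it, an entry newly rejected by a filter would keep serving its old cached version.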
@@ -67,6 +67,8 @@ def splice():

    # insert entry information
    items = 0
    count = {}
    new_feed_items = config.new_feed_items()
    for mtime,file in dir:
        if index != None:
            base = os.path.basename(file)
@@ -75,15 +77,23 @@ def splice():
        try:
            entry=minidom.parse(file)

            # verify that this entry is currently subscribed to
            # verify that this entry is currently subscribed to and that the
            # number of entries contributed by this feed does not exceed
            # config.new_feed_items
            entry.normalize()
            sources = entry.getElementsByTagName('source')
            if sources:
                ids = sources[0].getElementsByTagName('id')
                if ids and ids[0].childNodes[0].nodeValue not in sub_ids:
                    ids = sources[0].getElementsByTagName('planet:id')
                    if not ids: continue
                    if ids[0].childNodes[0].nodeValue not in sub_ids: continue
                if ids:
                    id = ids[0].childNodes[0].nodeValue
                    count[id] = count.get(id,0) + 1
                    if new_feed_items and count[id] > new_feed_items: continue

                if id not in sub_ids:
                    ids = sources[0].getElementsByTagName('planet:id')
                    if not ids: continue
                    id = ids[0].childNodes[0].nodeValue
                    if id not in sub_ids: continue

            # add entry to feed
            feed.appendChild(entry.documentElement)

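The `splice()` change above counts entries per source feed and skips any entry beyond `new_feed_items`. The capping logic can be sketched separately; here entries are plain `(feed_id, entry)` pairs rather than the DOM nodes the real code walks, and `cap_per_feed` is a hypothetical helper name.

```python
def cap_per_feed(entries, new_feed_items):
    """Keep at most new_feed_items entries per feed id.
    entries: iterable of (feed_id, entry) pairs, newest first."""
    count = {}
    kept = []
    for feed_id, entry in entries:
        count[feed_id] = count.get(feed_id, 0) + 1
        # mirror the diff: a falsy limit means "no cap"
        if new_feed_items and count[feed_id] > new_feed_items:
            continue
        kept.append(entry)
    return kept
```

Because the caller iterates files newest-first, the cap keeps each feed's most recent entries and drops the overflow.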
20
tests/data/expunge/config.ini
Normal file
@@ -0,0 +1,20 @@
[Planet]
name = test planet
cache_directory = tests/work/expunge/cache
cache_keep_entries = 1

[tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed1]
name = no source

[tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed2]
name = no source id

[tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed3]
name = global setting

[tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed4]
name = local setting
cache_keep_entries = 2

#[tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed5]
#name = unsubbed
8
tests/data/expunge/test1.entry
Normal file
@@ -0,0 +1,8 @@
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-test1/1</id>
<link href="http://example.com/1/1"/>
<title>Test 1/1</title>
<content>Entry with missing source</content>
<updated>2007-03-01T01:01:00Z</updated>
</entry>
11
tests/data/expunge/test2.entry
Normal file
@@ -0,0 +1,11 @@
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-test2/1</id>
<link href="http://example.com/2/1"/>
<title>Test 2/1</title>
<content>Entry with missing source id</content>
<updated>2007-03-01T02:01:00Z</updated>
<source>
<title>Test 2/1 source</title>
</source>
</entry>
12
tests/data/expunge/test3a.entry
Normal file
@@ -0,0 +1,12 @@
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-test3/1</id>
<link href="http://example.com/3/1"/>
<title>Test 3/1</title>
<content>Entry for global setting 1</content>
<updated>2007-03-01T03:01:00Z</updated>
<source>
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed3</id>
<title>Test 3 source</title>
</source>
</entry>
12
tests/data/expunge/test3b.entry
Normal file
@@ -0,0 +1,12 @@
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-test3/2</id>
<link href="http://example.com/3/2"/>
<title>Test 3/2</title>
<content>Entry for global setting 2</content>
<updated>2007-03-01T03:02:00Z</updated>
<source>
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed3</id>
<title>Test 3 source</title>
</source>
</entry>
12
tests/data/expunge/test3c.entry
Normal file
@@ -0,0 +1,12 @@
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-test3/3</id>
<link href="http://example.com/3/3"/>
<title>Test 3/3</title>
<content>Entry for global setting 3</content>
<updated>2007-03-01T03:03:00Z</updated>
<source>
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed3</id>
<title>Test 3 source</title>
</source>
</entry>
12
tests/data/expunge/test4a.entry
Normal file
@@ -0,0 +1,12 @@
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-test4/1</id>
<link href="http://example.com/4/1"/>
<title>Test 4/1</title>
<content>Entry for local setting 1</content>
<updated>2007-03-01T04:01:00Z</updated>
<source>
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed4</id>
<title>Test 4 source</title>
</source>
</entry>
12
tests/data/expunge/test4b.entry
Normal file
@@ -0,0 +1,12 @@
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-test4/2</id>
<link href="http://example.com/4/2"/>
<title>Test 4/2</title>
<content>Entry for local setting 2</content>
<updated>2007-03-01T04:02:00Z</updated>
<source>
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed4</id>
<title>Test 4 source</title>
</source>
</entry>
12
tests/data/expunge/test4c.entry
Normal file
@@ -0,0 +1,12 @@
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-test4/3</id>
<link href="http://example.com/4/3"/>
<title>Test 4/3</title>
<content>Entry for local setting 3</content>
<updated>2007-03-01T04:03:00Z</updated>
<source>
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed4</id>
<title>Test 4 source</title>
</source>
</entry>
12
tests/data/expunge/test5.entry
Normal file
@@ -0,0 +1,12 @@
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-test5/1</id>
<link href="http://example.com/5/1"/>
<title>Test 5/1</title>
<content>Entry from unsubbed feed</content>
<updated>2007-03-01T05:01:00Z</updated>
<source>
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed5</id>
<title>Test 5 source</title>
</source>
</entry>
5
tests/data/expunge/testfeed1.atom
Normal file
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<link rel="self" href="http://bzr.mfd-consult.dk/venus/tests/data/expunge/testfeed1.atom"/>
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed1</id>
</feed>
5
tests/data/expunge/testfeed2.atom
Normal file
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<link rel="self" href="http://bzr.mfd-consult.dk/venus/tests/data/expunge/testfeed2.atom"/>
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed2</id>
</feed>
5
tests/data/expunge/testfeed3.atom
Normal file
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<link rel="self" href="http://bzr.mfd-consult.dk/venus/tests/data/expunge/testfeed3.atom"/>
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed3</id>
</feed>
5
tests/data/expunge/testfeed4.atom
Normal file
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<link rel="self" href="http://bzr.mfd-consult.dk/venus/tests/data/expunge/testfeed4.atom"/>
<id>tag:bzr.mfd-consult.dk,2007:venus-expunge-testfeed4</id>
</feed>
1
tests/data/filter/django/config.html.dj
Normal file
@@ -0,0 +1 @@
{{ Config.name }}
2
tests/data/filter/django/test.ini
Normal file
@@ -0,0 +1,2 @@
[Planet]
name: Django on Venus
20
tests/data/filter/django/test.xml
Normal file
@@ -0,0 +1,20 @@
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

<title>Example Feed</title>
<link href="http://example.org/"/>
<updated>2003-12-13T18:30:02Z</updated>
<author>
<name>John Doe</name>
</author>
<id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>

<entry>
<title>Atom-Powered Robots Run Amok</title>
<link href="http://example.org/2003/12/13/atom03"/>
<id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
<updated>2003-12-13T18:30:02Z</updated>
<summary>Some text.</summary>
</entry>

</feed>
1
tests/data/filter/django/title.html.dj
Normal file
@@ -0,0 +1 @@
{% for item in Items %}{{ item.title }}{% endfor %}
2
tests/data/filter/regexp-sifter.ini
Normal file
@@ -0,0 +1,2 @@
[Planet]
filter=two
36
tests/data/reconstitute/stack_overflow.xml
Normal file
@@ -0,0 +1,36 @@
<!--
Description: content with extremely nested markup
Expect: content[0].type == 'text/html'
-->

<feed xmlns="http://www.w3.org/2005/Atom">
<entry>
<content type="html">
<![CDATA[
<span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span>
<span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span>
<span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span>
<span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span>
|
||||
<span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span>
|
||||
<span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span>
|
||||
<span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span>
|
||||
<span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span>
|
||||
<span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span>
|
||||
<span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span>
|
||||
<span><span><span><span><span><span><span><span>
|
||||
|
||||
Stack overflow
|
||||
|
||||
</span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span>
|
||||
</span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span>
|
||||
</span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span>
|
||||
</span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span>
|
||||
</span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span>
|
||||
</span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span>
|
||||
</span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span>
|
||||
</span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span>
|
||||
</span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span>
|
||||
</span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span>
|
||||
</span></span></span>]]></content>
|
||||
</entry>
|
||||
</feed>
|
20  tests/test_docs.py  Normal file
@@ -0,0 +1,20 @@
#!/usr/bin/env python

import unittest, os
from xml.dom import minidom
from glob import glob

class DocsTest(unittest.TestCase):

    def test_well_formed(self):
        for doc in glob('docs/*'):
            if os.path.isdir(doc): continue
            if doc.endswith('.css') or doc.endswith('.js'): continue

            try:
                minidom.parse(doc)
            except:
                self.fail('Not well formed: ' + doc);
                break
        else:
            self.assertTrue(True);
83  tests/test_expunge.py  Normal file
@@ -0,0 +1,83 @@
#!/usr/bin/env python
import unittest, os, glob, shutil, time
from planet.spider import filename
from planet import feedparser, config
from planet.expunge import expungeCache
from xml.dom import minidom
import planet

workdir = 'tests/work/expunge/cache'
sourcesdir = 'tests/work/expunge/cache/sources'
testentries = 'tests/data/expunge/test*.entry'
testfeeds = 'tests/data/expunge/test*.atom'
configfile = 'tests/data/expunge/config.ini'

class ExpungeTest(unittest.TestCase):

    def setUp(self):
        # silence errors
        planet.logger = None
        planet.getLogger('CRITICAL',None)

        try:
            os.makedirs(workdir)
            os.makedirs(sourcesdir)
        except:
            self.tearDown()
            os.makedirs(workdir)
            os.makedirs(sourcesdir)

    def tearDown(self):
        shutil.rmtree(workdir)
        os.removedirs(os.path.split(workdir)[0])

    def test_expunge(self):
        config.load(configfile)

        # create test entries in cache with correct timestamp
        for entry in glob.glob(testentries):
            e = minidom.parse(entry)
            e.normalize()
            eid = e.getElementsByTagName('id')
            eupdated = e.getElementsByTagName('updated')
            # guard before use: skip entries missing an id or updated element
            if not eid or not eupdated: continue
            efile = filename(workdir, eid[0].childNodes[0].nodeValue)
            emtime = time.mktime(feedparser._parse_date_w3dtf(
                eupdated[0].childNodes[0].nodeValue))
            shutil.copyfile(entry, efile)
            os.utime(efile, (emtime, emtime))

        # create test feeds in cache
        sources = config.cache_sources_directory()
        for feed in glob.glob(testfeeds):
            f = minidom.parse(feed)
            f.normalize()
            fid = f.getElementsByTagName('id')
            if not fid: continue
            ffile = filename(sources, fid[0].childNodes[0].nodeValue)
            shutil.copyfile(feed, ffile)

        # verify that exactly nine entries + one source dir were produced
        files = glob.glob(workdir+"/*")
        self.assertEqual(10, len(files))

        # verify that exactly four feeds were produced in source dir
        files = glob.glob(sources+"/*")
        self.assertEqual(4, len(files))

        # expunge...
        expungeCache()

        # verify that five entries and one source dir are left
        files = glob.glob(workdir+"/*")
        self.assertEqual(6, len(files))

        # verify that the right five entries are left
        self.assertTrue(os.path.join(workdir,
            'bzr.mfd-consult.dk,2007,venus-expunge-test1,1') in files)
        self.assertTrue(os.path.join(workdir,
            'bzr.mfd-consult.dk,2007,venus-expunge-test2,1') in files)
        self.assertTrue(os.path.join(workdir,
            'bzr.mfd-consult.dk,2007,venus-expunge-test3,3') in files)
        self.assertTrue(os.path.join(workdir,
            'bzr.mfd-consult.dk,2007,venus-expunge-test4,2') in files)
        self.assertTrue(os.path.join(workdir,
            'bzr.mfd-consult.dk,2007,venus-expunge-test4,3') in files)
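The test above drives Venus's real expungeCache() end to end. The core mechanic it verifies — ranking a source's cached entries by timestamp and unlinking the stale ones — can be sketched independently of Venus. The function below is a hypothetical simplification for illustration only, not the planet.expunge implementation (which works per subscription and reads its keep-count from the config):

```python
import os, glob

def expunge(cache_dir, keep=5):
    """Sketch: delete all but the `keep` most recently updated entry
    files in cache_dir, ranking by filesystem mtime (Venus stamps each
    cached entry's mtime from its atom:updated date).  Returns the
    paths that were kept, newest first."""
    entries = [f for f in glob.glob(os.path.join(cache_dir, '*'))
               if os.path.isfile(f)]
    entries.sort(key=os.path.getmtime, reverse=True)  # newest first
    for stale in entries[keep:]:
        os.remove(stale)                              # drop the oldest
    return entries[:keep]
```

The real expunge additionally skips the sources subdirectory and applies the limit per feed rather than globally, which is why the test expects five surviving entries spread across four feeds.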
43  tests/test_filter_django.py  Normal file
@@ -0,0 +1,43 @@
#!/usr/bin/env python

import os.path
import unittest, xml.dom.minidom, datetime

from planet import config, logger
from planet.shell import dj

class DjangoFilterTests(unittest.TestCase):

    def test_django_filter(self):
        config.load('tests/data/filter/django/test.ini')
        results = dj.tmpl.template_info("<feed/>")
        self.assertEqual(results['name'], 'Django on Venus')

    def test_django_date_type(self):
        config.load('tests/data/filter/django/test.ini')
        results = dj.tmpl.template_info("<feed/>")
        self.assertEqual(type(results['date']), datetime.datetime)

    def test_django_item_title(self):
        config.load('tests/data/filter/django/test.ini')
        feed = open('tests/data/filter/django/test.xml')
        input = feed.read(); feed.close()
        results = dj.run(
            os.path.realpath('tests/data/filter/django/title.html.dj'), input)
        self.assertEqual(results, "Atom-Powered Robots Run Amok\n")

    def test_django_config_context(self):
        config.load('tests/data/filter/django/test.ini')
        feed = open('tests/data/filter/django/test.xml')
        input = feed.read(); feed.close()
        results = dj.run(
            os.path.realpath('tests/data/filter/django/config.html.dj'), input)
        self.assertEqual(results, "Django on Venus\n")


try:
    from django.conf import settings
except ImportError:
    logger.warn("Django is not available => can't test django filters")
    for method in dir(DjangoFilterTests):
        if method.startswith('test_'): delattr(DjangoFilterTests,method)
@@ -89,14 +89,40 @@ class FilterTests(unittest.TestCase):

        self.assertNotEqual('', output)

    def test_regexp_filter(self):
        config.load('tests/data/filter/regexp-sifter.ini')

        testfile = 'tests/data/filter/category-one.xml'

        output = open(testfile).read()
        for filter in config.filters():
            output = shell.run(filter, output, mode="filter")

        self.assertEqual('', output)

        testfile = 'tests/data/filter/category-two.xml'

        output = open(testfile).read()
        for filter in config.filters():
            output = shell.run(filter, output, mode="filter")

        self.assertNotEqual('', output)

try:
    from subprocess import Popen, PIPE

    _no_sed = False
    try:
        sed = Popen(['sed','--version'],stdout=PIPE,stderr=PIPE)
        sed.communicate()
        if sed.returncode != 0:
            _no_sed = True
    except WindowsError:
        _no_sed = True

    if _no_sed:
        logger.warn("sed is not available => can't test stripAd_yahoo")
        del FilterTests.test_stripAd_yahoo

try:
    import libxml2
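test_regexp_filter expects a sifter that emits the entry unchanged when it passes and the empty string when it does not (an empty filter result tells Venus to drop the entry, which is what makes the retroactive filter test below possible). A minimal stand-in for such a sifter — a hypothetical simplification, not the regexp sifter actually shipped in this commit — might look like:

```python
import re

def sift(entry_xml, filter_re=None, exclude_re=None):
    """Apply a filter/exclude regexp pair to one entry.

    Returns the entry unchanged when it survives, or '' to signal it
    should be dropped.  (Simplification: the real sifter matches only
    the textual portion of the entry, not the raw XML.)"""
    if filter_re and not re.search(filter_re, entry_xml):
        return ''   # required pattern missing: drop
    if exclude_re and re.search(exclude_re, entry_xml):
        return ''   # forbidden pattern present: drop
    return entry_xml
```

Wired up as a pipe filter (printing sift(sys.stdin.read(), ...)), this reproduces the shape the test asserts: one test document filters to the empty string, the other passes through intact.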
@@ -73,6 +73,14 @@ class SpiderTest(unittest.TestCase):
        self.spiderFeed(testfeed % '1b')
        self.verify_spiderFeed()

    def test_spiderFeed_retroactive_filter(self):
        config.load(configfile)
        self.spiderFeed(testfeed % '1b')
        self.assertEqual(5, len(glob.glob(workdir+"/*")))
        config.parser.set('Planet', 'filter', 'two')
        self.spiderFeed(testfeed % '1b')
        self.assertEqual(1, len(glob.glob(workdir+"/*")))

    def test_spiderUpdate(self):
        config.load(configfile)
        self.spiderFeed(testfeed % '1a')
@@ -24,3 +24,11 @@ class SpliceTest(unittest.TestCase):
        self.assertEqual(8,len(doc.getElementsByTagName('entry')))
        self.assertEqual(3,len(doc.getElementsByTagName('planet:source')))
        self.assertEqual(11,len(doc.getElementsByTagName('planet:name')))

    def test_splice_new_feed_items(self):
        config.load(configfile)
        config.parser.set('Planet','new_feed_items','3')
        doc = splice()
        self.assertEqual(9,len(doc.getElementsByTagName('entry')))
        self.assertEqual(4,len(doc.getElementsByTagName('planet:source')))
        self.assertEqual(13,len(doc.getElementsByTagName('planet:name')))
@@ -30,11 +30,15 @@ a:active {
a:focus {
}

a.inactive {
    color: #558;
}

a.rising {
    font-weight: bold;
}

body > h1 {
    font-size: x-large;
    text-transform: uppercase;
    letter-spacing: 0.25em;

@@ -74,6 +78,11 @@
    border-bottom: 1px solid #ccc;
}

#sidebar h2 a img {
    margin-bottom: 2px;
    vertical-align: middle;
}

#sidebar p {
    font-size: x-small;
    padding-left: 20px;

@@ -163,6 +172,10 @@
    text-decoration: none !important;
}

#sidebar input[name=q] {
    margin: 4px 0 0 24px;
}

/* ---------------------------- Footer --------------------------- */

#footer ul {

@@ -177,6 +190,10 @@
    display: inline;
}

#footer ul li ul {
    display: none;
}

#footer img {
    display: none;
}

@@ -419,7 +436,7 @@ math[display=block] {
    overflow: auto;
}

.numberedEq span, .eqno {
    float: right;
}
@@ -40,7 +40,7 @@
    <xsl:text> </xsl:text>
  </div>

  <h1>Footnotes</h1>
  <xsl:text> </xsl:text>

  <div id="sidebar">

@@ -80,6 +80,7 @@

  <xsl:text> </xsl:text>
  <div id="footer">
    <h2>Subscriptions</h2>
    <ul>
      <xsl:for-each select="planet:source">
        <xsl:sort select="planet:name"/>
@@ -45,11 +45,14 @@ function navkey(event) {
  if (!checkbox || !checkbox.checked) return;

  if (!event) event=window.event;
  if (event.originalTarget &&
      event.originalTarget.nodeName.toLowerCase() == 'input' &&
      event.originalTarget.id != 'navkeys') return;

  if (!document.documentElement) return;
  if (!entries[0].anchor || !entries[0].anchor.offsetTop) return;

  key=event.keyCode;
  if (key == 'J'.charCodeAt(0)) nextArticle(event);
  if (key == 'K'.charCodeAt(0)) prevArticle(event);
}

@@ -215,14 +218,12 @@ function moveSidebar() {

  var h1 = sidebar.previousSibling;
  while (h1.nodeType != 1) h1=h1.previousSibling;
  if (h1.nodeName.toLowerCase() == 'h1') h1.parentNode.removeChild(h1);

  var footer = document.getElementById('footer');
  var ul = footer.lastChild;
  while (ul.nodeType != 1) ul=ul.previousSibling;

  var twisty = document.createElement('a');
  twisty.appendChild(document.createTextNode('\u25bc'));
  twisty.title = 'hide';

@@ -239,10 +240,19 @@ function moveSidebar() {
    ul.style.display = display;
    createCookie("subscriptions", display, 365);
  }

  var cookie = readCookie("subscriptions");
  if (cookie && cookie == 'none') twisty.onclick();

  for (var node=footer.lastChild; node; node=footer.lastChild) {
    if (twisty && node.nodeType == 1 && node.nodeName.toLowerCase() == 'h2') {
      node.appendChild(twisty);
      twisty = null;
    }
    footer.removeChild(node);
    sidebar.insertBefore(node, sidebar.firstChild);
  }

  var body = document.getElementById('body');
  sidebar.parentNode.removeChild(sidebar);
  body.parentNode.insertBefore(sidebar, body);
@@ -1,4 +1,5 @@
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
                xmlns:access="http://www.bloglines.com/about/specs/fac-1.0"
                xmlns:atom="http://www.w3.org/2005/Atom"
                xmlns:indexing="urn:atom-extension:indexing"
                xmlns:planet="http://planet.intertwingly.net/"

@@ -20,15 +21,29 @@
  <!-- Strip site meter -->
  <xsl:template match="xhtml:div[comment()[. = ' Site Meter ']]"/>

  <!-- add Google/LiveJournal-esque and Bloglines noindex directive -->
  <xsl:template match="atom:feed">
    <xsl:copy>
      <xsl:attribute name="indexing:index">no</xsl:attribute>
      <xsl:apply-templates select="@*"/>
      <access:restriction relationship="allow"/>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- popular customization: add planet name to each entry title
  <xsl:template match="atom:entry/atom:title">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:value-of select="../atom:source/planet:name"/>
      <xsl:text>: </xsl:text>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>
  -->

  <!-- indent atom elements -->
  <xsl:template match="atom:*">
    <!-- double space before atom:entries -->
39  themes/django/bland.css  Normal file
@@ -0,0 +1,39 @@
body {
    margin: 50px 60px;
    font-family: Georgia, Times New Roman, serif;
}

h1 {
    font: normal 4em Georgia, serif;
    color: #900;
    margin-bottom: 0px;
}

.updated, .entry-tools {
    font: .8em Verdana, Arial, sans-serif;
    margin-bottom: 2em;
}

#channels {
    float: right;
    width: 30%;
    padding: 20px;
    margin: 20px;
    margin-top: 0px;
    border: 1px solid #FC6;
    background: #FFC;
}

#channels h2 {
    margin-top: 0px;
}

#channels ul {
    margin-bottom: 0px;
}

.entry {
    border-top: 1px solid #CCC;
    padding-top: 1em;
}
11  themes/django/config.ini  Normal file
@@ -0,0 +1,11 @@
# This theme is an example Planet Venus theme using the
# Django template engine.

[Planet]
template_files:
    index.html.dj

template_directories:

bill_of_materials:
    bland.css
49  themes/django/index.html.dj  Normal file
@@ -0,0 +1,49 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>{{ name }}</title>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" />
  <link rel="stylesheet" href="bland.css" type="text/css" />
</head>

<body>

<h1>{{ name }}</h1>

<p class="updated">
  last updated by <a href="http://intertwingly.net/code/venus/">Venus</a>
  on {{ date }} on behalf of {{ author_name }}
</p>

<div id="channels">
  <h2>Feeds</h2>

  <ul>
    {% for channel in Channels %}
    <li>{{ channel.title }} by {{ channel.author_name }}</li>
    {% endfor %}
  </ul>
</div>

{% for item in Items %}
  {% ifchanged item.channel_name %}
    <h3>{{ item.channel_name }}</h3>
  {% endifchanged %}

  <div class="entry">
    {% if item.title %}<h4>{{ item.title }}</h4>{% endif %}

    {{ item.content }}

    <p class="entry-tools">
      by {{ item.channel_author }} on
      {{ item.date }} ·
      <a href="{{ item.link }}">permalink</a>
    </p>
  </div>
{% endfor %}

</body>
</html>
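The template above uses {% ifchanged item.channel_name %} to emit a channel heading only when consecutive items switch feeds, relying on Venus handing items to the template in order. The same run-grouping can be expressed in plain Python with itertools.groupby (the item data below is made up for illustration):

```python
from itertools import groupby

# hypothetical items, in the order Venus would pass them to the template
items = [
    {'channel_name': 'Channel A', 'title': 'First post'},
    {'channel_name': 'Channel A', 'title': 'Second post'},
    {'channel_name': 'Channel B', 'title': 'Third post'},
]

html = []
for channel, group in groupby(items, key=lambda item: item['channel_name']):
    html.append('<h3>%s</h3>' % channel)   # heading once per run, like ifchanged
    for item in group:
        html.append('<h4>%s</h4>' % item['title'])

page = '\n'.join(html)
```

Like {% ifchanged %}, groupby only coalesces adjacent items: if the same channel reappears later in the list, it gets a fresh heading.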