Planning: enhancements

These are things I definitely want to do but haven't got around to yet, starting with the most important ones.

Templating

I need to think really hard about whether to continue with Template Toolkit or cook up something else.

Either way, I need to provide the templates with the ability to call the ‘next template in the chain’ as it were, like SUPER or NEXT in Perl. Currently the Daizu::TTProvider has a kludge which allows a template to include another one with the rewriting of names turned off. This provides access to just one extra link in the chain, and only for templates which have their names rewritten by generators, not ones overridden in the content repository.
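To make the 'next template in the chain' idea concrete, here is a minimal sketch in Python rather than Daizu's Perl. It is not how Daizu::TTProvider works; the layer list, the render_chain function, and the page.tt name are all hypothetical. Each layer may define a template, and a template can defer to the next definition further down, however many links the chain has.

```python
from typing import Callable, Dict, List

# Hypothetical model: each layer maps template names to a render function.
# A render function is handed a callable that renders the next definition down.
Renderer = Callable[[Callable[[], str]], str]


def render_chain(layers: List[Dict[str, Renderer]], name: str, start: int = 0) -> str:
    """Render `name` from the first layer at or after `start` that defines it."""
    for index in range(start, len(layers)):
        renderer = layers[index].get(name)
        if renderer is not None:
            # Bind the current index so the template resumes just below itself.
            def next_template(index: int = index) -> str:
                return render_chain(layers, name, index + 1)
            return renderer(next_template)
    raise LookupError(f"no further template named {name!r} in the chain")


# Highest priority first: content repository, then a generator, then defaults.
layers = [
    {"page.tt": lambda nxt: "<div class='site'>" + nxt() + "</div>"},
    {"page.tt": lambda nxt: "<article>" + nxt() + "</article>"},
    {"page.tt": lambda nxt: "default body"},
]

result = render_chain(layers, "page.tt")
```

The point of the design is that a template never names the one it is overriding, so the same mechanism would serve both generator rewrites and overrides in the content repository.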

Uploading content

The daizu publish command should be able to upload any new or changed content to your web server, using rsync or something. Ideally it should be possible to configure Daizu to do whatever you want after the publication work is done, by writing a bit of Perl code or pointing it at some plugin module. This would allow you to have code to send pings when new blog articles appear, for example.
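A rough sketch of what such a post-publication hook might look like, written in Python purely for illustration. The publish function, the hook signature, and the URL test for blog articles are assumptions, not anything Daizu currently provides.

```python
from typing import Callable, List

# Hypothetical hook type: called with the URLs that were new or changed.
PostPublishHook = Callable[[List[str]], None]


def publish(changed_urls: List[str], hooks: List[PostPublishHook]) -> None:
    """Stand-in for the publication work, followed by the configured hooks."""
    # ... output files would be generated (and perhaps uploaded) here ...
    for hook in hooks:
        hook(changed_urls)


def make_ping_hook(sent: List[str]) -> PostPublishHook:
    """Build a hook that records which blog URLs would be pinged."""
    def ping_new_articles(changed_urls: List[str]) -> None:
        for url in changed_urls:
            # Assumption: blog articles live under a /blog/ path.
            if "/blog/" in url:
                sent.append(url)
    return ping_new_articles


sent: List[str] = []
publish(
    ["http://example.com/blog/2006/article/", "http://example.com/about/"],
    [make_ping_hook(sent)],
)
```

A real hook would send the ping over the network; recording the URLs in a list just keeps the sketch self-contained.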

Publishing

Currently the ‘last modified’ timestamp for output files is tracked by simply avoiding overwriting files in the docroot which haven't actually changed. This means that the mtime of the files can be used by a web server for cache control. Since it might be desirable to publish content dynamically at some point, maybe these modification times should be recorded in a new column in the url table. Note that these wouldn't correspond to any existing timestamps tracked in the database, because for example a page might be changed simply because you've altered the templates. What I'm talking about is the time the output was last modified, not when the content was last changed. For this feature to work there would probably have to be a way of making this timestamp ‘null’ when you know the page can't be reliably cached.
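The current behaviour amounts to something like the following sketch, shown in Python for illustration only. The function name write_if_changed is hypothetical and this is not Daizu's actual code.

```python
def write_if_changed(path: str, data: bytes) -> bool:
    """Write `data` to `path` unless the file already holds exactly that content.

    Returns True if the file was written. Leaving an unchanged file alone
    preserves its mtime, which a web server can then use for cache control.
    """
    try:
        with open(path, "rb") as existing:
            if existing.read() == data:
                return False
    except FileNotFoundError:
        pass  # nothing published at this path yet, so fall through and write
    with open(path, "wb") as out:
        out.write(data)
    return True
```

A timestamp column in the url table would record the same information, the time the output last changed, without depending on the filesystem.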

The content type stored in the url table should probably be required. I'm pretty sure it never gets a null value, but that should be enforced. Also, it should be possible to set the default MIME type to use. Currently that's always ‘application/octet-stream’, but the output configuration element should have a setting to override this, just as Apache does.
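The fallback logic could be as simple as this sketch (Python, for illustration; content_type_for and its default_type parameter are hypothetical stand-ins for the proposed configuration setting):

```python
import mimetypes

FALLBACK_TYPE = "application/octet-stream"


def content_type_for(filename: str, default_type: str = FALLBACK_TYPE) -> str:
    """Guess a MIME type from the file name, never returning None."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Always fall back to a real value so the url table never needs a null.
    return guessed or default_type
```

With an override in place, a site consisting mostly of extensionless text files could set its default to 'text/plain', for example.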

Heading levels

Currently I've standardized on having the headings for subsections in articles be h3 elements. The name of the website goes in an h1, and an h2 is used for the title of the article itself. It's quite possible though that you might want to change this, perhaps just using an h1 for the article title. If you made such changes to the templates you would then have to go round adjusting all your articles to use the right heading levels.

I think the way to solve this is to declare that the headings used for subsections when you write the content follow some standard pattern, but when the article is published the headings will get adjusted to match the templates. There would be some sort of configuration item to override the default.

Another reason to take this renumbering approach is that it would allow all article loader plugins to target a standard format.

This could all be implemented as an article filter plugin.
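Here is a minimal sketch of the renumbering step, in Python for illustration. It assumes, hypothetically, that authors write their top-level subsection headings as h1 and that the templates want them published as h3; both numbers would come from configuration. It also assumes the elements carry no namespace prefix.

```python
import xml.etree.ElementTree as ET


def shift_headings(root: ET.Element, source_top: int = 1, target_top: int = 3) -> None:
    """Renumber h1..h6 elements in place so `source_top` becomes `target_top`."""
    offset = target_top - source_top
    for element in root.iter():
        tag = element.tag
        # Match exactly h1..h6 (so 'hr' and 'html' are left alone).
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            # Clamp to the range HTML actually provides.
            level = min(6, max(1, int(tag[1]) + offset))
            element.tag = f"h{level}"


article = ET.fromstring("<div><h1>Intro</h1><p>text</p><h2>Detail</h2><hr/></div>")
shift_headings(article)
published = ET.tostring(article, encoding="unicode")
```

Clamping at h6 means very deeply nested headings would collapse onto the same level, which is probably acceptable but worth a warning in a real filter.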

Output filters

It would be nice to be able to write plugins which filter the output after it has been generated, from whatever generator methods, and optionally make changes. You could use this to validate HTML output, check links, or collect statistics.

I think it would be best if this was somehow tied to the output element in configuration files so that you could, for example, turn on the HTML validator plugin for a particular website only.
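As a sketch of how filters might be tied to a particular website (Python, for illustration; the dictionary keyed by site URL stands in for the output element in the configuration file, and all function names are hypothetical):

```python
from typing import Callable, Dict, List

# Hypothetical filter type: takes the page URL and its content, returns content.
OutputFilter = Callable[[str, str], str]


def apply_output_filters(
    filters_by_site: Dict[str, List[OutputFilter]],
    site_url: str,
    page_url: str,
    content: str,
) -> str:
    """Run the filters configured for one site over a generated page."""
    for output_filter in filters_by_site.get(site_url, []):
        content = output_filter(page_url, content)
    return content


def make_link_counter(stats: Dict[str, int]) -> OutputFilter:
    """Build a statistics filter that counts anchors and changes nothing."""
    def count_links(page_url: str, content: str) -> str:
        stats[page_url] = content.count("<a ")
        return content
    return count_links


stats: Dict[str, int] = {}
config = {"http://example.com/": [make_link_counter(stats)]}
page = '<p><a href="/x">x</a> and <a href="/y">y</a></p>'
filtered = apply_output_filters(
    config, "http://example.com/", "http://example.com/a.html", page
)
```

Because every filter returns the content, the same interface covers both read-only checks (validation, link checking, statistics) and filters which optionally make changes.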

Multi-page articles

I don't think I'll need these, but I thought it was important to include the feature to make sure the design was right. I think Daizu basically handles these in the right way, but there needs to be code in the templates to provide links to the next/previous page, and a list of page numbers.
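The information the templates need is small. A sketch in Python, for illustration only, with a hypothetical function name and return structure:

```python
from typing import Dict, List, Optional, Union


def page_navigation(current: int, total: int) -> Dict[str, Union[Optional[int], List[int]]]:
    """Work out previous/next page numbers and the full list of pages."""
    if not 1 <= current <= total:
        raise ValueError("current page must be between 1 and total")
    return {
        "previous": current - 1 if current > 1 else None,
        "next": current + 1 if current < total else None,
        "pages": list(range(1, total + 1)),
    }
```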

XHTML extension namespaces

Provide a way to have extra XML namespace prefixes declared automatically when parsing fragments of XHTML content, like the daizu one is. This would be handy for plugins which want to do clever things with MathML or whatever.
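One way to do this is to wrap the fragment in an element which declares the prefixes before handing it to the parser. The sketch below is in Python for illustration and is not Daizu's parsing code; parse_fragment and the extra_namespaces mapping are hypothetical, and the MathML prefix is just an example of what a plugin might register.

```python
import xml.etree.ElementTree as ET
from typing import Dict
from xml.sax.saxutils import quoteattr

XHTML_NS = "http://www.w3.org/1999/xhtml"


def parse_fragment(fragment: str, extra_namespaces: Dict[str, str]) -> ET.Element:
    """Parse an XHTML fragment with extra namespace prefixes pre-declared."""
    declarations = ["xmlns=" + quoteattr(XHTML_NS)]
    for prefix, uri in extra_namespaces.items():
        declarations.append(f"xmlns:{prefix}=" + quoteattr(uri))
    # The wrapper element carries the declarations so the fragment need not.
    wrapped = "<body " + " ".join(declarations) + ">" + fragment + "</body>"
    return ET.fromstring(wrapped)


body = parse_fragment(
    "<p>Some maths: <m:math/></p>",
    {"m": "http://www.w3.org/1998/Math/MathML"},
)
```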

HTML 4 or XHTML

There should probably be some switch you can throw to make the output XHTML. If I keep using Template Toolkit for the templating then that probably means I should switch the templates to producing XHTML, and then if you want HTML 4 output it can be more easily reserialized.

Language metadata

Blog feeds and HTML files for articles should declare their language.

Of course the real trick is figuring out how to support articles which are available in more than one language. I'm not quite sure what that will look like in the content repository.

License metadata

Allow metadata to specify a license, particularly Creative Commons ones, for articles and other content, and publish that information both as descriptive text on the web pages and as metadata in feeds.

There is a proposed standard for including license information in Atom feeds. RSS has an extension for Creative Commons licenses in particular. The Creative Commons people have some notes on including licensing information in syndication feeds.

Podcasting

Add support for using enclosures in feeds. You might get an enclosure if the article has a MIME type like ‘audio/mpeg’ for example.
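A sketch of the decision, in Python for illustration. The rule that only audio and video types produce an enclosure is my assumption, as is the function name; the element built here follows the Atom convention of a link with rel="enclosure".

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumption: only these kinds of content are worth offering as enclosures.
ENCLOSURE_TYPE_PREFIXES = ("audio/", "video/")


def enclosure_link(url: str, mime_type: str, length: int) -> Optional[ET.Element]:
    """Return an Atom enclosure link element, or None for other content."""
    if not mime_type.startswith(ENCLOSURE_TYPE_PREFIXES):
        return None
    return ET.Element(
        "link",
        {"rel": "enclosure", "type": mime_type, "length": str(length), "href": url},
    )
```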

Folksonomy tags and tag clouds

Perhaps there should be a plugin or something which generates pages for tags you've used on articles, so that the articles can link to local tag pages instead of the ones Technorati provide (where most of the relevant articles seem to be in Russian or Japanese). The tag pages would act as a navigation aid, providing a listing of all articles which have the tag.

If you're going to have tag pages then a ‘tag cloud’ plugin would be the next logical step, which would generate a page listing all the tags, perhaps with the most common ones highlighted. I think I've seen that on WordPress blogs, and the Guardian one is rather pretty.
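Both kinds of page come from the same data. A minimal sketch in Python, for illustration only; the function names, the input format, and the three-level weighting are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def build_tag_pages(articles: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each tag to the sorted titles of the articles which use it."""
    pages = defaultdict(list)
    for title, tags in articles.items():
        for tag in set(tags):  # ignore a tag repeated on one article
            pages[tag].append(title)
    return {tag: sorted(titles) for tag, titles in pages.items()}


def tag_cloud(tag_pages: Dict[str, List[str]], levels: int = 3) -> Dict[str, int]:
    """Give each tag a weight from 1 (rare) to `levels` (most common)."""
    if not tag_pages:
        return {}
    largest = max(len(titles) for titles in tag_pages.values())
    cloud = {}
    for tag, titles in tag_pages.items():
        # Scale the article count linearly onto 1..levels.
        cloud[tag] = 1 + (len(titles) - 1) * (levels - 1) // max(1, largest - 1)
    return cloud


articles = {"A": ["perl", "cms"], "B": ["perl"], "C": ["perl", "svn"]}
pages = build_tag_pages(articles)
cloud = tag_cloud(pages)
```

The weight would map onto a CSS class or font size in the template, which is how the most common tags get highlighted.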

Thumbnailing

The scaled_image method in Daizu::Gen should try to use Image::Epeg before falling back to Image::Magick because it's much faster, and it isn't Image::Magick.

DOM filters

Plugins which filter the content of articles are currently run in an arbitrary order. There should be some way to control that order.

The daizu program

Provide a -v option to turn on ‘verbose’ mode. This would cause a flag to be passed to various internal functions, particularly the publishing ones, to get them to indicate what they're doing. A -n flag to make Daizu not actually change anything (generate output but don't save it into a file, don't actually update the database, etc.) would also be useful.
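A sketch of how the two flags might be threaded through to a function which saves output, in Python for illustration. The daizu program is Perl and would parse its options differently; save_output and its parameters are hypothetical.

```python
import argparse
from typing import Callable


def build_parser() -> argparse.ArgumentParser:
    """Parser for the two proposed flags only."""
    parser = argparse.ArgumentParser(prog="daizu")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="say what is being done")
    parser.add_argument("-n", dest="dry_run", action="store_true",
                        help="generate output but change nothing")
    return parser


def save_output(path: str, data: bytes, verbose: bool, dry_run: bool,
                log: Callable[[str], None] = print) -> bool:
    """Save generated output, honouring the verbose and dry-run flags.

    Returns True if the file was actually written.
    """
    if verbose:
        log(("would write " if dry_run else "writing ") + path)
    if dry_run:
        return False
    with open(path, "wb") as out:
        out.write(data)
    return True


args = build_parser().parse_args(["-v", "-n"])
```

Passing the flags explicitly, rather than consulting a global, makes it easier to be sure that every function which touches files or the database actually respects -n.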