Subversion properties in Daizu CMS

Daizu CMS uses metadata provided in properties in the Subversion repository. The Subversion book has more information about Subversion properties.

Property names

There is no standard for how Subversion properties should be named. The ‘special’ properties understood by Subversion all have names beginning with the prefix svn:, so Daizu uses the same syntax to qualify property names in order to avoid namespace conflicts with other applications. Properties which have meanings particular to Daizu use the daizu: prefix.

Some properties have meanings which are already covered by the Dublin Core metadata standards. I've used different prefixes for these property names so that it's clearer that they are meant to have the interpretations of the corresponding Dublin Core elements. It should make sense for any other applications interested in the same metadata to make use of these. The dc: prefix is used for elements defined in the Dublin Core Metadata Element Set, and dcterms: is used for elements defined in DCMI Metadata Terms.

There is nothing in Daizu which assigns any significance to the form of property names, or the prefixes used in them (if any), except that ‘property loader’ plugins can be assigned to be used only for properties of a given prefix. See the developer's documentation for the add_property_loader method in the Daizu class for details.

List of properties

The following list describes the property names which are understood by Daizu or the plugins which are included with it. There is, however, nothing to stop you using other properties for manual record keeping, or creating plugins which use other properties.

daizu:alt

Value to use for the alt attribute in an HTML img element. Currently only used by the plugin Daizu::Plugin::PictureArticle. It might make sense to provide an additional plugin which would filter HTML to add alt attributes to images if the image files themselves have this property, but I'm not sure.

The value should contain text encoded as UTF-8.

daizu:author

Contains the usernames of the people who should be credited with authoring an article. If there is more than one author then each name should be on a separate line. Usernames must not contain whitespace characters.

The usernames are looked up in the person and person_info tables in the database to find the real name of the person, and possibly other information like their email address. If there are no authors, either because there are no daizu:author properties on a file or any of its ancestor directories, or because the relevant property is empty, then the content is considered to be authored anonymously.

Set this property on a directory to indicate the authors of all the content inside the directory. For sites where all the content is created by one person, it is enough to set this property once, on a top-level directory. Subdirectories or individual files can have a different set of authors specified.

The usernames should be encoded as UTF-8.

You will get an error if a username can't be found in the database.
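The lookup rules above can be sketched in a few lines. This is only an illustration in Python, not Daizu's own Perl code: the function names parse_authors and effective_authors are hypothetical, a plain dictionary stands in for the repository, and skipping blank lines is an assumption of the sketch. The database lookup of each username is left out.

```python
import posixpath


def parse_authors(value):
    """Split a daizu:author value into usernames, one per line."""
    usernames = []
    for line in value.splitlines():
        name = line.strip()
        if not name:
            continue  # assumption: blank lines are simply skipped
        if len(name.split()) != 1:
            raise ValueError(f"username contains whitespace: {name!r}")
        usernames.append(name)
    return usernames


def effective_authors(path, props_by_path):
    """Return the authors from the nearest daizu:author property.

    props_by_path is a hypothetical stand-in for the repository: it maps
    a file or directory path to a dict of that path's properties.
    An empty list means the content is anonymous.
    """
    current = path
    while True:
        props = props_by_path.get(current, {})
        if "daizu:author" in props:
            # The nearest property wins, even if it is empty.
            return parse_authors(props["daizu:author"])
        if current in ("", "/"):
            return []  # reached the top without finding the property
        current = posixpath.dirname(current)
```

With the property set to alice on a top-level directory and to bob and carol on one file inside it, that file is credited to bob and carol and every other file to alice.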

daizu:flags

If this is present, it should contain a list of flags, separated by whitespace. Currently the only recognized flags are:

retired
Indicates that an article should no longer be linked to from navigation menus, and should display a message indicating that it is out of date. The message will be taken from the daizu:reason-retired property if it exists, otherwise a default vague explanation will be provided.
no-index
Prevents a file from appearing in Google sitemaps. In the future this may also be used to prevent files from being indexed by a built-in search engine.

daizu:generator

This should contain the name of the Perl class which should act as the generator for the content. If used on a file then it only applies to the file. On a directory it applies to the directory itself and everything in it, unless overridden.

If no generator is specified then Daizu::Gen is used as the default.

daizu:guid

Should contain a URI to use as a globally unique identifier for a file. Daizu generates default GUIDs which should be adequate, so the only reason to set this property is when you've imported content from some other publishing system which used different identifiers. If so, it may be desirable to avoid the identifiers changing to the Daizu ones.

It is an error if the value isn't a valid URI.

daizu:nav-menu

Set this property on either an article file or a directory containing an article with a name like _index.html, to override the default navigation menu generated by Daizu::Gen. You might want to do this to include something that would not normally be included, like a blog homepage (because by default only article pages are included), to leave out articles which aren't important enough to go in the menu, or to control the ordering of the pages.

The value describes the whole of the menu for a particular level of the site structure. Each line of text is a menu item, starting with the URL to link to (either an absolute URL or one relative to the permalink of the file you put the property on). There should then be some whitespace, followed by the title to use in the menu. Here's an example:

blog/      Blog
doc/       Documentation
download/  Download
license/   License

The value should contain text encoded as UTF-8.
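To make the line format concrete, here is a rough Python sketch of how such a value could be read. It is not Daizu's implementation: the function name parse_nav_menu is hypothetical, and both skipping blank lines and rejecting lines without a title are assumptions of the sketch.

```python
from urllib.parse import urljoin


def parse_nav_menu(value, permalink):
    """Turn a daizu:nav-menu value into a list of (url, title) pairs.

    Relative URLs are resolved against the permalink of the file or
    directory which carries the property.
    """
    items = []
    for line in value.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are skipped
        parts = line.split(None, 1)  # URL, whitespace, rest is the title
        if len(parts) != 2:
            # Assumption: a line without a title is treated as an error.
            raise ValueError(f"menu item has no title: {line!r}")
        url, title = parts
        items.append((urljoin(permalink, url), title))
    return items
```

Because only the first run of whitespace separates the two fields, titles may themselves contain spaces.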

daizu:reason-retired

If an article is marked as being retired then an explanation will be included on the page. There is a default explanation, but if you set this property then its contents will be used instead. If this property is set but has an empty value (no characters, not even whitespace) then the article won't carry any indication that it is retired.

The value should contain text encoded as UTF-8.
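The three cases (property unset, set but empty, set to some text) are easy to confuse, so here is a small Python sketch of the decision. The function name retirement_message is hypothetical, and DEFAULT_REASON is a placeholder, since the wording of Daizu's real default explanation is not given here.

```python
# Placeholder only: not the actual default text used by Daizu.
DEFAULT_REASON = "This article is out of date."


def retirement_message(props):
    """Return the message to show on an article, or None for no message.

    props is a dict of the article's property names and values.
    """
    flags = props.get("daizu:flags", "").split()  # whitespace-separated
    if "retired" not in flags:
        return None  # not retired, so nothing to explain
    reason = props.get("daizu:reason-retired")
    if reason is None:
        return DEFAULT_REASON  # property absent: default explanation
    if reason == "":
        return None  # present but completely empty: no indication at all
    return reason  # anything else, even whitespace, is used as given
```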

daizu:short-title

This can specify an alternative title to use, which is intended to be an abbreviated version of the one in dc:title. It is used in the navigation menu created by the default templates. In all cases this is optional, and the normal title value will be substituted if it isn't provided.

The plugin Daizu::Plugin::PodArticle sets this value automatically when it detects a POD file with a name consisting of a module name followed (as is traditional) by a dash and a description of the module. The whole of that is used for the title, and just the module name for the short title.

The value should contain text encoded as UTF-8.
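As an illustration of the naming convention just described, the following Python sketch splits a name of the form "module name, dash, description" into a title and a short title. The exact matching rule used by Daizu::Plugin::PodArticle is not documented here, so the pattern and the function name titles_from_pod_name are assumptions.

```python
import re

# Assumed pattern: a module name made of word characters and colons,
# then whitespace, one or more dashes, whitespace, and a description.
NAME_PATTERN = re.compile(r"^\s*([\w:]+)\s+-+\s+(\S.*?)\s*$")


def titles_from_pod_name(name):
    """Return (title, short_title) for a POD name.

    short_title is None when the name does not follow the convention,
    in which case the normal title would be used on its own.
    """
    title = name.strip()
    match = NAME_PATTERN.match(name)
    if match is None:
        return title, None
    return title, match.group(1)
```

For a name such as "Daizu::Gen - default generator", the whole string becomes the title and "Daizu::Gen" the short title.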

daizu:tags

A list of folksonomy tags associated with a file. For articles these are published by the default templates at the bottom of the article.

Each tag should be on a separate line. The tags may contain spaces.

The value should contain text encoded as UTF-8.

daizu:type

If present, this should contain a name indicating what sort of file this is. Currently, the only value recognized is article, which indicates that a file's contents should be published as a web page using the Template Toolkit (TT) templates. All articles must be marked with this property, or they will be published as unprocessed files, without any template processing. This setting is ignored on directories.

daizu:url

The URL at which a file should be published, or on a directory the URL at which its children should be based. Normally it is sufficient to set a single daizu:url property on the directory containing a website, and let the Daizu code figure out what URLs to use for the files and subdirectories inside it.

You might also want to set this on an individual file for an article if it has been imported from somewhere else. For example, when importing articles from another CMS, set a daizu:url value on each article containing its old URL (if Daizu would otherwise give it a different one). You might then remove the property and republish the articles to get Daizu's default URLs, so that it generates redirects to keep the old ones working.

The value must be an absolute URL. Only URLs using the http scheme have been tested so far, but in principle other hierarchical schemes like ftp should work too.

dc:description

A short description of the article, typically a sentence or paragraph of text. This will be displayed under the title on the pages published for articles (at least by the default templates), and can also appear in blog feeds if they're not configured to use the actual content of the article. If you don't have a meaningful value to put here then it's probably best to leave it unset. Normally it's not a good idea to set it to the same value as the title.

The value should contain text encoded as UTF-8.

Corresponds to the ‘description’ element in Dublin Core.

dc:title

The title of the content in the file. For an article this will be used by the templates for the title at the top of the article's web page (or pages).

As well as the titles of articles, this is also used on the top directory of a website or blog to set the title of the whole thing.

The value should contain text encoded as UTF-8.

Corresponds to the ‘title’ element in Dublin Core.

dcterms:issued

The date and time at which a file was issued. This might mean the time it was first published on a website, or the time it was created. If this isn't specified, Daizu will use the time at which the file was first committed. This is likely to be appropriate in most cases.

You might want to set a value for this if you're importing old content into Daizu, or if you want to ‘fake’ the time at which an article was published.

The value should be a date and time in the W3C date and time format (a profile of ISO 8601), for example 2006-11-05T14:30:00Z. If an invalid value is set, then a warning will be issued when it is examined, and it will be ignored.

Corresponds to the ‘issued’ term in Dublin Core.
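The fallback behaviour can be sketched as follows; the same logic would apply to dcterms:modified with the time of the most recent commit. This Python sketch is not Daizu's code. The function names are hypothetical, the parser accepts only complete dates with an optional time and time zone (not every form the W3C format allows), and treating a date without a time as midnight UTC is an assumption.

```python
import re
import warnings
from datetime import datetime, timedelta, timezone

# Subset of the W3C format: YYYY-MM-DD, optionally followed by
# Thh:mm or Thh:mm:ss and a zone designator (Z or +hh:mm or -hh:mm).
W3C_PATTERN = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})"
    r"(?:T(\d{2}):(\d{2})(?::(\d{2}))?(Z|[+-]\d{2}:\d{2}))?$"
)


def parse_w3c_datetime(text):
    """Return a timezone-aware datetime, or None if text is invalid."""
    match = W3C_PATTERN.match(text.strip())
    if match is None:
        return None
    year, month, day, hour, minute, second, zone = match.groups()
    try:
        if zone in (None, "Z"):
            tz = timezone.utc  # assumption for date-only values
        else:
            sign = 1 if zone[0] == "+" else -1
            offset = timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
            tz = timezone(sign * offset)
        return datetime(int(year), int(month), int(day),
                        int(hour or 0), int(minute or 0), int(second or 0),
                        tzinfo=tz)
    except ValueError:
        return None  # well-formed but impossible, e.g. 30 February


def issued_time(props, first_commit_time):
    """Pick the issued time: the property if valid, else the commit time."""
    value = props.get("dcterms:issued")
    if value is None:
        return first_commit_time
    parsed = parse_w3c_datetime(value)
    if parsed is None:
        warnings.warn(f"ignoring invalid dcterms:issued value: {value!r}")
        return first_commit_time
    return parsed
```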

dcterms:modified

The date and time at which the file was last meaningfully modified. If this isn't specified, Daizu will use the time at which the most recent change was committed to the file, which is usually appropriate.

You might want to set this if you're importing old content from somewhere else (in which case the time of the commit is not really the time the content was last modified, but rather the time it was imported). You can also use this if you want to override Daizu's idea of what constitutes a ‘meaningful’ revision.

The value should be a date and time in the W3C date and time format (a profile of ISO 8601), for example 2006-11-05T14:30:00Z. If an invalid value is set, then a warning will be issued when it is examined, and it will be ignored.

Corresponds to the ‘modified’ term in Dublin Core.

svn:mime-type

A MIME media type describing the type of the content in a file. Should not appear on directories. This is already used by Subversion to determine whether a file is textual and hence whether diffs should be provided for it. Daizu uses it to record the type of URLs generated from some files (which will be useful if you want to serve content dynamically), and to determine which article loader plugins should be used to load an article's content.

If an article file doesn't have a svn:mime-type property, Daizu uses text/html as the default. For unprocessed files, the default is application/octet-stream, at least if they are being published in the default way, through the unprocessed_urls_info method of the Daizu::Gen class.

The value should be the name of a MIME type, as defined in RFC 2046. Currently Daizu will have problems with MIME types which include parameters, such as a charset parameter giving the character encoding (for example text/html; charset=utf-8).
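The defaulting rules can be summarised in a short Python sketch. It is not taken from Daizu: the function names effective_mime_type and has_parameters are hypothetical, trimming surrounding whitespace is an assumption, and the octet-stream default only reflects the default way of publishing unprocessed files described above.

```python
def effective_mime_type(props):
    """Return the MIME type to assume for a file.

    props is a dict of the file's property names and values.
    """
    value = props.get("svn:mime-type")
    if value is not None:
        return value.strip()  # assumption: surrounding whitespace is trimmed
    if props.get("daizu:type") == "article":
        return "text/html"  # default for articles
    return "application/octet-stream"  # default for unprocessed files


def has_parameters(mime_type):
    """Report whether a value carries parameters such as charset."""
    return ";" in mime_type
```

A check like has_parameters could be run over a working copy before publishing, to find values such as "text/html; charset=utf-8" which are likely to cause trouble.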