Minting ‘tag’ URIs

I've talked before about assigning a GUID URI to each file in the CMS. Internally the uniqueness comes from an ID number assigned in the database, but for the globally unique URI I need to add information supplied by the user, such as a domain name.

(By the way, Tim Bray has just pointed out how these URLs can actually solve real problems with Atom feeds.)

For the standard GUID URIs (the user can override them by setting a daizu:guid property) I'm using the tag URI scheme, as defined in RFC 4151.

The ‘specific’ part is just my ID number, since there's nothing more human-readable that is guaranteed to stay unique. The only alternative I can see is the path combined with the revision number at which the file existed there (so that renames won't change the URI), but then you could end up publishing stale path information, with no easy way to get rid of it.

For the ‘taggingEntity’ I need the user to indicate what is appropriate, since it will vary for each installation. You set it in the XML configuration file with guid-entity elements. At least one is required, to provide the default tagging entity: a domain name or email address, followed by a date on which it belonged to you. I'm allowing additional entities for particular paths, in case you want to use different entities for different websites. For example:

<guid-entity                  entity="daizucms.org,2006-04-25" />
<guid-entity path="ungwe.org" entity="ungwe.org,2006"          />

which would yield GUIDs like this:

  • tag:daizucms.org,2006-04-25:23
  • tag:ungwe.org,2006:61
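To make the lookup concrete, here is a minimal Python sketch of how a tag URI could be assembled from that configuration. It is not Daizu's actual code: the names GuidEntity, pick_entity and mint_tag_uri are hypothetical, the file paths are made up, and the rule that the longest matching path prefix wins is an assumption about how the path attribute is interpreted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GuidEntity:
    """One guid-entity element (hypothetical in-memory form)."""
    entity: str                 # e.g. "daizucms.org,2006-04-25"
    path: Optional[str] = None  # None marks the default entity


def pick_entity(entities: List[GuidEntity], file_path: str) -> str:
    """Return the tagging entity for file_path.

    Assumption: the entry whose path is the longest directory prefix
    of file_path wins; otherwise the default (path-less) entry is used.
    """
    best = None
    for e in entities:
        if e.path is None:
            continue
        prefix = e.path.rstrip("/") + "/"
        if file_path == e.path or file_path.startswith(prefix):
            if best is None or len(e.path) > len(best.path):
                best = e
    if best is not None:
        return best.entity
    for e in entities:
        if e.path is None:
            return e.entity
    raise ValueError("no default guid-entity configured")


def mint_tag_uri(entities: List[GuidEntity], file_path: str, file_id: int) -> str:
    """Build 'tag:' taggingEntity ':' specific, with the ID as the specific part."""
    return f"tag:{pick_entity(entities, file_path)}:{file_id}"


# The two entities from the configuration above.
ENTITIES = [
    GuidEntity(entity="daizucms.org,2006-04-25"),
    GuidEntity(entity="ungwe.org,2006", path="ungwe.org"),
]

# Hypothetical file paths; the IDs match the examples above.
print(mint_tag_uri(ENTITIES, "docs/index.html", 23))
print(mint_tag_uri(ENTITIES, "ungwe.org/blog/article.html", 61))
```

Matching on a whole directory prefix (with a trailing slash) rather than a plain string prefix means a path such as ungwe.organic/ would not pick up the ungwe.org entity by accident.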

I've had to specify a more precise date for the daizucms.org domain, since I only just registered it: a bare ‘2006’ would mean 1 January 2006, before the domain was mine, and you're not meant to use dates in the future. Later I can shorten the date in the configuration, for example to just ‘2007’ next year. That won't change any of the existing GUIDs, but will be used for any new ones.
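The date rule can be checked mechanically. The following Python sketch is only an illustration: the function names are hypothetical, and the owned_since value stands in for the registration date, which I'm assuming here to be 2006-04-25 purely for the example. It relies on the RFC 4151 convention that an omitted month or day is read as the first of the year or month.

```python
import re
from datetime import date
from typing import Optional

# RFC 4151 date forms: YYYY, YYYY-MM or YYYY-MM-DD.
_DATE_RE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def entity_date(entity: str) -> date:
    """Return the date a tagging entity refers to.

    Omitted month or day defaults to 1, so 'ungwe.org,2006'
    means 1 January 2006.
    """
    authority, sep, date_part = entity.rpartition(",")
    m = _DATE_RE.fullmatch(date_part)
    if not sep or not authority or m is None:
        raise ValueError(f"not a valid tagging entity: {entity!r}")
    year, month, day = (int(g) if g else 1 for g in m.groups())
    return date(year, month, day)


def date_is_usable(entity: str, owned_since: date,
                   today: Optional[date] = None) -> bool:
    """True if the entity's date is neither before ownership nor in the future."""
    today = today or date.today()
    return owned_since <= entity_date(entity) <= today


registered = date(2006, 4, 25)  # assumed registration date, for illustration
now = date(2006, 4, 25)

print(date_is_usable("daizucms.org,2006-04-25", registered, now))  # precise date
print(date_is_usable("daizucms.org,2006", registered, now))        # means 1 Jan 2006
print(date_is_usable("daizucms.org,2007", registered, now))        # still in the future
print(date_is_usable("daizucms.org,2007", registered, date(2007, 1, 1)))
```

A check like this only needs to run against the configuration. GUIDs that have already been minted keep whatever entity they were given, which is why shortening the date later is safe.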
