RE: [syndication] site-wide metadata discovery
- To: <syndication@yahoogroups.com>
- Subject: RE: [syndication] site-wide metadata discovery
- From: "Chad Everett" <yahoogroups@jayseae.cxliv.org>
- Date: Thu, 16 Oct 2003 15:09:10 -0400
- Importance: Normal
- In-reply-to: <350411BE3B7FF4488CA7797C9F1E90A6D401EB@exchange.ibmi.informedbeverage.com>
The idea of a hook to a "metadata index file" definitely has appeal. We
don't clutter robots.txt, we don't mandate a particular file name, and the
hook could even be used for individual sections of a domain.
Consider:
#Site-Index: metadata.xml
#Index-Format: http://purl.org/ocs/directory/0.5/
#Allow-Recursion: true
This could tell an application that the site index lives in a file named
metadata.xml, is in OCS 0.5 format, and may exist at each level of the
domain (example.com/metadata.xml and example.com/folder/metadata.xml). So
if the application is browsing something other than the root, it should
look for the file there too, as it might carry some different values.
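To make that concrete, here is a rough Python sketch of how an application
might pull these hints out of a robots.txt body. The three directive names
are the ones proposed above; the function name parse_hints and the choice to
ignore every other comment line are my own assumptions, not part of any
agreed format.

```python
# Directive names as proposed in this thread (compared case-insensitively).
KNOWN_HINTS = {"site-index", "index-format", "allow-recursion"}

def parse_hints(robots_txt):
    """Return (key, value) pairs for '#Key: value' hint lines, in file order.

    Hypothetical helper: ordinary robots.txt rules and unrelated comments
    are skipped, since only the proposed hint names are recognised.
    """
    hints = []
    for line in robots_txt.splitlines():
        line = line.strip()
        if not line.startswith("#") or ":" not in line:
            continue
        # Split on the first colon only, so URL values stay intact.
        key, _, value = line[1:].partition(":")
        key = key.strip().lower()
        if key in KNOWN_HINTS:
            hints.append((key, value.strip()))
    return hints
```

Because the hints are comments, a robots.txt parser that knows nothing about
them should simply ignore the lines.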
Meanwhile:
#Site-Index: metadata.xml
#Index-Format: http://purl.org/ocs/directory/0.5/
#Allow-Recursion: false
This says the same thing, except that recursion is not allowed. Even if
metadata.xml exists at lower levels of the domain, don't bother looking for
it, as we are telling you not to. Some larger sites might like this sort of
feature, as it could limit requests for data.
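A minimal sketch of what the recursion flag could mean for a client, in
Python. The function name candidate_index_urls is made up for illustration,
and the deepest-first ordering is my assumption; nothing above says which
level should be tried first.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

def candidate_index_urls(page_url, index_name, allow_recursion):
    """List the index URLs a client might try for the page being browsed.

    With recursion off, only the root-level index is returned. With
    recursion on, every directory level from the page's folder up to the
    root is returned, deepest first (an assumed ordering).
    """
    parts = urlsplit(page_url)

    def build(path):
        return urlunsplit((parts.scheme, parts.netloc, path, "", ""))

    if not allow_recursion:
        return [build("/" + index_name)]

    path = parts.path or "/"
    # Treat a trailing slash as a directory, otherwise drop the file name.
    directory = path if path.endswith("/") else posixpath.dirname(path)
    segments = [s for s in directory.split("/") if s]

    urls = []
    for depth in range(len(segments), -1, -1):
        prefix = "/".join(segments[:depth])
        urls.append(build("/" + (prefix + "/" if prefix else "") + index_name))
    return urls
```

For a page under example.com/folder/, recursion on yields the folder-level
file and then the root-level file; recursion off yields only the root-level
file, which is where the request savings for larger sites would come from.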
Another advantage to this is that the index itself could also be provided in
different formats, allowing for preferred formats or even selective parsing
of the data. If the application sees:
#Site-Index: metadata.xml
#Index-Format: http://purl.org/ocs/directory/0.5/
#Site-Index: metadata.opml
#Index-Format: http://www.opml.org/spec
and the application supports neither opml nor ocs, it can skip fetching
either file, because it won't be able to parse the data anyway. That ends
the random reaching for files that might or might not be useful. Of course,
there will be some bad behavior where the doc doesn't match the declared
format, or the app grabs the data anyway. I don't know that that can be
prevented, but this ought to help the situation greatly. There is no excuse
for bad behavior if you see a format you can't parse and still choose to
fetch the file...
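Roughly, the selection step could look like the Python below. It assumes
each #Site-Index line is described by the #Index-Format line that follows
it, which is how the example reads but is not spelled out anywhere; the
helper names pair_indexes and fetchable are hypothetical.

```python
def pair_indexes(lines):
    """Pair each '#Site-Index:' line with the next '#Index-Format:' line.

    Assumption: a format line always describes the most recent index line.
    An index line with no format line after it is dropped.
    """
    pairs = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith("#Site-Index:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("#Index-Format:") and current is not None:
            pairs.append((current, line.split(":", 1)[1].strip()))
            current = None
    return pairs

def fetchable(pairs, supported_formats):
    """Return only the index files whose declared format the app can parse."""
    return [name for name, fmt in pairs if fmt in supported_formats]

hints = [
    "#Site-Index: metadata.xml",
    "#Index-Format: http://purl.org/ocs/directory/0.5/",
    "#Site-Index: metadata.opml",
    "#Index-Format: http://www.opml.org/spec",
]
pairs = pair_indexes(hints)
```

An application that handles only ocs would call fetchable(pairs,
{"http://purl.org/ocs/directory/0.5/"}) and request just metadata.xml; one
that handles neither format gets an empty list and makes no request at all.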
Naturally this also raises two more questions:
- What is the default for recursion? True or false? False may mean less
traffic. True may mean more results (also could mean more errors!).
- What the heck format will support storing all the metadata for a site?
Consider that it might contain feeds, but other useful information could be
in there as well - subscriptions, location of foaf, pointer to rsd, etc...
-----Original Message-----
From: Danny Ayers [mailto:danny666@virgilio.it]
Sent: Thursday, October 16, 2003 2:46 PM
To: syndication@yahoogroups.com
Subject: RE: [syndication] site-wide metadata discovery
Nah, I think a little hook is elegant.