discovery vs information
More thoughts on mypublicfeeds.
- I wonder if it would help to separate the discovery part of this from
the format part and try and solve these independently.
- I was thinking about use cases. A classic one for me is this. I was
looking at Zdnet UK, wireless section.
(http://news.zdnet.co.uk/communications/wireless/). I'm sure I remember
reading that there was some RSS on zdnet somewhere but I don't know
where, so what should I do next? There's no obvious XML link on the
visible page. So do I "view source" and look for <link> and/or start
looking for likely files containing lists in:-
http://news.zdnet.co.uk/communications/wireless/
http://news.zdnet.co.uk/communications/
http://news.zdnet.co.uk/
http://www.zdnet.co.uk/
In the absence of a *really widely* implemented standard like robots.txt
looking for files will just give me loads of 404s. Incidentally, there's
no robots.txt or favicon.ico in any of those zdnet directories either.
- There are enough different reasons for having lists, and enough
different things to list, that it feels to me like we need to solve this
generally and not just for RSS. Even for RSS, I feel the need to specify
which flavour each entry refers to.
- I think I've got three or four things I'd like to put in the header of
every html file. They're all optional and there might be more than one
of each.
1) Here's a pointer to a single alternate version of the same content
2) Here's a pointer to a machine-readable file containing lists of
alternate versions of this content or related content. eg RSS0.92,
RSS1.0, RSS2.0, Atom, WML, Author's FOAF, Assorted metadata
3) Here's a pointer to a machine-readable file containing lists of files
related to this section of the site
4) Here's a pointer to a machine-readable file containing lists of files
related to this whole site.
1) needs more work on identifying the type of the target file. Not type
as in text/plain vs text/xml, but type as in RSS0.92 vs FOAF vs Atom.
2), 3) and 4) need work on the markup approach and standard. I don't
think any of rdfs:seeAlso, OPML or OCS are actually good enough or
complete enough yet. If this is going to be general it needs to solve a
whole load of cases now and it absolutely must handle new file types.
And it had better be really simple to parse and produce too. There's
going to be a temptation to start adding all sorts of meta-meta-data
about each entry. Please resist this. It should be a simple list of file
type, name, URI. Any additional meta-data about each file should be
contained in the files themselves and available by collecting them.
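To make the "simple list of file type, name, URI" idea concrete, here is a minimal sketch in Python. The tab-separated layout, the Entry tuple and the function names are assumptions made purely for illustration; nothing above settles on a syntax, and XML or RDF would do equally well. The point is only that three fields per entry are easy to parse and produce, and that a free-form type label lets new file types appear without changing the parser.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One list entry: a free-form type label, a display name, and a URI."""
    file_type: str  # e.g. "RSS2.0" or "FOAF"; a label, not a MIME type
    name: str
    uri: str


def dump_entries(entries):
    """Produce one tab-separated 'type, name, URI' line per entry."""
    lines = []
    for entry in entries:
        for field in entry:
            # Tabs and newlines are the delimiters of this hypothetical format.
            if "\t" in field or "\n" in field:
                raise ValueError(f"field contains a tab or newline: {field!r}")
        lines.append("\t".join(entry) + "\n")
    return "".join(lines)


def parse_entries(text):
    """Parse text written by dump_entries back into Entry tuples."""
    entries = []
    for line_number, line in enumerate(text.split("\n"), start=1):
        if line == "":
            continue  # skip blank lines, including the one after the last newline
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 fields, got {len(fields)}"
            )
        entries.append(Entry(*fields))
    return entries
```

The parser never inspects the type label, so an entry with a type invented next year parses the same way as "RSS0.92" does today. Anything beyond the three fields stays in the target files themselves.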
3) and 4) are actually about metadata describing the directory
structure. I think there is a case here for the W3C to come up with a
standard way for this to be found and created. If they haven't already.
It feels like there's a case for a standard file with a standard name in
each directory. This would actually help robots because it could contain
sitemap lists of pages.
However, I think we actually already have a standard here. And that's
that web requests to directories with no file name should return
something via http and with a mime type. Either a web page, an index, a
404, a graphic or whatever. Now if the returned doc is of type text/html
then we're back to <link>.
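A rough sketch of that fallback in Python, using only the standard library: walk from a page's directory up to the site root, and for each HTML document that comes back, collect its <link> elements. The function and class names are made up for this example, and fetching the pages is left out so the sketch stays self-contained.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit


class LinkCollector(HTMLParser):
    """Collect the attributes of every <link> element in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "link":
            self.links.append(dict(attrs))


def find_links(html_text):
    """Return a list of attribute dicts, one per <link> element found."""
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.links


def parent_directories(url):
    """Yield the URL's directory, then each parent, ending at the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # A path that does not end in "/" is treated as naming a file: drop it.
    if segments and not parts.path.endswith("/"):
        segments.pop()
    while True:
        path = "/" + "/".join(segments) + ("/" if segments else "")
        yield urlunsplit((parts.scheme, parts.netloc, path, "", ""))
        if not segments:
            break
        segments.pop()
```

Calling parent_directories on the wireless section URL yields the first three addresses in the list near the top of this message. The fourth, on www.zdnet.co.uk, is a different host and cannot be derived from the path alone. Each attribute dict from find_links carries whatever rel, type, title and href the page author supplied, which is exactly where the "type as in RSS0.92 vs Atom" gap shows up.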
Aside: I wonder if creating new http headers is out of the question? ;-)
So. I think we need the following:-
1) Some standards for specifying target file content type in <link>
2) As well as RSSx.xx, Atom, NITF, FOAF etc, create some content types
for cases 1,2,3,4 above.
3) A defined way of creating new content types.
4) A standard file format for lists of <link> entries. This needs to include a
section which is metadata about this list. We can probably do this in a
way that the entries can be inserted into all sorts of other types of
files.
Once we've got this far, then we can move to stage 5) viz. evangelising;
writing toolkits; writing apps; writing validators; arguing about common
locations and file names; arguing about whether it should be xml or RDF
or both; arguing about what it all means; and all that other stuff we're
so good at.
This seems to me to be a bare minimum. Once we've done that if the fixed
filename camp want to create these files with a fixed filename on their
webservers, then they can go ahead. Just as long as they do <link> too.
The fixed filename standard can then succeed or fail on its own merits
without killing the file format standard in the process.
--
Julian Bond Email&MSM: julian.bond@voidstar.com
Webmaster: http://www.ecademy.com/
Personal WebLog: http://www.voidstar.com/
M: +44 (0)77 5907 2173 T: +44 (0)192 0412 433