RE: [syndication] site-wide metadata discovery
- To: <syndication@yahoogroups.com>
- Subject: RE: [syndication] site-wide metadata discovery
- From: "Chad Everett" <yahoogroups@jayseae.cxliv.org>
- Date: Thu, 16 Oct 2003 13:40:41 -0400
- Importance: Normal
- In-reply-to: <350411BE3B7FF4488CA7797C9F1E90A6D401D9@exchange.ibmi.informedbeverage.com>
I was thinking something similar about another file, and referencing that
file from robots.txt, instead of trying to make the references right in
robots.txt. I got hung up because it would cause an extra lookup to get
that file after finding the right file name, and that might not be ideal.
If we're doing that, why not use a META tag in the index document to point
to it, instead of mucking with robots.txt at all?
Plus, it sorta infringes on the elegance of doing it all within robots.txt.
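For comparison, here is a rough sketch of what the META tag route might look like on the consuming side. The tag name "feed-index" is purely hypothetical (nothing in this thread defines one), and the function and class names are made up for illustration; it only uses Python's standard html.parser module.

```python
from html.parser import HTMLParser


class MetaPointerFinder(HTMLParser):
    """Collect the content of <meta> tags whose name matches meta_name."""

    def __init__(self, meta_name):
        super().__init__()
        self.meta_name = meta_name.lower()
        self.pointers = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name == self.meta_name and attr.get("content"):
            self.pointers.append(attr["content"])


def find_meta_pointers(html, meta_name="feed-index"):
    # "feed-index" is an assumed, hypothetical META name.
    finder = MetaPointerFinder(meta_name)
    finder.feed(html)
    return finder.pointers
```

The catch is the same extra lookup: the client has to fetch and parse the index document before it learns where the list lives.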
Another thought would be to use comments in the robots.txt file. Something
akin to embedding javascript in HTML comments. Comments are widely accepted
in robots.txt and shouldn't cause any harm. Something like:
#Public-Feeds: myPublicFeeds.opml
#Format: http://www.opml.org/spec
#Title: Public Feed List
#Public-Feeds: feedindex.xml
#Format: http://purl.org/ocs/directory/0.5/
#Title: Public Feed List
#Subscriptions: mySubscriptions.opml
#Format: http://www.opml.org/spec
#Title: My Subscriptions
Perhaps even standardize on a line beginning with a comment, so something
like:
User-Agent: googlebot # what googlebot can't get
Disallow: / # everything
Would still be allowed in the file and not slow down processing by checking
every comment for one of the reserved words. Using multiple comment
characters (##, ###, ####) or some special sequence (#$, #*#, #@) could
accomplish the same thing if the "beginning of line" trick wouldn't work.
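To make the "beginning of line" idea concrete, here is a minimal sketch of how a reader might pick these directives out of robots.txt. The reserved words are just the ones from my example above, and the function name and the shape of the result are assumptions for illustration, not anything agreed on.

```python
# Reserved words are assumptions taken from the examples in this thread.
RESERVED = {"public-feeds", "subscriptions", "format", "title"}


def parse_comment_directives(robots_txt):
    """Return a list of dicts built from '#Key: value' lines."""
    entries = []
    for line in robots_txt.splitlines():
        # Only a comment in column one can carry a directive, so trailing
        # comments such as "Disallow: / # everything" are never inspected.
        if not line.startswith("#"):
            continue
        key, sep, value = line[1:].partition(":")
        key = key.strip().lower()
        if not sep or key not in RESERVED:
            continue  # an ordinary comment, ignore it
        value = value.strip()
        if key in ("format", "title"):
            # Attributes attach to the most recent list entry, if any.
            if entries:
                entries[-1][key] = value
        else:
            entries.append({"type": key, "href": value})
    return entries
```

Existing robots.txt parsers should see nothing but comments, and a reader that understands the directives only looks at lines starting with "#".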
I think that having known values would be most useful.
-----Original Message-----
From: Bill Kearney [mailto:wkearney@syndic8.com]
Sent: Thursday, October 16, 2003 11:10 AM
To: syndication@yahoogroups.com
Subject: Re: [syndication] site-wide metadata discovery
I'd favor using a single directive inside robots that pointed to an external
document that had all of what you're suggesting.
That avoids the issues of breaking existing robots.txt parsers and running
afoul of size restrictions, not to mention the inevitable technical
arguments about "polluting the purity of robots.txt".
And one unexpectedly nice side effect:
User-agent: *
Disallow: /stupidFixedName.url
We're trying to get a foothold into letting existing sites with lots of
content that are now bringing lots of RSS online share that index with us.
Let's not go overboard and freak them out entirely.
But yes, the ideas you're suggesting regarding subscriptions and lists are
useful.
What might be useful would be to have a range of 'known list types' be
established.
http://whatever.example.com/fdml/1.0#subscriptions
http://whatever.example.com/fdml/1.0#site-list
http://whatever.example.com/fdml/1.0#users
Etc... (these are just examples, not valid URIs)
This would let them be used in a number of different contexts as a way to
give meaning (semantics) to the data. Seeing this applied to a list would
mean it's a list of those things. The spec's definitions of those things
would explain just what's being asserted.
-Bill Kearney
----- Original Message -----
From: "Chad Everett" <yahoogroups@jayseae.cxliv.org>
To: <syndication@yahoogroups.com>
Sent: Thursday, October 16, 2003 10:39 AM
Subject: RE: [syndication] site-wide metadata discovery
> Sorry, my mind is wandering a bit further down this path. These are
> perhaps premature questions, as if we can't do anything with robots.txt,
> they are irrelevant. But my mind's wandering anyway. :)
>
> So... If this is something usable, I wonder if it could be used for more
> than this one discovery? If yes, what would be the best way to go about
> it? Would something like this work/be interesting/be useful?
>
> Public-Feeds: myPublicFeeds.opml
> Format: http://www.opml.org/spec
> Title: Public Feed List
>
> Public-Feeds: feedindex.xml
> Format: http://purl.org/ocs/directory/0.5/
> Title: Public Feed List
>
> Subscriptions: mySubscriptions.opml
> Format: http://www.opml.org/spec
> Title: My Subscriptions
>
> This might make it a bit easier (and clearer) to add attributes to
> individual offerings, rather than trying to cram them all onto one line.
> It's been mentioned a few times that this is a Bad Thing with OPML, so why
> perpetuate that here if it's a feature that isn't desired?
>
> It could also allow for multiple instances of the same data in different
> formats.
>
> Or would something like this be better:
>
> Site-Index: default
> Public-Feeds: myPublicFeeds.opml
> Public-Feeds-Format: http://www.opml.org/spec
> Public-Feeds-Title: Public Feed List
> Public-Feeds: feedindex.xml
> Public-Feeds-Format: http://purl.org/ocs/directory/0.5/
> Public-Feeds-Title: Public Feed List
> Subscriptions: mySubscriptions.opml
> Subscriptions-Format: http://www.opml.org/spec
> Subscriptions-Title: My Subscriptions
>
> I think this might have more problems processing multiple formats, since
> there is no clearly defined delimiter in between similar entries. I also
> think it looks a lot less clean than the other method.
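A small sketch of why the first layout is easier to process: with a blank line between entries, each block becomes one record and repeated keys never collide. The function name and the dict-per-record result are assumptions for illustration only.

```python
def parse_blocks(text):
    """Split blank-line-delimited 'Key: value' blocks into dicts."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current record, if one is open.
            if current:
                records.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records
```

With the flat, prefixed layout the same loop would have to guess where one Public-Feeds entry ends and the next begins, which is the missing delimiter problem described above.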