
Re: [syndication] site-wide metadata discovery



> What are the circumstances in which robots.txt (or any other well-known
> file) will be needed?

Any time something programmatic is going to embark on discovering or otherwise
consuming 'some quantity' of URLs from a site.  We're pretty close to that here.

> When will it not be possible/desirable to use <link>?

Sites that have fractious and politically charged web site maintenance policies.
Getting anything onto the "home page" or buried into subsequent pages is often
mired in endless political maneuvering.  Doing the same for robots.txt is
usually nowhere near as difficult.

As in, the website admins have free rein inside robots.txt but have little or
no influence over the 'design' of the HTML pages.  Thus they're saddled with the
burden of 'educating' the powers-that-be.  We can all imagine why this is a
less-than-ideal way to engender their cooperation.
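For illustration, here's a minimal sketch of what an aggregator could do with a
robots.txt-based pointer.  The "Metadata:" field name and the example URL are
assumptions for the sake of the sketch, not part of any spec:

```python
# Hypothetical sketch: discovering a site-wide metadata URL via robots.txt.
# The "Metadata:" field name is an assumed convention, not a standard.

def find_metadata_urls(robots_txt: str) -> list[str]:
    """Return URLs from hypothetical 'Metadata:' lines in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Strip trailing comments and whitespace, as robots.txt parsers do.
        line = line.split("#", 1)[0].strip()
        if line.lower().startswith("metadata:"):
            urls.append(line.split(":", 1)[1].strip())
    return urls

example = """\
User-agent: *
Disallow: /private/
Metadata: http://example.com/site-metadata.rdf
"""
print(find_metadata_urls(example))
```

An admin who can't touch the HTML templates can still append one such line to
robots.txt, which is the whole appeal of the approach.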

-Bill Kearney
Syndic8.com