
RE: [syndication] site-wide metadata discovery



> > What are the circumstances in which robots.txt (or any other well-known
> > file) will be needed?
>
> Any time something programmatic is going to embark on discovering
> or otherwise
> consuming 'some quantity' of URLs from a site.  We're pretty
> close to that here.

Ok, supplementary questions:

	How did the agent 'hear' about the site?

	Where does the agent look first? (...and why isn't there a link tag there?)
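For concreteness, here's roughly what the `<link>`-based answer to that second question looks like in practice: the agent fetches whatever page it heard about, then scans the `<head>` for autodiscovery links. A minimal sketch using only the Python standard library; the `rel="alternate"` convention shown is the common feed-autodiscovery usage, and the sample HTML is of course invented for illustration.

```python
from html.parser import HTMLParser

class LinkDiscoverer(HTMLParser):
    """Collect href values from <link rel="alternate" ...> tags."""
    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        # Autodiscovery convention: rel="alternate" plus an href.
        if a.get("rel", "").lower() == "alternate" and "href" in a:
            self.feeds.append(a["href"])

# Invented sample page standing in for whatever URL the agent heard about.
html = """<html><head>
<link rel="alternate" type="application/rss+xml" href="/index.rss">
</head><body></body></html>"""

parser = LinkDiscoverer()
parser.feed(html)
print(parser.feeds)  # ['/index.rss']
```

The point of the questions above is that if the agent already has *a* page URL in hand, this mechanism works without any well-known file at all.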


> > When will it not be possible/desirable to use <link>?
>
> Sites that have fractious and politically charged web site
> maintenance policies.
> Getting crap onto the "home page" or buried into subsequent pages
> is often mired
> in endless amounts of political maneuvering.  Doing the same for
> robots.txt is
> usually nowhere near as difficult.
>
> As in, the website admins have free rein inside robots.txt but
> have little or
> no influence over the 'design' of the HTML pages.  Thus they're
> saddled with the
> burden of 'educating' the powers-that-be.  We can all imagine why
> this is a
> less-than-ideal way to engender their cooperation.
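To make the robots.txt side of the argument concrete: since the admins can edit that file freely, a discovery pointer could live there as an extra field. A rough sketch of what consuming such a field might look like; note that `Metadata:` is a purely hypothetical field name invented here for illustration, not part of the robots.txt format, and the sample file is likewise made up.

```python
def metadata_pointers(robots_txt):
    """Return the value of every hypothetical 'Metadata:' line,
    ignoring comments and surrounding whitespace."""
    pointers = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.lower().startswith("metadata:"):
            pointers.append(line.split(":", 1)[1].strip())
    return pointers

# Invented sample robots.txt with the hypothetical pointer field.
sample = """User-agent: *
Disallow: /private/
# site-wide metadata pointer (hypothetical field)
Metadata: /siteinfo.rdf
"""

print(metadata_pointers(sample))  # ['/siteinfo.rdf']
```

The attraction, per the argument above, is purely political: this file is the one place the sysadmins already control.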

Ok, that makes sense.
Thanks for reminding me why I'm not doing sysadmin any more ;-)

Cheers,
Danny.