
Re: [syndication] Re: Robot Discovery



Right. The problem is, for a robot to go and discover this, it has to
spider the site, which isn't very efficient. Or, you can put
references in the 'root' file to all of the user files, but then you
still need to maintain all of those links, which kind of gets us back
to square one; you might as well put the metadata in the Web pages
themselves.

The difference with .htaccess is that the Web server already knows the
directory structure of the site; a spider with HTTP as its only
interface doesn't, unless you describe that structure to it explicitly.
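To make the trade-off concrete, here's a minimal sketch of how a robot would consume the two-level layout Julian describes below (a root /userlist.xml pointing at each user's /user/discovery.xml). The element and attribute names are hypothetical, since the thread doesn't define a schema; the point is that the root file must enumerate every user, which is exactly the link-maintenance burden in question.

```python
# Sketch of two-level robot discovery: a root userlist file that
# enumerates each user's discovery file. Element/attribute names
# ("userlist", "user", "href") are assumptions, not a defined schema.
import xml.etree.ElementTree as ET

USERLIST = """<userlist>
  <user href="http://www.site.com/alice/discovery.xml"/>
  <user href="http://www.site.com/bob/discovery.xml"/>
</userlist>"""

def discovery_urls(userlist_xml):
    """Return the per-user discovery URLs listed in the root file."""
    root = ET.fromstring(userlist_xml)
    return [u.get("href") for u in root.findall("user")]

# A robot fetches the root file once, then each URL it lists --
# no site-wide spidering needed, but the list must be kept current.
print(discovery_urls(USERLIST))
```

A robot only has to fetch the root file plus one file per user, but the site operator has to keep the root list in sync with the user population, which is the maintenance cost discussed above.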



On Wed, Oct 03, 2001 at 10:13:21PM +0100, Julian Bond wrote:
> In article <9pffi0+jhr1@eGroups.com>, doug@rds.com writes
> >One thing to consider is large/deep sites, such as multi-user sites 
> >using URLs of the form www.site.com/user. If there are thousands of 
> >users, a global discovery file could be huge. Perhaps what's needed 
> >is one per user, such as www.site.com/user/discovery.xml.
> 
> The way I constructed this (and you), it wouldn't be a problem.
> /userlist.xml       //list of user's discovery files
> /user/discovery.xml //xml for one user.
> 
> I'm not sure if robots.txt has a similar idea. Is it supposed to
> work like .htaccess, so that sub-directories override parent
> directories?
> 
> -- 
> Julian Bond    email: julian_bond@voidstar.com
> CV/Resume:         http://www.voidstar.com/cv/
> WebLog:               http://www.voidstar.com/
> HomeURL:      http://www.shockwav.demon.co.uk/ 
> M: +44 (0)77 5907 2173  T: +44 (0)192 0412 433
> ICQ:33679568 tag:So many words, so little time
> 
>  
> 
> 
> 

-- 
Mark Nottingham
http://www.mnot.net/