
Re: Being kind to clients



There are, as several posts have suggested, many different issues 
here.  

To serve as an all-around source of data, most feeds contain more 
than what your particular reader currently wants to use.  For a new 
reader this is the only way to get caught up.  For existing readers 
it means extra checking on the client side to avoid presenting 
duplicates.
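
A minimal sketch of that client-side duplicate check, assuming the 
reader keeps a set of identifiers for entries it has already shown.  
The function name and the choice of identifier (a guid or link, say) 
are made up for illustration, not taken from any particular reader:

```python
def drop_duplicates(entries, seen_ids):
    """Return the entries not seen before and record their ids.

    entries:  list of (entry_id, title) tuples.  entry_id is whatever
              the reader uses to identify an item (hypothetical here).
    seen_ids: set of ids already presented; updated in place.
    """
    fresh = []
    for entry_id, title in entries:
        if entry_id in seen_ids:
            # Already presented on an earlier read (or earlier in this one).
            continue
        seen_ids.add(entry_id)
        fresh.append((entry_id, title))
    return fresh
```

The reader still has to fetch and parse the whole feed before this 
check can run, which is the extra work described above.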

Some feeds are static, and their servers could produce HTTP 
timestamp headers.  Most readers aren't bothering to examine this 
info, and many servers aren't properly producing the header.  So it's 
a classic chicken-and-egg situation.  The readers should make some 
attempt to check for this value.  Most have the ability to store a 
date from the last time a feed was read.  That could be used to make 
an end-run and avoid loading the data (and the subsequent duplicate 
checking).  This would do next to nothing in the 'save bandwidth' 
department but would save local CPU time and avoid accuracy/duplicate 
issues.
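
As a rough sketch of that check, assuming the header in question is 
HTTP's Last-Modified and that the reader stores the time of its last 
successful read.  The function name is hypothetical, and how the 
header value is obtained is left out:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def feed_changed_since(last_modified_header, last_read):
    """Return True if the feed should be processed again.

    last_modified_header: raw Last-Modified value, or None if absent.
    last_read: timezone-aware datetime of the previous read, or None.
    """
    if last_modified_header is None or last_read is None:
        # Nothing to compare against: fall back to processing the feed.
        return True
    try:
        modified = parsedate_to_datetime(last_modified_header)
    except (TypeError, ValueError):
        # Malformed header: process the feed rather than skip it.
        return True
    if modified.tzinfo is None:
        # Assumption: treat a zone-less timestamp as UTC.
        modified = modified.replace(tzinfo=timezone.utc)
    return modified > last_read
```

A reader would call this before parsing and skip the duplicate 
checking entirely when it returns False.  Any missing or unreadable 
header falls back to the normal processing path, so a server that 
gets the header wrong costs nothing beyond what happens today.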

Static servers could help out by presenting proper headers.  Clients 
can help by using those headers to avoid duplicates.  

Dynamically generated feeds aren't going to be able to do this.  The 
content creation date is always the time the request was executed.  
A client wouldn't be able to use its locally cached timestamp to 
avoid duplicates.

How would a client 'know' whether a feed is dynamic or static?  It 
couldn't do it reliably.  Using the header timestamps would at least 
let it handle the static stuff.

Feed formats do support some publication date fields.  This feature 
(and many others) would go a long way toward helping clients make 
more informed decisions about what content to process.  It would do 
little to avoid wasting bandwidth.  The clients are faced with 
downloading the entire file just to see if that pubDate info has 
changed.  But it would help the client's duplicate checking.
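
Here is a small sketch of using pubDate that way, assuming an RSS 2.0 
style feed where each item carries a pubDate in the usual RFC 822 
form.  The sample feed and the function name are invented for 
illustration:

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Invented sample data, not a real feed.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example</title>
<item><title>Older post</title>
<pubDate>Mon, 05 Jan 2004 10:00:00 GMT</pubDate></item>
<item><title>Newer post</title>
<pubDate>Wed, 07 Jan 2004 10:00:00 GMT</pubDate></item>
</channel></rss>"""


def new_items(rss_text, last_read):
    """Return titles of items published after last_read (aware datetime)."""
    root = ET.fromstring(rss_text)
    fresh = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        pub = item.findtext("pubDate")
        if pub is None:
            # No date to compare: keep the item and let other checks decide.
            fresh.append(title)
            continue
        try:
            published = parsedate_to_datetime(pub.strip())
        except (TypeError, ValueError):
            # Unparseable date: keep the item rather than lose it.
            fresh.append(title)
            continue
        if published.tzinfo is None:
            # Assumption: treat a zone-less date as UTC.
            published = published.replace(tzinfo=timezone.utc)
        if published > last_read:
            fresh.append(title)
    return fresh
```

Note that the whole document is parsed before any item can be 
skipped, so this only trims the duplicate checking, not the download.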

Supporting more of the existing elements in the feeds would be the 
first big help.  Getting clients to check for header timestamps would 
be the next step.  Doing likewise on servers would be good to 
implement in concert with the clients.  

Dynamic feeds?  These, oddly enough, are the biggest problem.  They 
can't 'know' what you really need to find out.  They could, however, 
be among the first places that 'smarter' clients could make more 
intelligent requests.  This would require changes to the various 
formats.  Given how that's been just SUCH a fun exercise in the past, 
it might be harder to implement politically than technically.

Get your servers to provide better headers.
Get your clients to read the headers.
Use a more complete range of existing feed format elements.
Get your clients to use those elements.

These steps would make a lot more sense than reopening the whole 
format migration debacle.

-Bill Kearney