
Persistence of items



Let's say you take an RSS feed from one source, collect it regularly,
and store the results in a DBMS. The source updates the feed by adding
items at the top and knocking them off the bottom so there are always
15. Is there any easy way of identifying a particular item between
updates? I want the item to be persistent in my DBMS so I can do other
things with it.

Is the only way an exact match on the whole item, so that the key would
be title+link+description for my channel_id?
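
For what it's worth, a minimal sketch of that whole-item key in Python.
The function name item_key and its parameters are made up for
illustration and not taken from any RSS library. It hashes the
whitespace-normalized fields into one fixed-length string that could sit
in a unique column:

```python
import hashlib

def item_key(channel_id, title="", link="", description=""):
    """Return a stable 40-character hex key for one item in one channel.

    Hypothetical helper: names and layout are assumptions for this sketch.
    """
    # Collapse runs of whitespace so trivial reformatting keeps the same key.
    parts = [" ".join((p or "").split()) for p in (title, link, description)]
    # Join with NUL, which is unlikely to appear in feed text, so that
    # ("ab", "c") and ("a", "bc") do not collide.
    raw = "\x00".join([str(channel_id)] + parts)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()
```

The obvious weakness is that any real edit to the title or description
by the source produces a new key, so the same story would be stored
twice.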

In reality, how often does the link change for what is in fact the same
item? If it rarely does, I could just key on the link. But then RSS 0.92
made link optional, and Manila and RU RSS feeds typically have no
item.link, damn!
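
A rough sketch of the fallback this suggests, again in Python with
invented names (persistent_key, and an item represented as a plain
dict): key on the link when the item has one, otherwise fall back to a
hash of title and description. This assumes links are stable for the
same item, which is exactly the open question.

```python
import hashlib

def persistent_key(channel_id, item):
    """Pick a key for an item given as a dict (assumed layout) with
    optional 'title', 'link' and 'description' entries."""
    link = (item.get("link") or "").strip()
    if link:
        # Assumption: the link stays the same for the same item.
        return f"{channel_id}:link:{link}"
    # No link (as in link-less 0.92 feeds): hash what is left.
    text = "\x00".join((item.get(f) or "").strip()
                       for f in ("title", "description"))
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    return f"{channel_id}:hash:{digest}"
```

With this, an item that has a link keeps its key even if the title is
edited, while a link-less item only keeps it if the text is unchanged.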

A related problem is that I receive the same RSS item from several
feeds. That is, the link is the same, the title is usually the same and
the description is similar, if not the same. Is it reasonable in the
general case to say that if the item.link is the same, the item is
referring to the same source story? 
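
If the answer turns out to be yes, the grouping itself is simple. A
sketch in Python, with group_by_link and the (channel_id, item) pairs
being illustrative names of my own: it treats an identical link as the
same story and records which channels carried it.

```python
from collections import defaultdict

def group_by_link(items):
    """items: iterable of (channel_id, item_dict) pairs (assumed layout).

    Returns {link: [channel_id, ...]}. Items without a link are skipped,
    since there is nothing to match them on here.
    """
    stories = defaultdict(list)
    for channel_id, item in items:
        link = (item.get("link") or "").strip()
        if link:
            stories[link].append(channel_id)
    return dict(stories)

# Example: the same story arriving from two feeds.
seen = group_by_link([
    (1, {"title": "Story", "link": "http://example.com/s"}),
    (2, {"title": "Story!", "link": "http://example.com/s"}),
    (2, {"title": "No link here"}),
])
```

Note this does no URL normalization, so two links that differ only in a
trailing slash or a tracking parameter would count as different stories.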

-- 
Julian Bond eMail: julian@netmarketseurope.com
HomeURL: http://www.shockwav.demon.co.uk/ 
WorkURL: http://www.netmarketseurope.com/
WebLog: http://roguemoon.manilasites.com/
M: +44 (0)77 5907 2173  T: +44 (0)20 7420 4363  
ICQ:33679668 tag:So many words, so little time