
Re: Finding Feeds

To: syndication@yahoogroups.com
Subject: Re: Finding Feeds
From: "Bill Kearney" <wkearney99@hotmail.com>
Date: Wed, 03 Oct 2001 02:19:00 -0000
In-reply-to: <20011002171111.C22003@mnot.net>
User-agent: eGroups-EW/0.82

> > My point here was being able to find THIS ONE ITEM in a feed.
> > Presuming you're viewing a single item then the meta-data would
> > indicate where to find this ONE item in a feed.  If you were
> > looking at a number of items, a la Slashdot's opening screen, then
> > you'd most likely want the entire feed (or just the currently
> > viewed scope).
> 
> I am missing your point ;)  Couldn't a particular item be identified
> by the tuple of the feed URI and the item URI, like
> 
>  ( "http://example.com/feed.rss", "http://example.com/item5.html")
> 
> ? This might be serialised in the HTML for the page something like
> 
>   <link rel="rss-feed" href="http://www.example.com/feed.rss">
>   <link rel="rss-item" href="http://example.com/item5.html">
> 
> so you could then find the rss-item in the rss-feed.
> 
> Or is it a matter of just putting IDREFs in the RSS, like
> 
> <item id="5">
>   ...
> </item>
> 
> and then linking to http://example.com/feed.rss#5
> (yeah, yeah, this should be XPointer, but the idea is there)
> 
> If you identify by URI, it's guaranteed unique, except that if you
> have multiple items referring to the same URI in the feed, they'll
> 'overwrite' each other. If you use IDREFs, it's unique within the
> *current* view of the feed, but not outside of it (unless the server
> guarantees them to be unique over time, like an ETag in HTTP).
> 
> > The missing link in my idea is that feeds don't generally support
> > the idea of one item being located this way.  Take it one step
> > further and give me a way to grab the XML data for the item
> > instead of the HTML presentation.
> 
> Can you give a real-world example of how you'd want to use this? I
> might be confusing a few different threads here...

Let's take it from the top, shall we?  You're looking at a page that 
contains news items.  You'd like to know if it has a feed available.  
Something in the page is available for your enlightened browser to 
determine the feed source.   That would satisfy your browser idea and 
I think it's a good one.

To go another step, let's say you wanted to redirect one of the items 
on the page to some other destination, be that a mail message, a 
blog, an instant message, an aggregator or something else.  How do 
you extract that message without resorting to 
some very imprecise scraping?  This is where I'd like to be able to 
get 'back' to the source of the data using a programmatic interface.  
In order to keep the web pages 'lightweight' it seems like it would 
be better for this not to be a mass of meta-data tacked into the 
HTML.  I'd like the page to have an XML source URI and an item 
identifier to be applied against it.

One reason for wanting this is to get at more data than the HTML 
view shows.  Say I'm on a WAP phone, for example, and I want to 
redirect the full feed material to an aggregator.  Scraping just the 
HTML would leave out a LOT of material.  Sending it all as meta-data 
would overwhelm the memory in the phone (AvantGo limits, anyone?).  
But if the phone supported detecting this source meta-data and had a 
way to push that, well, then we'd be getting somewhere!

Don't let the data die.  

Stop pushing it into presentation formats only to be poorly scraped.  
Provide a way to get back to the original data in an XML format.

You raise a good point about the identifier not being feed-oriented.  
Making it unique within the feed seems like a reasonable start.  
Forcing global uniqueness doesn't seem necessary and would probably 
irritate too many people.  Using a GUID isn't hard but that's another 
debate.

This wouldn't be very hard to code.  Most of the stuff is coming out 
of databases now.  They've usually got some form of ID on the 
record.  Put that into a meta-data structure inside the HTML.  Put 
the source URI in the page itself.  Put the item ID in a meta-tag 
with the item.  Browsers support scripting that can extract this data.
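
As a rough sketch of the extraction side, here is how a client might 
pull the two values out of a page, written in Python rather than 
browser script.  The rel="rss-feed" link follows the example quoted 
above; the meta name "rss-item-id" and the function names are made 
up for illustration, not anything a browser or feed format defines.

```python
from html.parser import HTMLParser


class FeedSourceParser(HTMLParser):
    """Collects the feed URI and item ID advertised in a page."""

    def __init__(self):
        super().__init__()
        self.feed_uri = None
        self.item_id = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # rel="rss-feed" follows the <link> example quoted in this thread.
        if tag == "link" and attrs.get("rel") == "rss-feed":
            self.feed_uri = attrs.get("href")
        # name="rss-item-id" is a hypothetical name, used only for this sketch.
        elif tag == "meta" and attrs.get("name") == "rss-item-id":
            self.item_id = attrs.get("content")


def find_feed_source(html_text):
    """Return (feed_uri, item_id); either may be None if not advertised."""
    parser = FeedSourceParser()
    parser.feed(html_text)
    return parser.feed_uri, parser.item_id


page = """
<html><head>
  <link rel="rss-feed" href="http://example.com/feed.rss">
  <meta name="rss-item-id" content="5">
</head><body><p>Item five, as rendered for people.</p></body></html>
"""

print(find_feed_source(page))
```

The page only carries two short strings, which keeps it lightweight; 
everything else stays in the feed.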

Then put up a SOAP interface or even a simple CGI that will return 
the data in XML.
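
The server side could be about this small.  This is only a sketch in 
Python: it assumes the feed is plain, non-namespaced RSS and that 
each item carries an id attribute as in the <item id="5"> example 
quoted above.  The sample feed and the function name item_xml are 
invented for illustration.

```python
import xml.etree.ElementTree as ET

# Sample feed, invented for this sketch.
FEED = """<rss version="0.91">
  <channel>
    <title>Example</title>
    <item id="4"><title>Older story</title><link>http://example.com/item4.html</link></item>
    <item id="5"><title>Story five</title><link>http://example.com/item5.html</link></item>
  </channel>
</rss>"""


def item_xml(feed_text, item_id):
    """Return the XML for the one item whose id attribute matches, or None."""
    root = ET.fromstring(feed_text)
    for item in root.iter("item"):
        if item.get("id") == item_id:
            # strip() drops the whitespace that trailed the element in the feed.
            return ET.tostring(item, encoding="unicode").strip()
    return None


print(item_xml(FEED, "5"))
```

A CGI or SOAP wrapper would just take the item ID from the request, 
call something like this against the stored feed, and send the 
result back as text/xml.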

-Bill Kearney
