Re: Finding Feeds
> > My point here was being able to find THIS ONE ITEM in a feed.
> > Presuming you're viewing a single item then the meta-data would
> > indicate where to find this ONE item in a feed. If you were looking
> > at a number of items, a la Slashdot's opening screen, then you'd most
> > likely want the entire feed (or just the currently viewed scope).
>
> I am missing your point ;) Couldn't a particular item be identified
> by the tuple of the feed URI and the item URI, like
>
> ( "http://example.com/feed.rss", "http://example.com/item5.html")
>
> ? This might be serialised in the HTML for the page something like
>
> <link rel="rss-feed" href="http://www.example.com/feed.rss">
> <link rel="rss-item" href="http://example.com/item5.html">
>
> so you could then find the rss-item in the rss-feed.
>
> Or is it a matter of just putting IDs in the RSS, like
>
> <item id="5">
> ...
> </item>
>
> and then linking to http://example.com/feed.rss#5
> (yeah, yeah, this should be XPointer, but the idea is there)
>
> If you identify by URI, it's guaranteed unique, except that if you
> have multiple items referring to the same URI in the feed, they'll
> 'overwrite' each other. If you use IDs, it's unique within the
> *current* view of the feed, but not outside of it (unless the server
> guarantees them to be unique over time, like an ETag in HTTP).
>
> > The missing link in my idea is that feeds don't generally support the
> > idea of one item being located this way. Take it one step further
> > and give me a way to grab the XML data for the item instead of the
> > HTML presentation.
>
> Can you give a real-world example of how you'd want to use this? I
> might be confusing a few different threads here...

Let's take it from the top, shall we? You're looking at a page that
contains news items. You'd like to know if it has a feed available.
Something in the page is available for your enlightened browser to
determine the feed source. That would satisfy your browser idea and
I think it's a good one.

To go another step, let's say you wanted to redirect one of the items
on the page to some other destination. Be that destination a mail
message, a blog, an instant message, aggregator or some other
destination. How do you extract that message without resorting to
some very imprecise scraping? This is where I'd like to be able to
get 'back' to the source of the data using a programmatic interface.

In order to keep the web pages 'lightweight' it seems like it would
be better for this not to be a mass of meta-data tacked into the
HTML. I'd like the page to have an XML source URI and an item
identifier to be applied against it.
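
To make that concrete, here is a rough sketch of what the browser (or
a script in it) might do with such a page. It is Python purely for
illustration. The rel value "rss-feed" is borrowed from the message
quoted above; the meta name "rss-item-id" and the class and function
names are made up for this example and are not any standard.

```python
from html.parser import HTMLParser


class FeedHintParser(HTMLParser):
    """Collects the feed URI and item IDs advertised in a page.

    Assumed markup (hypothetical, not part of any spec):
      <link rel="rss-feed" href="...">
      <meta name="rss-item-id" content="...">
    """

    def __init__(self):
        super().__init__()
        self.feed_uri = None
        self.item_ids = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "rss-feed":
            # The XML source URI for the whole page.
            self.feed_uri = a.get("href")
        elif tag == "meta" and a.get("name") == "rss-item-id":
            # One small ID per item, instead of the full item data.
            self.item_ids.append(a.get("content"))


def find_feed_hints(html_text):
    """Return (feed_uri, [item_id, ...]) found in the page text."""
    parser = FeedHintParser()
    parser.feed(html_text)
    return parser.feed_uri, parser.item_ids
```

The page only carries a URI and a short ID per item, so it stays
light; everything else is fetched from the source on demand.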

One reason for wanting this is to get at more data than the HTML view
shows. Say I'm on a WAP phone and want to redirect the full feed
material to an aggregator. Scraping just the HTML would leave out a
LOT of material. Sending it all as meta-data would overwhelm the
memory in the phone (AvantGo limits, anyone?). But if the phone
supported detecting this source meta-data and had a way to push that,
well, then we'd be getting somewhere!

Don't let the data die.
Stop pushing it into presentation formats only to be poorly scraped.
Provide a way to get back to the original data in an XML format.

You raise a good point about the identifier not being feed-oriented.
Making it unique within the feed seems like a reasonable start.
Forcing global uniqueness doesn't seem necessary and would probably
irritate too many people. Using a GUID isn't hard but that's another
debate.

This wouldn't be very hard to code. Most of the stuff is coming out
of databases now. They've usually got some form of ID on the
record. Put that into a meta-data structure inside the HTML. Put
the source URI in the page itself. Put the item ID in a meta-tag
with the item. Browsers support scripting that can extract this data.
Then put up a SOAP interface, or even a simple CGI, that will return
the data in XML.
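
The lookup such a CGI would do could be as small as the sketch below
(again Python, only as an illustration). It assumes a plain,
un-namespaced feed whose <item> elements carry an id attribute, as in
the example quoted earlier; that attribute is this thread's idea, not
something RSS defines, and the function name is made up.

```python
import xml.etree.ElementTree as ET


def extract_item(feed_xml, item_id):
    """Return the matching <item> as XML text, or None if absent.

    Assumes items look like <item id="5">...</item>; the id attribute
    is hypothetical and only unique within the current feed view.
    """
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        if item.get("id") == item_id:
            # Hand back the original XML, not an HTML rendering of it.
            return ET.tostring(item, encoding="unicode")
    return None
```

Feed it the feed text and the ID pulled from the page, and what comes
back is the full item as the publisher wrote it.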
-Bill Kearney