Re: [syndication] shared feed lists
- To: syndication@yahoogroups.com
- Subject: Re: [syndication] shared feed lists
- From: Jeremy Zawodny <jeremy@zawodny.com>
- Date: Tue, 14 Oct 2003 12:15:02 -0700
- In-reply-to: <03bd01c39280$80d75250$200ca8c0@wkearney.com>
- References: <20031014175836.GB1351@thermal> <03bd01c39280$80d75250$200ca8c0@wkearney.com>
- Sender: Jeremy Zawodny <jzawodn@thermal>
- User-agent: Mutt/1.5.4i
On Tue, Oct 14, 2003 at 02:25:10PM -0400, Bill Kearney wrote:
> > Based on my experience working with more traditional publishing outlets
> > (think about local newspapers, tv/radio stations, web sites like
> > news.com, cnn.com and news.yahoo.com), their publishing technology is
> > often difficult or expensive to tweak, inflexible, or not sufficiently
> > transparent.
>
> Sure, and don't think I'm just arguing for the sake of arguing, either. In
> those cases do we want syndication to start pushing forth the bad practice of
> "blindly look for URL named X" and have those stick-in-the-mud sites start
> getting irritated with 404 requests for something that's not there?
I don't see any other low-barrier-to-entry way to get those folks in
the game. If I did, I'd have never suggested a well-known location
for the data.
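
To make the 404 concern concrete, here is a minimal sketch of how a
client could probe a well-known location without hammering sites that
do not have the file. The path "/feeds.xml", the one-week back-off,
and the FeedListProber class are all assumptions made up for
illustration; the thread never settles on a name or a policy. The
fetch function is passed in so the sketch runs without network access.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Hypothetical values: no well-known name or retry policy was agreed on.
WELL_KNOWN_PATH = "/feeds.xml"
NEGATIVE_TTL = 7 * 24 * 3600  # seconds to remember a miss

# A fetcher takes a URL and returns (HTTP status, body).
Fetcher = Callable[[str], Tuple[int, bytes]]


class FeedListProber:
    """Probe a site's well-known feed list, remembering misses."""

    def __init__(self, fetch: Fetcher, clock: Callable[[], float] = time.time):
        self._fetch = fetch
        self._clock = clock
        self._misses: Dict[str, float] = {}  # site -> time of last miss

    def probe(self, site: str) -> Optional[bytes]:
        now = self._clock()
        missed_at = self._misses.get(site)
        if missed_at is not None and now - missed_at < NEGATIVE_TTL:
            return None  # recent miss: do not request the URL again yet
        status, body = self._fetch(site.rstrip("/") + WELL_KNOWN_PATH)
        if status == 200:
            self._misses.pop(site, None)
            return body
        self._misses[site] = now  # 404 or any other failure: back off
        return None
```

With a policy like this, a site that never publishes the file sees one
request per back-off period from each well-behaved client rather than
one per polling cycle.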
> This is the prime reason I'm cautious about ideas like this. Let's not
> introduce ways to make the content providers get unduly 'concerned' about
> unexpected traffic. I'm all for beating them up and encouraging them to start
> offering RSS. But I've gotten plenty of negative feedback on the 'sudden
> appearance' of things because of RSS consumption. I know it's silly, you know
> it's silly, but they've got the content and it's always a good idea to work
> within their sensibilities /when possible/.
What a lot of these providers need to learn is that if they begin to
provide RSSish feeds, many of the brute force scrapers (which use much
more bandwidth) will likely go away.
> > I'd rather not close the door on the possibility that for some sites
> > it may be much easier to get on the syndication bandwagon by writing
> > a script that updates a little file every hour (or day, or week) than
> > by figuring out how to introduce metadata into their HTML.
>
> I've never gotten any serious resistance to do it. I've seen more
> 'concern' from developers than anything I've ever gotten from
> content providers. That is to say, the people with the data have
> been willing to do it, it's the developers doing all the
> hand-wringing and waffling.
Well, I spent my first three years at Yahoo working on the Yahoo
Finance news feeds--doing syndication the hard way. It was rare to
work with a provider that could easily and willingly make a change in
their format, whether XML, HTML, or their own bastardized text markup.
The reasons were many. And this was with a contractual agreement in
place that had a non-trivial amount of money attached to it.
It seemed needlessly stupid and frustrating at first, but over time I
had to accept it as fact. There are a lot of organizations that
publish good content but use a real mess of poor tools to do so. It's
not a problem we can solve, but it's one we can't pretend does not exist.
> > I'm not sure how those offering data dynamically are more burdened than
> > those offering it statically. Can you clarify that point?
>
> How does one take an existing dynamic script and, with no web server
> reconfig or cron jobs, keep it in sync as a static URL? As in,
> they can't get to mod_rewrite and the Options on their hosting
> provider don't make it possible.
Without a less abstract example, I'm not sure I follow. Nothing
against you, I'm just better with concrete examples than
generalities. Sorry.
> > > Second it opens the door to using poorly-mannered spiders that go
> > > digging for stuff that's never going to exist.
> >
> > Based on the web logs I've seen, that door has already been open,
> > removed from the hinges, and used as firewood. Many spiders are
> > dumb. There's no getting around that.
>
> Well, let's not give them yet another "don't do it this way"
> fallback to make it worse.
It's a tradeoff. What if the choice was between "don't offer a
fall-back" and "get these 5 big name news outlets on-line next month"?
> > > Besides, this also raises the hassle of the format being used ending
> > > up being far too limited. If we've got a shot at exposing valuable
> > > information it might as well be thought out.
> >
> > What raises that hassle? I don't understand.
>
> A fixed URL tells you nothing about what format its contents will
> be.
Sure.
If you fetch http://jeremy.zawodny.com/blah.xml and I've chosen to put
HTML there instead of XML, so what?
If I decide to put a different type of XML there (SVG instead of
OPML), so what? The tool will figure that out and move on.
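
One way a tool might "figure that out and move on" is to look at the
root element of whatever it fetched. This is only a sketch using
Python's standard xml.etree.ElementTree; the function name sniff_root
and the "not-xml" label are invented for this example, and a real tool
would probably also look at the Content-Type header.

```python
import xml.etree.ElementTree as ET


def sniff_root(body: bytes) -> str:
    """Return the lowercased root element name, or "not-xml"."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Sloppy HTML, an empty file, or anything else not well-formed.
        return "not-xml"
    # ElementTree writes namespaced tags as "{uri}local"; keep the local part.
    return root.tag.rsplit("}", 1)[-1].lower()


def looks_like_opml(body: bytes) -> bool:
    return sniff_root(body) == "opml"
```

A document that parses but has the wrong root (SVG, say) and one that
does not parse at all both end up in the same place: the tool skips it
and carries on.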
> As in, if someone wants to offer something a lot less dain-bramaged
> than OPML a fixed URL would leave them screwed.
Assuming that we decided on OPML, yes.
> > My original proposal was aimed at answering the question "what feeds
> > does this site provide?" What other valuable information are you
> > interested in having that also does not belong in the feeds
> > themselves? And how much of it do you believe should be required in
> > this [new] format?
>
> Required? Well, for a simple site index you're absolutely right that
> it's a simple set.
Yes, like the non-option stuff on this list:
http://www.intertwingly.net/wiki/fdml/
> For using /one format for larger purposes/ it might be worth
> entertaining the idea of something just a /wee bit/ more robust.
Sure. That's why we discussed having optional elements and extension
via namespaces.
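
As an illustration of that idea, a reader could require only a small
core and carry along anything namespaced without understanding it.
The element and attribute names below (feeds, feed, url, title) and
the extension namespace are hypothetical, made up for this sketch;
they are not taken from the list linked above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one required attribute (url), one optional
# attribute (title), and one extension attribute in its own namespace.
SAMPLE = """<feeds xmlns:x="http://example.org/ext">
  <feed url="http://example.org/a.xml" title="A" x:unread="3"/>
  <feed url="http://example.org/b.xml"/>
  <feed title="no url, so skipped"/>
</feeds>"""


def read_feed_list(text):
    """Return a list of (core, extras) pairs, one per usable feed."""
    feeds = []
    for el in ET.fromstring(text).iter("feed"):
        url = el.get("url")
        if url is None:
            continue  # the one required attribute is missing
        core = {"url": url, "title": el.get("title")}
        # Namespaced attributes show up as "{uri}name"; keep them as-is
        # so a simple reader can ignore them and a richer one can use them.
        extras = {k: v for k, v in el.attrib.items() if k.startswith("{")}
        feeds.append((core, extras))
    return feeds
```

A simple site index only ever needs the core dictionary; tools that
want to sync read/unread state or similar can look in the extras.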
> I've been ruminating on a way to allow different desktop/pda/portal
> interfaces to share /really robust/ sets of feed data. As in, sync
> your work/home/pda/website feeds lists with their read/unread and
> other stats all within the embrace of an extensible format. I'm not
> saying that has to happen /before/ site indexes but how about we
> pick something that'd really leave room for it?
Hmm.
> If we're going to set an example and, as a result, have the masses
> follow blindly along, then why not take a shot at making it a good
> one? After all we might only get one shot at it so let's load for
> bear.
The danger in showing people something more complex is that they
believe it is complex instead of simple. That's one thing I worry
about.
Jeremy
--
Jeremy D. Zawodny | Perl, Web, MySQL, Linux Magazine, Yahoo!
<Jeremy@Zawodny.com> | http://jeremy.zawodny.com/