Re: [syndication] Categories or Channels
Julian,
> The advantage here of using an SMTP or NNTP transport is the store and
> forward. I can collect the new items when I'm ready, but I get all of
> them because the store and forward system in the middle has batched them
> up.
An active push from the originator is always best: there is no need to poll
unnecessarily, and push protocols like SMTP or NNTP provide a lot of features
such as automatic retransmission, buffering, etc. In that respect, those
technologies are generally preferred over something like FTP, which doesn't have
built-in recovery mechanisms.
Polling for a handful of items, formatted as RSS, works great for a small
website or personal news, but for content syndication on a larger scale it
doesn't work all that well.
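One likely reason, sketched below in Python under stated assumptions: an RSS
file typically carries only the newest few items, so a poller that comes by at
a fixed interval never sees whatever scrolled out of the file in between. The
function name, the window size of 15 and the publishing volumes are all made up
for illustration; this is a toy simulation, not a measurement of any real feed.

```python
from collections import deque


def simulate_polling(items_published_per_interval, window_size):
    """Return how many items a poller never sees.

    Assumes the feed file keeps only the newest `window_size` items and
    that the client polls exactly once at the end of each interval.
    """
    feed = deque(maxlen=window_size)  # older items fall off automatically
    seen = set()
    next_id = 0
    for count in items_published_per_interval:
        for _ in range(count):
            feed.append(next_id)
            next_id += 1
        seen.update(feed)  # one poll: the client sees only what is in the file now
    return next_id - len(seen)


# A small site: a few items per interval, nothing is lost.
missed_small = simulate_polling([3, 5, 2], window_size=15)

# A large syndicator: far more items per interval than the file can hold.
missed_large = simulate_polling([40, 60, 25], window_size=15)
```

With a store-and-forward transport in the middle, the batching happens on the
sending side, so this particular kind of loss does not depend on how often the
receiver shows up.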
> I think that what I want to see is similar to the way a newspaper is
> structured; the Business section, UK News, World News, Sport, Editorial
> Features, Technology section, Humour. Each one of these becomes a
> locally synthesized feed from many sources.
>
> The problem with this picture is the question of who and what (how?)
> categorizes a new incoming story and decides which bucket it goes into.
I'd say you're hitting the central issue in content syndication. I've been
reading this mailing list for months now and have only read about the external
content formats used to exchange content. Of course, that's an issue as well, but
in practice it is much less complex than the real problem: trying to get
categorization/classification right.
Categorization based on simple attributes such as the originating feed only goes
so far. If you want a large list of categories/channels (i.e. a fine-grained
category structure), you need more information to do a good classification. In
practice, manual classification can help a great deal if done right
(i.e. supported by tools).
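To make that concrete, here is a minimal Python sketch of one way such a
pipeline could be layered: keyword rules first, the originating feed as a
fallback, and a manual review queue for whatever is still ambiguous. The feed
name, the categories, the keyword lists and the `classify` function are all
hypothetical, and this is not a description of how any particular syndicator
does it.

```python
from dataclasses import dataclass


@dataclass
class Item:
    feed: str   # name of the originating feed
    title: str


# Hypothetical configuration, for illustration only.
FEED_DEFAULTS = {"example-sports-wire": "Sport"}
KEYWORD_RULES = {
    "Business": {"shares", "merger", "profit"},
    "Technology": {"software", "internet", "chip"},
}


def classify(item):
    """Return (category, needs_review) for one incoming item."""
    words = set(item.title.lower().split())
    scores = {cat: len(words & kws) for cat, kws in KEYWORD_RULES.items()}
    best = max(scores, key=scores.get)
    # Accept the keyword result only if exactly one category clearly wins.
    if scores[best] > 0 and list(scores.values()).count(scores[best]) == 1:
        return best, False
    # Otherwise fall back on the simple attribute: where the item came from.
    if item.feed in FEED_DEFAULTS:
        return FEED_DEFAULTS[item.feed], False
    # Still undecided: hand it to a human editor.
    return None, True


tech = classify(Item("general-wire", "Chip maker reports software delay"))
sport = classify(Item("example-sports-wire", "Late goal decides final"))
unclear = classify(Item("general-wire", "Merger talks at internet firm"))
```

The point of the last branch is that the tooling narrows manual work down to
the items the automatic rules cannot settle, rather than to every item from
every source.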
> I'm not sure, but I think Moreover use a combination of automatic and
> human effort. I can't believe they have people scanning every item from
> 1800 sources and deciding which of 700 buckets to drop them in. But at
> the same time, they seem to achieve an accuracy higher than I'd expect
> from a purely automatic system.
Not sure what Moreover is doing behind the scenes, but I imagine it is similar
to what we do ;)
Just my .2 cents,
Bas.
--
syndication architect
YourNews
Crystal Building
Rivium Boulevard 213
2909 LK Capelle a/d IJssel
The Netherlands
T +31 (0)10 266 0 2 66
F +31 (0)10 266 02 60
www.yournews.nl
We've got News!