Re: [syndication] Re: RSS feed filtered by keywords?
> I've made this comment elsewhere (and it's not intended to be
> incendiary, though it probably takes a very liberal reading to hear
> me as jovial and constructive):
I think I know how that feels.
> If we ever reach a point in time where an automated system can ascribe
> "meaning" accurately to portions of an information base then I believe
> it is almost certain that there will be annotations in that information
> base. A useful system for navigating such a corpus will almost
> certainly make use of such "semantic" annotations. They will be part
> of a Smart system.
>
> We are, however, despite the efforts of some of the best minds of the
> past century, nowhere near realizing that system.
Perhaps we have been learning a good amount about how not to do it. If
anything, this process of elimination may eventually yield results.
> In the current vacuum annotation of information is still useful, and we
> can already make use of such annotation to aid with manual
> classification (and automated classification based on manual
> techniques), as well as keyword-based indexing and searching. Google's
> technologies, however, show that the benefits of annotation information
> in the described vacuum are primarily benefits in degree (as in
> efficiency) rather than in kind.
>
> I.e., there are benefits to annotating information with metadata, but do
> not be fooled into thinking that even a complete and systematic
> annotation of a corpus gets us any closer to extracting "meaning" from
> information in an automated manner.
>
> Metadata will be necessary but is nothing like sufficient.
I agree. When I speak of metadata I'm not specifically pinning it down to any
one given format.
> > But without locally applied metadata using stuff like Google is the crude
> > club we'd be stuck using.
>
> I think I agree. The Google box as accepted by the mass market (simple
> search please, thanks) is constrained by what it can infer from what the
> average user will plug in. The Lexis search is powerful but expects an
> expert user. The right software, however, should be able to do more for
> the novice and, ideally, get the results expected by the professional.
> I believe this is possible but it's a number of years down the road.
Unfortunately you're probably correct. It's my estimation that one way to hurry
it along is to expose the ridiculousness of synthetic searching /as a sole
pursuit/ and to insist on augmentation much nearer to the source, just not
necessarily by the author alone.
> To my mind there is a hierarchy of scalability (actually it's probably
> just a partial order) with respect to needle-finding technologies. The
> parts of it I see (in rough order of scalability - note that some
> strategies work better concurrently) look like this:
>
> - personally handling
> - manually "categorizing"
> - automatically categorizing
> - extracting keyword indices
> - using relationships between needles and haystacks
> - using "features" suitable for automatic classification
> - deriving "features" from "context"
Ok, now take these and transform how they're presented back to authoritative
users. As you recognize there's a context to be considered. Making the
application of annotation an exercise 'separate' from the creation of the
content, but done during 'related' activities, may hold some promise. As in,
don't bug them while they're in the middle of getting something "done".
Instead, nag them at 'more appropriate' times for clarification.
You're dead right about avoiding making the user a slave to the data. That will
never get off the ground and many efforts have failed trying.
> Curry recipe?), but requires highly organized pre-processed haystacks to
> work -- and requires smarts on the desktop of the needle-searcher to be
> effective (and to maintain privacy).
Here you hit squarely on two significant points. Better queue processing (or
even the use of queues) and a steadfast avoidance of abusing the user's trust.
> It is quite possible to have the system figure out how to organize the
> haystack *and* figure out which specific needles are relevant to the
> searcher.
Yes. Fascinating stuff, eh?
-Bill Kearney