
Re: One publishing problem keyword filtering might address



> Here's my particular scenario: Let's say I want to filter 100 feeds 
> using keywords. In my case, "keywords" isn't a meta thing -- I 
> actually mean "search terms." 

Search terms and keywords are /very/ different things.  Asking a 
system to find items based on keywords is, on the scale of things, 
trivial.  Doing the same with search terms can be orders of magnitude 
more complex.  

A keyword search for "news about hospital medical services" gets all 
sorts of hits.

Searching for "news about medical services" is a much more 
complex task.
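
To show how little the trivial case involves, here's a rough sketch 
of plain keyword matching over item titles.  The function names and 
the sample titles are made up for illustration; nothing here comes 
from an actual aggregator.

```python
import re

def keyword_hits(text, keywords):
    """Return the keywords that occur as whole words in text, ignoring case."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return {k for k in keywords if k.lower() in words}

def matches_any(text, keywords):
    """True if at least one keyword appears in text."""
    return bool(keyword_hits(text, keywords))

# Hypothetical item titles, standing in for entries pulled from feeds.
items = [
    "Hospital parking fees rise again",
    "Medical services news roundup",
    "Local team wins cup",
]
keywords = ["hospital", "medical", "services", "news"]

selected = [title for title in items if matches_any(title, keywords)]
```

Note that the parking story gets selected simply because it contains 
"hospital".  That's the "all sorts of hits" problem: the matcher has 
no idea what the words are about.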

I'm all for using search terms.

> I could also do this with Nexis -- almost. 

Just not for free.

> I could have Nexis email 
> me search results, but then I'd have to go and get URLs for those 
> articles every hour and update my site accordingly. An impossible 
> thing to do manually unless you do absolutely nothing else.

If you have the data coming in as mail it's pretty trivial to 
transmogrify that into other formats.  You'd probably be restricted 
on the republishing of Nexis data, however.
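
As a rough sketch of that kind of conversion, assuming the alert 
arrives as a plain-text message with one URL per result, something 
like the following would pull out the subject and the links.  The 
sample message is invented; I'm not claiming this is what a Nexis 
alert actually looks like.

```python
import re
from email import message_from_string

# Simple URL pattern; an assumption that links appear as bare http(s) URLs.
URL_RE = re.compile(r"https?://[^\s<>\"]+")

def urls_from_mail(raw):
    """Return (subject, list_of_urls) for a raw RFC 822 style message."""
    msg = message_from_string(raw)
    subject = msg.get("Subject", "")
    if msg.is_multipart():
        # Only look at text/plain parts; attachments and HTML are skipped.
        parts = [p for p in msg.walk() if p.get_content_type() == "text/plain"]
        body = "\n".join(p.get_payload() for p in parts)
    else:
        # Assumes the body is not base64 or quoted-printable encoded.
        body = msg.get_payload()
    return subject, URL_RE.findall(body)

# Hypothetical alert message used only to exercise the function.
SAMPLE = (
    "Subject: Search alert: medical services\n"
    "From: alerts@example.com\n"
    "\n"
    "New result:\n"
    "http://example.com/story/1\n"
    "Another:\n"
    "http://example.com/story/2\n"
)

subject, urls = urls_from_mail(SAMPLE)
```

From there the subject and URLs could be written out as feed items 
or HTML, whatever the site needs.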

> Such a program would not solve every problem faced by every person 
> who has ever asked about filtering RSS feeds using keywords -- but
> I think automated searches of the type I'm describing can work, 
> provided you're familiar enough with the behavior of the
> publications you're filtering and you're smart about use of search
> terms.

That's the tough part.  Gobs of research continue to be done on how 
to make it possible for people to search using methods that aren't 
strange.  My perspective on much of this is that unless the data is 
annotated /to begin with/, simplistic keyword searching is going to 
be a nearly fruitless endeavor.  

I suppose the statement is "if you don't annotate the data then don't 
expect miracles.  If you do annotate the data you will make 
it /possible/ for miracles to happen but there's no guarantee."
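
For what annotation buys you, here's a small sketch that filters on 
the RSS 2.0 <category> element instead of guessing from the title 
text.  The feed content and the function name are invented for the 
example, and it assumes the publisher actually fills in categories.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed; the titles and categories are made up.
FEED = """<rss version="2.0"><channel>
<title>Example</title>
<item><title>Ward reopens after refit</title><category>health</category></item>
<item><title>Hospital Street closed for roadworks</title><category>traffic</category></item>
</channel></rss>"""

def items_in_category(feed_xml, wanted):
    """Return titles of items carrying the wanted category (case-insensitive)."""
    root = ET.fromstring(feed_xml)
    hits = []
    for item in root.iter("item"):
        cats = {c.text.strip().lower() for c in item.findall("category") if c.text}
        if wanted.lower() in cats:
            hits.append(item.findtext("title"))
    return hits

health_titles = items_in_category(FEED, "health")
```

A keyword search for "hospital" over those titles would have picked 
up the roadworks item and missed the ward story.  The category gets 
it right, but only because somebody annotated the items first.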

Anyway, we all seem to want the same thing: better ways to learn 
what's going on via syndicated data.  We only differ on the 
approaches.

-Bill Kearney