RE: [syndication] Help Needed to Scrape a Disaster Recovery Wiki site into RSS
I've been in touch with the author of the site and it looks
like a good first step will be to syndicate the Wiki's
"Recent Changes" page. This is a nicely structured page
(http://kgate.virtual.net/cgi-bin/wiki.cgi?action=Browse&id=RecentChanges).
The page appears like this:
September 16, 2001
OunceofPrevention . . . . . .
NewsandMirrors . . . . . .
AirlineInfo . . . . . .
PersonalStories . . . . . . sdsl-64-139-4-22.dsl.sca.megapath.net
TechAssistance . . . . . .
WikiHelp . . . . . .
SeptemberDisaster . . . . . .
September 17, 2001
InteractiveNYCMap . . . . . . ip.tech.arrakis.
Would it be possible to take the 15 most recent items (starting from the
bottom) and create an RSS <item> for each one, like this:
<item>
<title>InteractiveNYCMap</title>
<link>http://.....</link>
<description>September 17, 2001</description>
</item>
Note that the text to the right of the ". . . . . ." is simply
the address of the browser which changed the page, so it need
not be syndicated.
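For whoever picks this up, here is a rough Python sketch of the conversion
described above. It is only an illustration under some assumptions: the page
has already been fetched and reduced to plain text lines like the sample,
date headings look like "September 16, 2001", and the caller supplies a
link_for function that builds each page's URL (the real link format is not
given here, so that function is hypothetical). The names
parse_recent_changes and to_rss_items are made up for this example.

```python
import re
from xml.sax.saxutils import escape

# Assumed shape of a date heading, e.g. "September 16, 2001".
DATE_RE = re.compile(r"^[A-Z][a-z]+ \d{1,2}, \d{4}$")
# Assumed shape of an entry: page name, a run of dots, then an optional address.
ENTRY_RE = re.compile(r"^(\S+)\s+(?:\.\s*){2,}(.*)$")


def parse_recent_changes(text):
    """Return (page_name, date) pairs in page order, dropping the address."""
    entries = []
    current_date = None
    for raw in text.splitlines():
        line = raw.strip()
        if DATE_RE.match(line):
            current_date = line
            continue
        match = ENTRY_RE.match(line)
        if match and current_date is not None:
            # group(2) is the editing browser's address; it is ignored on purpose.
            entries.append((match.group(1), current_date))
    return entries


def to_rss_items(entries, link_for, limit=15):
    """Render up to `limit` entries as <item> blocks, bottom of the page first."""
    newest_first = list(reversed(entries))[:limit]
    items = []
    for name, date in newest_first:
        items.append(
            "<item>\n"
            f"<title>{escape(name)}</title>\n"
            f"<link>{escape(link_for(name))}</link>\n"
            f"<description>{escape(date)}</description>\n"
            "</item>"
        )
    return "\n".join(items)
```

Calling to_rss_items(parse_recent_changes(page_text), link_for) on the sample
page would put InteractiveNYCMap first, since it is the bottom entry. The
escape() call matters for the link in particular, because wiki URLs contain
"&", which must appear as "&amp;" inside XML.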
I know that several members of [syndication] would like to do this,
(James Carlyle, Mike Krus, and Bill Humpries have expressed interest).
I'll leave it up to the three of you to decide who will do this.
Jeff;
-----Original Message-----
From: James Carlyle [mailto:james@calaba.com]
Sent: Monday, September 17, 2001 1:39 AM
To: syndication@yahoogroups.com
Subject: RE: [syndication] Help Needed to Scrape a Disaster Recovery
Wiki site into RSS
Jeff
> Could someone with some scraping expertise and technology
> step in here and help? I've asked them to provide me with
> a list of URLs to be scraped - some parts of the site are
> text-oriented and not fully amenable to RSS. I should have
> this list by sometime early Sunday.
We have some scraping technology that we'd like to use. Please can you
post a list of the URLs to be scraped, and we'll set up the software and
validate the output.
James Carlyle