General purpose scrapers and bandwidth



I've just switched hosting. As a result I actually had a look at who was using RSSify[1] (http://www.voidstar.com/rssify.php) to convert plain websites into RSS. I've discovered that about 95% of my traffic is RSSify and Gnews2rss[2] (http://www.voidstar.com/gnews2rss.php).

So here's the announcement. I've started adding random news items to the feeds from both systems that say:

"RSSify is a fairly horrible hack that really shouldn't be necessary any more. Please consider using a weblog tool that generates RSS natively such as Blogger Pro or Movable Type[3]. Alternatively consider hosting RSSify yourself rather than using my bandwidth."

"If you're using Gnews2rss you presumably find it useful. Please email Google (news-feedback@google.com) asking them to produce RSS directly out of Google News Search. And why not host it yourself to save my bandwidth costs?"

I'm not going to take these down and don't expect to turn them off any time soon. But I do wish the need for them would go away. I'd love it if a few more people hosted them and moved the code forwards. Particularly with Gnews2rss: if Google won't produce RSS natively, then there ought to be thousands of sites running scrapers.

[1] RSSify. Put span tags into your plain jane website, and I'll parse them into RSS.
[2] Gnews2rss. Turn an arbitrary Google News Search into an RSS feed, e.g. give me the 15 most recent news stories that mention "wifi".
[3] Apologies to other vendors that generate RSS. I couldn't put you all in.
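
For anyone thinking about hosting their own scraper, the idea in [1] can be sketched roughly as follows. This is a minimal Python illustration, not the RSSify code itself: the span class name "rss-item", the function name page_to_rss and the choice of RSS 2.0 are all assumptions made for the example, and the real RSSify markup may well differ.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class SpanItemParser(HTMLParser):
    """Collect the text inside every <span class="rss-item">.

    "rss-item" is a hypothetical marker class chosen for this sketch.
    """

    def __init__(self, marker="rss-item"):
        super().__init__()
        self.marker = marker
        self.depth = 0      # > 0 while inside a marked span (counts nested spans)
        self.items = []
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self.depth:
            self.depth += 1
        elif self.marker in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # Collapse runs of whitespace into single spaces.
                self.items.append(" ".join("".join(self._buf).split()))

    def handle_data(self, data):
        if self.depth:
            self._buf.append(data)


def page_to_rss(html, title, link):
    """Turn the marked spans of one HTML page into an RSS 2.0 string."""
    parser = SpanItemParser()
    parser.feed(html)
    parser.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Scraped from " + link
    for text in parser.items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = text[:80]
        ET.SubElement(item, "link").text = link
        ET.SubElement(item, "description").text = text
    return ET.tostring(rss, encoding="unicode")
```

Fetching the page (for example with urllib.request) and caching the result are left out. A self-hosted copy would probably want caching in particular, so that each feed poll does not trigger a fresh fetch of the source page.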
--
Julian Bond Email&MSM: julian.bond@voidstar.com
Webmaster:              http://www.ecademy.com/
Personal WebLog:       http://www.voidstar.com/
M: +44 (0)77 5907 2173   T: +44 (0)192 0412 433