Re: [syndication] html parsing as a horror story
Hey Bill, a couple of points:
1. Your stats page had nothing to do with it. Like many developers with a
big product and a small team, we have a queue of bugs and features. That it
took a few weeks to get it out says it was a *high* priority, Bill.
2. I did the work, not Jake.
Dave
----- Original Message -----
From: "Bill Kearney" <wkearney99@hotmail.com>
To: <syndication@yahoogroups.com>
Sent: Thursday, July 18, 2002 6:11 PM
Subject: Re: [syndication] html parsing as a horror story
> > Here is an html parsing horror story...
> > Radio and RSS 0.92
> > AHHH!!!!
>
> You mean aaiiiieeeee!!! /runs screaming into the night.../
>
> In Userland's defense, they do try to resolve the stuff eventually. I
> hounded them for weeks to get on the encoding bandwagon. It wasn't until
> the statistics page at Syndic8 revealed the extent to which encoding
> errors were unfairly causing feeds to fail that they acted. They (Jake)
> were good about making attempts to handle the bugs. Not as quickly as
> some might like, of course. There is still an outstanding bug in how
> Radio creates and parses ampersands. They end up double-encoding all
> over the place by not using regexp lookups or entity tables. Even UTF-8
> stuff gets bastardized from &#999; wrongly into &amp;#999;. There's
> hope, sort of, in that the xml.entityEncode function within Radio is the
> bottleneck. Getting them to fix that script would solve the output side
> of their encoding mistakes. Doing likewise on the decoding would
> presumably take care of the rest.
>
> -Bill Kearney
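
For readers unfamiliar with the double-encoding problem described above, here is a minimal sketch in Python of the "regexp lookup" approach. It is not Radio's xml.entityEncode, whose source is not shown in this thread; the function name entity_encode and the pattern BARE_AMPERSAND are hypothetical names chosen for illustration. The idea is to escape an ampersand only when it does not already begin an entity or character reference.

```python
import re

# Hypothetical pattern: matches "&" only when it does NOT already start
# a named entity (&amp;), a decimal reference (&#999;), or a hex
# reference (&#x3E7;).
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def entity_encode(text):
    """Escape markup characters without re-encoding existing entities.

    Illustrative only; this is an assumption about one way to avoid
    double-encoding, not the behavior of any real Radio script.
    """
    text = BARE_AMPERSAND.sub("&amp;", text)
    return text.replace("<", "&lt;").replace(">", "&gt;")


if __name__ == "__main__":
    sample = "AT&T &#999; &amp; <b>"
    once = entity_encode(sample)
    twice = entity_encode(once)
    print(once)
    print(once == twice)
```

Because already-encoded references are skipped, running the function twice gives the same result as running it once. The trade-off is that text which is meant to show a literal "&amp;" to the reader will be left alone as well, so this heuristic only suits input that may already be partly encoded.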