Re: [syndication] XML Character encoding (again)
Bill Kearney <ml_yahoo@ideaspace.net> wrote:
Ian's right on all counts. The only way to do this reliably is to rigorously
examine the content being input and preview it back to the users. Convert it
all to UTF-8.
Umm. Err.
I'm getting seriously pissed off with this. The site where this is
happening typically contains UK English text so we're talking about a
limited number of awkward characters. £, €, and smart quotes and
that's about it.
I'm really tempted to just say "tough". If you don't like the character,
put in a "?". If your parser barfs on my feed, well, don't read it.
Programming hours are too short to start figuring out client browser
capability, UTF-8 conversion from arbitrary encodings and so on.
The point here is that it's RSS containing plain text read by human
beenz. I'm not trying to get 100% perfect transfer of data, I'm trying
to facilitate human communication.
Getting back to trying to solve this.
I'm genuinely puzzled that a CDATA block isn't enough to protect the
text byte stream from aggressive parsers.
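
One possible explanation, sketched under assumptions: a parser decodes the byte stream using the declared encoding before it ever recognises a CDATA section, so CDATA only protects markup characters such as "<" and "&", not bytes that are invalid in that encoding. The feed fragments and the helper name `parses` below are made up for illustration; Python's standard-library parser stands in for whatever aggregators actually use.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragment: the pound sign is the single byte 0xA3
# (Windows-1252), but the XML declaration claims UTF-8.
BAD = b'<?xml version="1.0" encoding="UTF-8"?><title><![CDATA[\xa3 5]]></title>'

# The same text with the pound sign really encoded as UTF-8 (0xC2 0xA3).
GOOD = '<?xml version="1.0" encoding="UTF-8"?><title><![CDATA[\u00a3 5]]></title>'.encode("utf-8")


def parses(raw: bytes) -> bool:
    """Return True if the bytes form well-formed XML, False otherwise."""
    try:
        ET.fromstring(raw)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("declared UTF-8, real UTF-8 bytes:", parses(GOOD))
    print("declared UTF-8, Windows-1252 bytes:", parses(BAD))
```

If this is what is happening, the fix is to make the bytes match the declaration (or the declaration match the bytes), not to add more CDATA.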
And I wonder if I'm confusing everyone by suggesting UTF-8. Perhaps if I
used another encoding, the feed would be more likely to survive given
that the vast majority of users are generating this text with Wintel PCs
running IE.
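
A minimal sketch of the two cheap options, assuming the submitted text really does arrive as Windows-1252 bytes (a guess based on the Wintel/IE audience, not something verified): either transcode to UTF-8 so the feed matches its declaration, or emit pure ASCII with numeric character references, which is valid whatever encoding the feed declares. The function names and the sample bytes are hypothetical.

```python
def cp1252_to_utf8(raw: bytes) -> bytes:
    """Transcode assumed Windows-1252 input to UTF-8.

    Bytes that Windows-1252 leaves undefined become U+FFFD instead of
    raising an error.
    """
    return raw.decode("cp1252", errors="replace").encode("utf-8")


def cp1252_to_ascii_refs(raw: bytes) -> bytes:
    """Transcode to ASCII, writing non-ASCII characters as &#NNNN; references."""
    text = raw.decode("cp1252", errors="replace")
    return text.encode("ascii", errors="xmlcharrefreplace")


if __name__ == "__main__":
    # Pound sign, euro sign, and a pair of smart quotes as Windows-1252 bytes.
    sample = b"\xa3 \x80 \x93hi\x94"
    print(cp1252_to_utf8(sample))
    print(cp1252_to_ascii_refs(sample))
```

The numeric references are only interpreted outside CDATA sections; inside CDATA they would show up literally, and any "&" or "<" in the text would still need escaping separately.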
--
Julian Bond Email&MSM: julian.bond@voidstar.com
Webmaster: http://www.ecademy.com/
Personal WebLog: http://www.voidstar.com/
CV/Resume: http://www.voidstar.com/cv/
M: +44 (0)77 5907 2173 T: +44 (0)192 0412 433