[Reader-list] .Net / Hailstorm Initiative

Fri Jul 6 18:53:36 IST 2001

On Fri, Jul 06, 2001 at 02:41:42PM +0530, Monica Narula wrote:

> This you say should work and be parsed by any other parser. Perhaps. 
> Theoretically.

Yes, but that applies only to the sample above; no guarantees for
anything else :) That sample is solid XML.

> But we have experienced quite the opposite. When a Word document 
> is saved as HTML, it's not just the metadata which is encoded in an 
> unparseable way. 

If it's not pure XML it's naturally not parseable by an XML parser.
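
To make that concrete, here is a minimal sketch of a well-formedness
check using Python's standard xml.etree.ElementTree module. The helper
name is_well_formed and both sample strings are my own illustrations,
not taken from any real Word export; the point is only that an XML
parser rejects markup as soon as a tag is left unclosed or mismatched.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for unclosed tags, mismatched tags, bad entities, etc.
        return False


# Hypothetical samples for illustration only.
clean = "<doc><title>Report</title><p>Hello</p></doc>"
sloppy = "<html><body><p>Hello<br></body></html>"  # <p> and <br> never closed

print(is_well_formed(clean))
print(is_well_formed(sloppy))
```

A browser will usually render the second sample anyway, because HTML
parsers are forgiving; an XML parser is not, which is the difference
being discussed here.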

> The entire document is inherently replete with code 
> that cannot be removed without destroying the style of the document. 

The code you mention describes the style of the document, so naturally
it cannot be removed without losing that style.

> So, removing the top tag does nothing - and I was not able to run the 
> document on any other platform (e.g. Mozilla).

As I said before, an Excel sheet converted to HTML by Excel worked for
me in Netscape too. I tried it again today with another document: it
worked in Internet Explorer but crashed Netscape completely.
I was not surprised. I've written fully W3C-compliant code that made
Netscape crash, so I can't say where the error lies.

> This unfortunately once increased my workload about tenfold, as 
> all the documents that I had to make browser-compatible had to be 
> remade (and re-formatted!) in HTML!

My advice (though too late now :) is to never use any HTML editor and
always type in the code by hand. Dreamweaver, however, seems to produce
pretty clean code.

Menso
-- 
---------------------------------------------------------------------
Anyway, the :// part is an 'emoticon' representing a man with a strip 
of sticky tape across his mouth.   -R. Douglas, alt.sysadmin.recovery
---------------------------------------------------------------------