January 27, 2003

Apple Word Replacement Rumor and Information Structure Dreams

Rumor has it Apple is working on an MS Word replacement. This would be a great thing if it read native Word files seamlessly, but even better would be turning out valid HTML/XHTML. MS Word has always made a huge mess of our information when converting it to something it calls "HTML", which is not even passable HTML. One could not get a job using what Microsoft outputs as HTML as a work sample. Heck, it would not even pass the laugh test, and it may get somebody fired.

One of the downsides of MS Office products is that they are created for styling information, not for marking up information with structure on which style can hang. MS Word allows people (if they turn on or keep the options turned on) to create information sculptures with structure and formatting of the information. What Word outputs to non-Word formats is an information blob that has lost nearly all of its structure and functionality. It does not really keep the format the Word document had to begin with. What Web developers do is put the structure back into the information blob to recreate an information sculpture.

You ask, why is structure important? Structure provides the insight to know what is a header and sub-header. Structure provides the ability to discern bulleted lists and outlines. Structure makes it script-kiddie easy to create a table of contents. Structure makes micro-content accessible and easier to find with search. Structure provides better context. Structure provides the ability to know what is a quote from an external document and point to it easily. Structure makes information more portable and mobile access easier. These are just a few uses of structure.
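To show how little work the table of contents takes once headings are real headings, here is a minimal sketch using only Python's standard `html.parser` module. The names `HeadingCollector` and `table_of_contents` and the sample markup are my own inventions for illustration, not part of any product mentioned here.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []   # finished (level, text) pairs
        self._level = None   # level of the heading being read, if any
        self._parts = []     # text fragments of that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            # Collapse runs of whitespace inside the heading text.
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def table_of_contents(html):
    """Return an indented plain-text outline built from the headings."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return "\n".join(
        "  " * (level - 1) + text for level, text in collector.headings
    )


# Hypothetical sample document with real heading elements.
structured = """
<h1>Annual Report</h1>
<p>Intro text.</p>
<h2>Sales</h2>
<h3>By Region</h3>
<h2>Outlook</h2>
"""

# The same title expressed only as style: big and bold, but not a heading.
styled_only = '<p style="font-size: 24pt"><b>Annual Report</b></p>'

toc = table_of_contents(structured)
print(toc)
```

Feed the same function the `styled_only` string and it returns an empty outline, because there is nothing in the markup that says "this is a heading", only that the text is large and bold.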

Does MS Word have this structure capability? Yes. Do people use it? Not really. If people use it, does MS Word keep the structure? Rarely, as it usually turns the structure into style. This is much like somebody who spent months in the gym building a well-defined physique, only to have the muscles removed and their shirt stuffed with tissue paper to give the look of being in shape. Does the person with the tissue paper muscles have the ability to perform the same as the person who is really in shape? Not even close.

Structure is important not only for the attributes listed above, but also for people who have disabilities and depend on the information being structured to get the same understanding as a person without disabilities. You say MS Word is an accessible application, and you are mostly correct. Does it create accessible information documents? Barely, at best. The best format for information structure lies in HTML/XHTML/XML, not in styles.

One place where structure is greatly valuable today is Internet search. Google is the top search engine on the Internet. Google uses the text in hyperlinks, the information in title tags, and the information in heading tags to improve the findability of a Web page. What are these tagged elements? Structure.
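To make the point concrete, here is a rough sketch of pulling those three kinds of structured text (title, headings, link text) out of a page with Python's standard `html.parser`. This is in no way how Google actually works internally, which I do not know. The names `FIELDS`, `StructuredTextExtractor`, and `extract_structure`, along with the sample page, are made up for illustration.

```python
from html.parser import HTMLParser

# Map each tracked tag to the bucket its text is filed under.
FIELDS = {
    "title": "title", "a": "link_text",
    "h1": "headings", "h2": "headings", "h3": "headings",
    "h4": "headings", "h5": "headings", "h6": "headings",
}


class StructuredTextExtractor(HTMLParser):
    """Gather the text found inside title, heading, and link elements."""

    def __init__(self):
        super().__init__()
        self.found = {"title": [], "headings": [], "link_text": []}
        self._open = []  # stack of (tag, parts) for tracked elements still open

    def handle_starttag(self, tag, attrs):
        if tag in FIELDS:
            self._open.append((tag, []))

    def handle_data(self, data):
        # Text counts toward every tracked element that is currently open,
        # so a link inside a heading contributes to both.
        for _, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        if self._open and self._open[-1][0] == tag:
            _, parts = self._open.pop()
            text = " ".join("".join(parts).split())
            if text:
                self.found[FIELDS[tag]].append(text)


def extract_structure(html):
    """Return a dict with the title, heading, and link text of a page."""
    extractor = StructuredTextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.found


# Hypothetical sample page.
page = """
<html><head><title>Quarterly Results</title></head>
<body>
<h1>Results</h1>
<p>See the <a href="/q3.html">third quarter summary</a>.</p>
<h2>Notes on <a href="/method.html">method</a></h2>
</body></html>
"""

signals = extract_structure(page)
print(signals)
```

A page that marks its title and headings with font sizes alone gives a tool like this nothing to grab onto, which is exactly the problem with the information blob.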

One of the nice things about a valid HTML/XHTML Web document is that I can see it and use it on my cell phone or other mobile devices. You can navigate without buttons and read the page in chunks. Some systems preparse the pages and offer the ability to jump between headings to get to the desired information more quickly.
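I do not know how any particular mobile gateway does its preparsing, but the idea of cutting a page into heading-sized chunks can be sketched in a few lines of Python with the standard `html.parser`. The names `ChunkSplitter` and `split_into_chunks` and the sample page are hypothetical, and this sketch simply ignores any text that appears before the first heading.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class ChunkSplitter(HTMLParser):
    """Start a new chunk at every heading and file following text under it."""

    def __init__(self):
        super().__init__()
        self.chunks = []          # list of [heading_text, body_parts]
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.chunks.append(["", []])
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            self._in_heading = False

    def handle_data(self, data):
        if not self.chunks:
            return  # text before the first heading is skipped in this sketch
        if self._in_heading:
            self.chunks[-1][0] += data
        else:
            self.chunks[-1][1].append(data)


def split_into_chunks(html):
    """Return a list of (heading, body_text) pairs, one per heading."""
    splitter = ChunkSplitter()
    splitter.feed(html)
    splitter.close()
    return [
        (" ".join(heading.split()), " ".join("".join(body).split()))
        for heading, body in splitter.chunks
    ]


# Hypothetical sample page.
page = """
<h1>Guide</h1>
<p>Welcome.</p>
<h2>Setup</h2>
<p>Install it.</p>
<p>Then run it.</p>
"""

chunks = split_into_chunks(page)
for heading, body in chunks:
    print(heading, "->", body)
```

Each pair could then be served as its own small screen, with the list of headings acting as the jump menu.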

These are just a few reasons I am intrigued by the Apple rumor. There is hope for well-structured documents that can output information in a structured form that validates to the W3C standards, which browsers now use to properly render the information on the page. I put very little hope in the stories that MS is working toward an XML storage capability for Office documents, because we have heard this same story with the last few Office releases and all were functional lies.




This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License.