Tuesday, December 06, 2005

 

Why Microsoft Office Open XML is bad

Macros and Components

A document can also contain Visual Basic for Applications (VBA) code, toolbar modifications, OLE custom controls (OCX) and other "active" components. All of these items can be represented in WordprocessingML. In this section, you'll be introduced to how WordprocessingML stores VBA code and OCX controls. You'll also see how Word ensures that software can detect whether these components are present in the document so that the component can, for instance, be scanned for viruses. Word also ensures that if components are not made visible in WordprocessingML, they will not be executed.

For VBA code, a base64-encoded version of the binary file generated by the VBA editor is held in the binData element inside the docSuppData element. The binData element has a name attribute whose value must be set to "editdata.mso". The docSuppData element is a top-level element under the wordDocument root element, and follows the styles element in a document created by Word.

A typical VBA module in a WordprocessingML document looks like this:

<w:docSuppData>
<w:binData w:name="editdata.mso">
QWN0aXZlTWltZQAAAfAEAAAA/////wAAB/AbDwAABA
...more base64-encoded data...
LgBNAFkATQBPAEQAVQBMAEUAAABAAAAL8AQAAAASNFZ4
</w:binData>
</w:docSuppData>
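To make the quoted description concrete, here is a minimal Python sketch of how a tool might pull the VBA blob out of such a document. It is an illustration only, not how Word or any scanner actually does it. The namespace URI is an assumption (the one used by Word 2003 WordprocessingML), the function name `extract_vba_blob` and the tiny `SAMPLE` document are made up, and the sample payload is just the first ten bytes of the data shown above.

```python
import base64
import xml.etree.ElementTree as ET

# Assumption: Word 2003 WordprocessingML namespace; change it if your files differ.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def extract_vba_blob(xml_text):
    """Return the decoded editdata.mso bytes, or None if no VBA part is present."""
    root = ET.fromstring(xml_text)
    for supp in root.iter(f"{{{W_NS}}}docSuppData"):
        for bin_data in supp.iter(f"{{{W_NS}}}binData"):
            if bin_data.get(f"{{{W_NS}}}name") == "editdata.mso":
                # The base64 text is usually wrapped over many lines.
                payload = "".join((bin_data.text or "").split())
                return base64.b64decode(payload)
    return None


# Hypothetical miniature document, for illustration only.
SAMPLE = f"""<w:wordDocument xmlns:w="{W_NS}">
  <w:docSuppData>
    <w:binData w:name="editdata.mso">
QWN0aXZl
TWltZQ==
    </w:binData>
  </w:docSuppData>
</w:wordDocument>"""

blob = extract_vba_blob(SAMPLE)
print(blob)
```

What comes out is still an opaque binary blob: the XML wrapper tells you that VBA is present, but nothing about what the code does.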

Representing an OCX control in WordprocessingML is more complicated than storing VBA code because an OCX control also has a graphical representation in the document. For OCX controls, a binData element within a docOleData element is used to hold the OLE data. For OCX controls, the name attribute of the binData element must be set to "oledata.mso".

<w:docOleData>
<w:binData w:name="oledata.mso">
0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAA
...more base64-encoded data...
C4zcL+WTKDhJozVltEGRkTOwQAROjpejLDyT5d+/F5BeLt5n3wv4P/Cl4BK=
</w:binData>
</w:docOleData>
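The claim that software "can detect whether these components are present" can be sketched roughly as follows. This is a hedged example, not a real virus-scanner hook: the namespace URI is assumed (Word 2003 WordprocessingML), `find_active_parts` and `SAMPLE` are hypothetical, and the only check performed is whether the decoded data starts with the well-known OLE compound file signature (D0 CF 11 E0 A1 B1 1A E1), which is what the "0M8R4KGxGuE" prefix above decodes to.

```python
import base64
import xml.etree.ElementTree as ET

# Assumption: Word 2003 WordprocessingML namespace.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
# Signature found at the start of OLE compound files.
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def find_active_parts(xml_text):
    """List (container, name, looks_like_ole) for each embedded binData part."""
    root = ET.fromstring(xml_text)
    found = []
    for container in ("docSuppData", "docOleData"):
        for holder in root.iter(f"{{{W_NS}}}{container}"):
            for bin_data in holder.iter(f"{{{W_NS}}}binData"):
                name = bin_data.get(f"{{{W_NS}}}name")
                payload = "".join((bin_data.text or "").split())
                data = base64.b64decode(payload)
                found.append((container, name, data.startswith(OLE_MAGIC)))
    return found


# Hypothetical miniature document with heavily shortened payloads.
SAMPLE = f"""<w:wordDocument xmlns:w="{W_NS}">
  <w:docSuppData>
    <w:binData w:name="editdata.mso">QWN0aXZlTWltZQ==</w:binData>
  </w:docSuppData>
  <w:docOleData>
    <w:binData w:name="oledata.mso">0M8R4KGxGuE=</w:binData>
  </w:docOleData>
</w:wordDocument>"""

for part in find_active_parts(SAMPLE):
    print(part)
```

Detecting the parts is the easy bit. Interpreting what is inside them still requires knowledge of the binary formats, which is exactly the concern of this post.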

...

(source: Office "12" XML Schema Reference - PDC 2005 Preview, html format)


Comments:
Regarding the VBA data, what did you expect? That VBA code would be expressed in XML? What you may want is an option in the editor to choose whether the VBA code is stored as plain text or encoded. But at this point, the data is encoded only because it has no XML semantics anyway.

For OLE data, the logic is even simpler. OLE is binary stuff.

Using their packaging convention, I believe you can also opt to store each such component in a separate file and reference it from a "relative parts" file, which is XML.

All in all, if your point is that VBA and OLE have no XML equivalent in Office 2006, you are perfectly right. If you think that there won't be better interoperability across products or platforms when it comes to VBA or OLE, you are perfectly right too. Note that it's not only VBA and OLE.

That said, a lot of developers will simply ignore w:binData tags when they parse the XML and so avoid those troubles. In most use cases, they probably don't want to edit or change VBA or OLE anyway. The whole point is to be able to parse and repurpose the main XML, not necessarily the bits you mention.
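Ignoring the binary parts, as suggested here, might look something like the Python sketch below. It is an assumption-laden illustration: the namespace URI is the Word 2003 WordprocessingML one, the helper names `strip_binary_parts` and `body_text` are made up, and the sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: Word 2003 WordprocessingML namespace.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
BIN_DATA = f"{{{W_NS}}}binData"


def strip_binary_parts(xml_text):
    """Parse the document and drop every w:binData element.

    Returns the parsed root and the number of elements removed.
    """
    root = ET.fromstring(xml_text)
    removed = 0
    # Snapshot the elements first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == BIN_DATA:
                parent.remove(child)
                removed += 1
    return root, removed


def body_text(root):
    """Concatenate the text of all w:t (text run) elements."""
    return "".join(t.text or "" for t in root.iter(f"{{{W_NS}}}t"))


# Hypothetical miniature document, for illustration only.
SAMPLE = f"""<w:wordDocument xmlns:w="{W_NS}">
  <w:docOleData>
    <w:binData w:name="oledata.mso">0M8R4KGxGuE=</w:binData>
  </w:docOleData>
  <w:body>
    <w:p><w:r><w:t>Hello</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""

root, removed = strip_binary_parts(SAMPLE)
print(removed, body_text(root))
```

This works for reading the text, but anything stored only inside the binary parts is silently lost.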
 
An open format is only open when everything it describes is in an open format too. OpenDocument achieves this by building on other standards and simply describing everything. Currently I am very afraid that Microsoft will use this loophole and store important data as OLE objects. That is the bad news, I guess.
 