Virtual Machine Manager - 0 views
XML Production Workflows? Start with the Web and XHTML - 0 views
-
Challenges: Some Ugly Truths The challenges of building—and living with—an XML workflow are clear enough. The return on investment is a long-term proposition. Regardless of the benefits XML may provide, the starting reality is that it represents a very different way of doing things than the one we are familiar with. The Word Processing and Desktop Publishing paradigm, based on the promise of onscreen, WYSIWYG layout, is so dominant as to be practically inescapable. It has proven really hard to get from here to there, no matter how attractive XML might be on paper. A considerable amount of organizational effort and labour must be expended up front in order to realize the benefits. This is why XML is often referred to as an “investment”: you sink a bunch of time and money up front, and realize the benefits—greater flexibility, multiple output options, searching and indexing, and general futureproofing—later, over the long haul. It is not a short-term return proposition. And, of course, the returns you are able to realize from your XML investment are commensurate with what you put in up front: fine-grained, semantically rich tagging is going to give you more potential for searchability and recombination than a looser, more general-purpose approach, but it sure costs more. For instance, the Text Encoding Initiative (TEI) is the grand example of pouring enormous amounts of energy into the up-front tagging, with a very open-ended set of possibilities down the line. TEI helpfully defines a level to which most of us do not have to aspire.[5] But understanding this on a theoretical level is only part of the challenge. There are many practical issues that must be addressed. Software and labour are two of the most critical. How do you get the content into XML in the first place? Unfortunately, despite two decades of people doing SGML and XML, this remains an ugly question.
-
Practical Challenges In 2009, there is still no truly likeable—let alone standard—editing and authoring software for XML. For many (myself included), the high-water mark here was Adobe’s FrameMaker, substantially developed by the late 1990s. With no substantial market for it, it is relegated today mostly to the tech writing industry, unavailable for the Mac, and just far enough afield from the kinds of tools we use today that its adoption represents a significant hurdle. And FrameMaker was the best of the breed; most of the other software in decent circulation are programmers’ tools—the sort of things that, as Michael Tamblyn pointed out, encourage editors to drink at their desks. The labour question represents a stumbling block as well. The skill-sets and mind-sets that effective XML editors need have limited overlap with those needed by literary and more traditional production editors. The need to think of documents as machine-readable databases is not something that comes naturally to folks steeped in literary culture. In combination with the sheer time and effort that rich tagging requires, many publishers simply outsource the tagging to India, drawing a division of labour that spans oceans, to put it mildly. Once you have XML content, then what do you do with it? How do you produce books from it? Presumably, you need to be able to produce print output as well as digital formats. But while the latter are new enough to be generally XML-friendly (e-book formats being largely XML based, for instance), there aren’t any straightforward, standard ways of moving XML content into the kind of print production environments we are used to seeing. This isn’t to say that there aren’t ways of getting print—even very high-quality print—output from XML, just that most of them involve replacing your prepress staff with Java programmers.
-
Why does this have to be so hard? It’s not that XML is new, or immature, or untested. Remember that the basics have been around, and in production, since the early 1980s at least. But we have to take account of a substantial and long-running cultural disconnect between traditional editorial and production processes (the ones most of us know intimately) and the ways computing people have approached things. Interestingly, this cultural divide looked rather different in the 1970s, when publishers were looking at how to move to digital typesetting. Back then, printers and software developers could speak the same language. But that was before the ascendancy of the Desktop Publishing paradigm, which computerized the publishing industry while at the same time isolating it culturally. Those of us who learned how to do things the Quark way or the Adobe way had little in common with people who programmed databases or document-management systems. Desktop publishing technology isolated us in a smooth, self-contained universe of toolbars, grid lines, and laser proofs. So, now that the reasons to get with this program, XML, loom large, how can we bridge this long-standing divide?
-
I was looking for an answer to a problem Marbux had presented, and found this interesting article. The issue is the upcoming conversion of the Note Case Pro (NCP) layout engine to the WebKit layout engine, and what to do about the NCP document format.
My initial reaction was to encode the legacy NCP document format in XML, and run an XSLT transformation to a universal pivot format like TEI-XML. From there, the TEI-XML community would provide all the XSLT transformation routines for conversion to ODF, OOXML, XHTML, ePUB and HTML/CSS.
Researching the problems one might encounter with this approach, I found this article. Fascinating stuff. My takeaway is that TEI-XML would not be as effective a "universal pivot point" as XHTML. Or perhaps, if NCP really wants to get aggressive, IDML (InDesign Markup Language).
The important point though is that XHTML is the XML serialization of HTML that browsers render natively, so it is compatible with the WebKit layout engine Miro wants to move NCP to.
The concept of encoding an existing application-specific format in XML has been around since 1998, when XML was first introduced as a W3C standard, a "structured" subset of SGML. (HTML, by contrast, was defined as an application of SGML.) The multiplatform StarOffice productivity suite became "OpenOffice" after Sun purchased the company, Star Division, in 1999 and open sourced the code base in 2000. The OpenOffice developer team came out with an XML encoding of their existing document formats in 2000. The application-specific encoding became an OASIS document format standard proposal in 2002, the format now known as ODF. Microsoft followed OpenOffice with an XML encoding of their application-specific binary document formats, known as OOXML.
Encoding the existing NCP format in XML, specifically targeting XHTML as a "universal pivot point", would put the NCP Outliner in the Web editor category, without breaking backwards compatibility. The trick is in the XSLT conversion process. But I think that is something much easier to handle than trying to
-
As an afterthought, I was thinking that an alternative title to this article might have been "Working with Web as the Center of Everything".
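The pivot idea in the comment above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the `<outline>` and `<node>` element names and the `title` attribute are hypothetical stand-ins for an XML encoding of an outliner document (they are not the real NCP format), and plain Python stands in for the XSLT step the comment mentions.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical XML encoding of an outliner document (NOT the real NCP format).
LEGACY_XML = """
<outline title="Project notes">
  <node title="Goals">
    <node title="Move to WebKit"/>
    <node title="Keep old files readable"/>
  </node>
  <node title="Open questions"/>
</outline>
"""


def nodes_to_list(parent, nodes):
    """Map a sequence of <node> elements to a nested XHTML <ul>."""
    ul = ET.SubElement(parent, f"{{{XHTML_NS}}}ul")
    for node in nodes:
        li = ET.SubElement(ul, f"{{{XHTML_NS}}}li")
        li.text = node.get("title", "")
        children = node.findall("node")
        if children:
            nodes_to_list(li, children)


def outline_to_xhtml(source):
    """Convert the hypothetical outline XML into an XHTML document string."""
    outline = ET.fromstring(source)
    html = ET.Element(f"{{{XHTML_NS}}}html")
    head = ET.SubElement(html, f"{{{XHTML_NS}}}head")
    ET.SubElement(head, f"{{{XHTML_NS}}}title").text = outline.get("title", "")
    body = ET.SubElement(html, f"{{{XHTML_NS}}}body")
    nodes_to_list(body, outline.findall("node"))
    # Serialize XHTML as the default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", XHTML_NS)
    return ET.tostring(html, encoding="unicode")


xhtml = outline_to_xhtml(LEGACY_XML)
print(xhtml)
```

Because the result is ordinary XHTML, any browser engine can lay it out, which is the attraction of XHTML as a pivot. Going back from XHTML to the legacy format would need a matching reverse mapping, which this sketch does not attempt.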
Slashdot | Pro-ODF Legislation Loses In Six States - 0 views
-
If this is the case then it greatly increases the scope of the bill from being a simple switch from MS Office to OpenOffice to a massive effort involving the definition of many new XML schemata, developing, testing and debugging software to handle the new schemata, creation of documentation, deployment of and training for the new software, etc., etc.
-
Another document format is not needed. This was already obvious before blogs took off, but to be promoting one now is unforgivably stupid and irresponsible. Try to explain to an average person why all the typing they just did cannot even be viewed in a Web browser; they will not get it. Saving the user's typing as DOC or ODF is a con. The storage of text, styled text, graphics, photos, even movies (MPEG-4 H.264-AAC) has been solved. Your document format is ready: it is HTML 4.01 Strict, CSS 2.1, and JS 1.5. There is nothing in the 1980s technology of MS Word that cannot be stored this way.
-
Bravo! Here's someone who gets it. XHTML + CSS3 + RDFa + RDF/XML is the winner. ODF is tied to OpenOffice, Sun's Machiavellian monopolist machinations, and bound to a desktop-only implementation range that is so retro 1995. MOOXML of course is bound to the MS Vista Stack, where desktop, server, device and web information are all interoperable if only you speak perfect MOOXML, XAML, Smart Tags, and .NET.
-
Comments on 'On the Office format wars' - 0 views
-
A fatal flaw in your analysis
By Marbux
Posted Saturday 21st April 2007 08:15 GMT
Your analysis contains a fatal flaw, Martin. That is your belief that adequate Microsoft XML <> OpenDocument translators will be available. In fact, all of the translators suck mightily and there is no prospect at all of them being perfected. The major problems are: (i) that Microsoft's XML formats seem deliberately designed to thwart their parsing with XPath, which is essential to XML transformations; (ii) that Microsoft's "XML" file formats include binary blobs, bitmasks, and multiple Windows and Microsoft dependencies, all of which defy XML transformations; and (iii) that OpenDocument assumes a richer page layout engine than Microsoft Word provides, so while DOCX can be completely mapped to ODT it is impossible to fully map in the other direction without declaring an MS Office interoperability subset of OpenDocument and ODF applications implementing a compatibility mode with reduced features. (That is more than somewhat ironic, given Microsoft's spin that it couldn't implement all of its features in OpenDocument. In fact, the exact opposite is true.) In fact, Steve Ballmer is on record as saying that the developers of the Novell-Microsoft-Clever Age plug-ins will not even attempt to achieve full fidelity file translations between the two formats. http://www.eweek.com/article2/0,1895,2050848,00.asp?kc=EWEWEMNL103006EP17A Those translators achieve at best far less conversion fidelity than existing file conversion filters between OpenDocument and Microsoft binary file formats such as the OpenOffice.org conversion filters, which achieve only about 80 per cent fidelity. The file format cognoscenti know this. See e.g., the paper by Gary Edwards and Sam Hiser included in this edition of the European Journal for the Informatics Professional. http://www.upgrade-cepis.org/issues/2006/6/up7-6Hiser.pdf (PDF). (Note that I contributed to that paper.)
And as also detailed in that paper, what works well enough for some of us does not necessarily work well enough for all. Anything less than full fidelity data conversions is absolutely unacceptable in the context of wholly automated business processes and is in fact illegal in various contexts, including government records. So your thesis doesn't fly. In fact, I'd go so far as to bet that you have been suckered by the Microsoft spin doctors. Another indication is your depiction of the file format wars as being waged primarily between IBM and Microsoft, a recent theme of Microsoft's public relations machine. While it is seductive to believe that the controversy is just another chapter in the war between major competitors, the pro-ODF camp is far broader than IBM. For example, nearly 20 governments recently opposed fast track processing of Microsoft's draft standard at ISO. Do you believe they were all carrying water for IBM? Government bodies in more than 50 nations have chosen to adopt ODF. http://opendocumentfellowship.org/government/precedent And dozens of developers now support the OpenDocument standard in their applications. http://opendocumentfellowship.org/applications While IBM has had a noteworthy role in proliferating the OpenDocument formats, this is a movement without a recognizable leader in the industry. When it comes to vendor influence on things relevant to ODF, Sun Microsystems' far outshines IBM's. But in fact, a core group of open standards and free and open source developers and advocates -- inside and outside government -- have played a far larger role. This is a customer-driven phenomenon, not a vendor-driven effort as you portray. So I will respectfully suggest that you reexamine your position on these issues. Reasonable minds can differ, but not on the grounds you advocate.
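Points (i) and (ii) in Marbux's comment can be made concrete with a toy example. The sketch below is an illustration only: the `<document>`, `<para>` and `<blob>` element names are hypothetical and are not taken from any real OOXML or ODF schema. It shows that structure expressed as elements and attributes can be selected with a path expression, while an embedded binary blob is just one opaque string to an XML toolchain.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names are illustrative, not taken from
# any real OOXML or ODF schema.
DOC = """
<document>
  <para style="Heading">Quarterly report</para>
  <para style="Body">Sales rose in the third quarter.</para>
  <blob encoding="base64">AAECAwQF</blob>
</document>
"""

root = ET.fromstring(DOC)

# Structure expressed as elements and attributes is addressable by path.
headings = [p.text for p in root.findall("./para[@style='Heading']")]

# The blob is a single text node as far as a path expression is concerned.
# Its internal layout only becomes visible after decoding it outside the
# XML toolchain, with knowledge of the binary format it contains.
blob = root.find("./blob")
raw_bytes = base64.b64decode(blob.text)

print(headings)
print(list(raw_bytes))
```

An XSLT stylesheet faces the same limit: it can copy or drop the blob as a whole, but it cannot map what is inside it to elements of another format.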
-
Here we go again. A couple of boot-licking lackeys at The Register make some moronic statements about the OpenDocument XML file format, and the portable document cognoscenti come out of the woodwork to set the record straight. I think it's a scam to boost hits.
Once again Marbux hands out a major bitch-slapping to Microsoft shills who have no idea what's coming. What a great job Marbux does, and does with a kind consideration that certainly isn't warranted given the idiocy of the main article. Where does the man's patience come from? I gave up long ago.
~ge~
OpenForum Europe - EU Conclusions from Open Document Exchange Formats Workshop - 0 views
-
There was strong consensus among Member State administrations on the necessity to use ODEF, on "openness" being the basic criteria of ODEF, and on the resulting requirements towards industry players / consequences for public administrations:
There is a general dissatisfaction with the perspective of having competing standards;
One format for one purpose: Administrations should be able to standardize (internally) on a minimal set of formats;
No incomplete implementations, no proprietary extensions;
Products should support all relevant standards and standards used should be supported by multiple products;
Conformance testing and document validation possibilities are needed in order to facilitate mapping / conversion;
Handle the legacy / safeguard accessibility
-
There must be something in the air. The end-user-inspired idea that applications should be able to exchange documents while perfectly preserving the presentation (human-perceived appearance as opposed to machine-interpreted layout rendering) is gaining a rabid momentum.
Yesterday it was the Intel ODF Test Suite results falling into the hands of Microsoft, which is now using the results to argue that OpenOffice doesn't fully support and implement ODF. The Intel ODF Test Suite is notable in that the test is nearly 100% about comparative "presentation": an object-to-object comparison of a KOffice document to an OpenOffice rendering of that document and vice versa.
Today we have the EU IDABC hosting a continent-wide conference discussing the same issue: the "exchange" of ODF documents. They've even gone so far as to coin a new term: ODEF - OpenDocument Exchange Format!
This morning I also received an invite to join a new OASIS discussion list, "The DocStandards Interoperability List". The issue? The conversion and exchange of documents between different standards.
And then there is the cry for help from Sophie Gautier. This is an email that has worked its way up to both the OASIS ODF Adoption TC and OASIS ODF Mainline TC discussion lists. The problem is that Microsoft is presenting the Intel ODF Test Results to EU governments. Sophie needs a response, and finds the truth hard to fathom.
Last week the legendary document processing expert Patrick Durusau jumped into the ODF "Lists" imbroglio with his concern that the public has a different idea about document exchange and interoperability than the ODF TC. A very different idea. The public expects a visual preservation of the document's presentation qual
Microsoft playing three card monte with XML conversion, with Sun as the "outside man" w... - 0 views
-
In a highly informative post to his Open Stack blog Wednesday, Edwards explains how three key features are necessary for organizations to convert to open formats. These are:
Conversion Fidelity - the billions of binaries problem
Round Trip Fidelity - the MSOffice bound business processes, line of business integrated apps, and assistive technology type add-ons
Application Interop - the cross platform, inter application, cross information domain problem
-
Dana Blankenhorn posted this article back in March of 2007. It was right at the time when the OASIS ODF TC and Metadata XML/RDF SC (Sub Committee) were going at it hammer and tongs concerning three very important file format characteristics needed to fulfill a real world interoperability expectation:
Compatibility - file format level interop: backwards compatibility / compatibility with existing file formats, including the legacy of billions of binary Microsoft documents
Interoperability - application level interop: application interoperability, including interop with all Microsoft applications
Microsoft Watch Finally Gets it - It's the Business Applications!- Obla De OBA Da - 0 views
-
To be fair, Microsoft seeks to solve real world problems with respect to helping customers glean more value from their information. But the approach depends on enterprises adopting an end-to-end Microsoft stack—vertically from desktop to server and horizontally across desktop and server products. The development glue is .NET Framework, while the informational glue is OOXML.
-
OOXML is the transport - a portable XML document model where the "document" is the interface into content/data/ and media streaming.
The binding model for OOXML is "Smart Documents", and it is proprietary!
Smart Documents is how data, streaming media, scripting-routing-workflow intelligence and metadata are added to any document object.
Think of the ODF binding model using XForms, XML/RDF and RDFa metadata. One could even use Jabber XMPP as a binding model, which is how we did the Comcast SOA based Sales and Inventory Management System prototype.
Interestingly, Smart Documents is based on pre-written widgets that can simply be dragged, dropped and bound to any document object. The InfoPath application provides a highly visual means for end users to build intelligent self-routing forms. But Visual Studio .NET, which was released with MSOffice 2007 in December of 2006, makes it very easy for application and line of business integration developers to implement very advanced data binding using the Smart Document widgets.
I would also go so far as to say that what separates MSOOXML from Ecma 376 is going to be primarily Smart Documents.
Yes, there are .NET Framework Libraries and Vista Stack dependencies like XAML that will also provide a proprietary "Vista Stack" only barrier to interoperability, but Smart Documents is a killer.
One company that will be particularly hurt by Smart Documents is Google. The reason is that the business value of Google Search is based on using advanced and closely held proprietary algorithms to provide metadata structure for unstructured documents.
This was great for a world awash in unstructured documents. By moving the "XML" structuring of documents down to the author - workgroup - workflow application level though, the world will soon enough be awash in highly structured documents that have end user metadata defining document objects and
-
-
Microsoft seeks to create sales pull along the vertical stack between the desktop and server.
-
The vertical stack is actually desktop - server - device - web based. The idea of a portable XML document is that it must be able to transition across the converged application space of this sweeping stack model.
Note that ODF is intentionally limited to the desktop by its OASIS Charter statement. One of the primary failings of ODF is that it is not able to be fully implemented in this converged space. OOXML on the other hand was created exactly for this purpose!
So ODF is limited to the desktop, and remains tightly bound to OpenOffice feature sets. OOXML differs in that it is tightly bound to the Vista Stack.
So where is an Open Stack model to turn to?
Good question, and one that will haunt us for years to come. Because ODF cannot move into the converged space of desktop to server to device to the web information systems connected through portable document/data transport, it is unfit as a candidate for Universal File Format.
OOXML is unfit as a UFF because it is application, platform and vendor bound.
For those of us who believe in an open and unencumbered universal file format, it's back to the drawing board.
XHTML+ (XHTML + CSS3 + RDF) is looking very good. The challenge is proving that we can build plugins for MSOffice and OpenOffice that can fully implement XHTML+. Can we convert the billions of binary legacy documents and existing MSOffice bound business processes to XHTML+?
I think so. But we can't be sure until da Vinci proves this conclusively.
One thing to keep in mind though. The internal plugins have already shown that it is possible to do multiple file formats. OOXML, ODF, and XML encoded RTF all have been shown to work, and do so with a level of two way conversion fidelity demanded by existing business processes.
So why not try it with XHTML+, or ODEF (the eXtended version of ODF en
-
-
Microsoft's major XML-based format development priority was backward compatibility with its proprietary Office binary file formats.
-
This backwards compatibility with the existing binary file formats isn't the big deal Microsoft makes it out to be. ODF 1.0 includes a "Conformance Clause" (Section 1.5) that was designed and included in the specification exactly so that the billions of binary legacy documents could be converted into ODF XML.
The problem with the ODF Conformance Clause is that the leading ODF application, OpenOffice, does not fully support and implement the Conformance Clause.
The only foreign elements supported by OpenOffice are paragraphs and text spans. Critically important structural document characteristics such as lists, fields, tables, sections and page breaks are not supported!
This leads to a serious drop in conversion fidelity wherever MS binaries are converted to OpenOffice ODF.
Note that OpenOffice ODF is very different from MSOffice ODF, as implemented by internal conversion plugins like da Vinci. KOffice ODF and Google Docs ODF are different ODF implementations again. Because there are so many different ways to implement ODF, and still have "conforming" ODF documents, there is much truth to the statement that ODF has zero interoperability.
It's also true that OOXML has optional implementation areas. With ODF we call these "optional" implementation areas "interoperability break points" because this is exactly where the document exchange presentation fidelity breaks down, leaving the dominant market ODF application as the only means of sustaining interoperability.
With OOXML, the entire Vista Stack - Win32 dependency layer is "optional". No doubt, all MSOffice - Exchange/SharePoint Hub applications will implement the full sweep of proprietary dependencies. This includes the legacy Win32 API dependencies (like VML, EMF, EMF +), and the emerging Vista Stack dependencies that include Smart Documents, XAML, .NET 3.0 Libraries, and DrawingML.
MSOffice 2007 i
-
The future of XML - 0 views
-
Of course, the most important conversion isn't from OpenDoc to OOXML or vice versa: it's a down conversion from either OpenDoc or OOXML to XHTML. The HTML exporters in OpenOffice and Microsoft Office are uniformly atrocious. Look for third-party developers to pick up the slack. Most important, look for individual corporate developers and webmasters to begin publishing custom templates for their sites. This will enable regular folks to write in Microsoft Word as they're accustomed to doing and then upload their musings straight into the local content-management system. Editing and reviewing tools can be built right in. Because machines generate all the markup (the humans see the GUI interface they're used to), well-formedness will be a freebie. The majority of the Web won't be well-formed by the end of 2008, but a larger percentage will be than today.
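Well-formedness, as used in the excerpt above, is something a machine can check directly. The following is a minimal sketch, assuming the exported page is available as a string; the `is_well_formed` helper and the two sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Every element is closed and properly nested.
GOOD = "<html xmlns='http://www.w3.org/1999/xhtml'><body><p>Hello</p></body></html>"

# Typical hand-written HTML: <br> and <p> are never closed.
BAD = "<html><body><p>Hello<br></body></html>"

print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```

A check like this only covers well-formedness, not validity against the XHTML DTD, and a strict XML parser will also reject HTML named entities it does not know about unless they are declared.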
Q&A: Nicholas Carr on the big switch to utility computing - 0 views
-
I think we’re at the early stages of a fundamental shift in the nature of computing, which is going from something that people and businesses had to supply locally, through their own machines and their own installed software, to much more of a utility model where a lot of the computer functions we depend on are supplied from big, central stations, big central utilities over the Internet.
Cutting corners - the realpolitik of ODF standardisation? - The Wayback Machine Roars R... - 0 views
-
From a 2006 Notes2Self post we discover once again that ODF interop problems are not new. Back in early February 2005, top ranking OASIS Executive James Clark made a comment to the OASIS OpenDocument Technical Committee about the lack of interoperability for spreadsheet documents:
".... I really hope I'm missing something, because, frankly, I'm speechless. You cannot be serious. You have virtually zero interoperability for spreadsheet documents. OpenDocument has the potential to be an extraordinarily valuable and important standard. I urge you not to throw away a huge part of that potential by leaving such a gaping hole in your specification...". Claus Agerskov further commented that this provided a means of creating lock-in (my emphasis):
"OpenDocument doesn't specify the formulas used in spreadsheets so every spreadsheet vendor can implement formulas in their own way without being an open standard. This way a vendor can create lock-in to their spreadsheets"