Note: 2010 update
Our perspective is that the Microsoft effort to promote XPS as a language supported directly by a printer RIP is over.
However, XPS has gained some ground as a format because it "plays nicely" with .NET.
New for 2007 - RedTitan are inviting customers to beta-test XPS support in the EscapeE transform and viewer system. Now you can convert a PDF, PCL or composite document to the new XPS format.
"We believe that XPS has the potential to revolutionise both printing and electronic document viewing. It's the most exciting part of the Vista release"
- Pete Henry. RedTitan President.
Contact email@example.com for support.
- Tightly defined XML format
- Container for JPEG, TIFF, PNG, TRUETYPE and text elements
- Zip compressed
- A page description language
- Complete encapsulation, multiple document capability
- Digital signatures
- A print format
- A viewing format
- An interchange format
- An archive format
- Good color support
- API using .NET
Not supported by XPS:
- Bitmapped fonts
- Type 1 (Postscript) fonts
- Vector fonts
Other potential image formats tainted by patent issues, like JPEG2000 and JBIG2, are also dropped from XPS.
XPS packages are stored in ZIP archive format. The document parts are linked in a hierarchical structure using an XML notation.
/_rels/.rels must exist as the logical root and its contents define the location of the "document list". In the example,
FixedSeq.seq lists all the documents that are contained in the package.
This example contains only a single document and the file
FixedDoc.doc contains a list of all the pages included. e.g.
Resources cited by each page are listed in the corresponding
Example.xps (needs XPS viewer)
Example.xps (inspect as ZIP)
Example.idf (needs EscapeE IDF)
Example.idf (inspect as XML)
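The package walk described above (from /_rels/.rels to the document list, then to each document's page list) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy package is built in memory and is not a complete, valid XPS file, the page names /Pages/1.page and /Pages/2.page are made up, and all part references are written as absolute paths to keep the lookup simple.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
XPS_NS = "http://schemas.microsoft.com/xps/2005/06"
FIXED_REP = "http://schemas.microsoft.com/xps/2005/06/fixedrepresentation"


def build_toy_package() -> bytes:
    """Build a tiny illustrative package (no page content, not a valid XPS)."""
    parts = {
        "_rels/.rels": (
            f'<Relationships xmlns="{RELS_NS}">'
            f'<Relationship Id="R0" Type="{FIXED_REP}" Target="/FixedSeq.seq"/>'
            "</Relationships>"
        ),
        "FixedSeq.seq": (
            f'<FixedDocumentSequence xmlns="{XPS_NS}">'
            '<DocumentReference Source="/FixedDoc.doc"/>'
            "</FixedDocumentSequence>"
        ),
        # Hypothetical page names, chosen only for this example.
        "FixedDoc.doc": (
            f'<FixedDocument xmlns="{XPS_NS}">'
            '<PageContent Source="/Pages/1.page"/>'
            '<PageContent Source="/Pages/2.page"/>'
            "</FixedDocument>"
        ),
    }
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, xml in parts.items():
            zf.writestr(name, xml)
    return buf.getvalue()


def read_part(zf: zipfile.ZipFile, target: str) -> ET.Element:
    """Parse one XML part; ZIP entry names carry no leading slash."""
    return ET.fromstring(zf.read(target.lstrip("/")))


def list_pages(package: bytes) -> dict:
    """Map each document in the package to the list of pages it names."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        # Step 1: the logical root points at the document list.
        rels = read_part(zf, "/_rels/.rels")
        seq_target = next(
            r.get("Target")
            for r in rels.iter(f"{{{RELS_NS}}}Relationship")
            if r.get("Type") == FIXED_REP
        )
        # Step 2: the document list names each document.
        seq = read_part(zf, seq_target)
        result = {}
        for doc_ref in seq.iter(f"{{{XPS_NS}}}DocumentReference"):
            doc_target = doc_ref.get("Source")
            # Step 3: each document names its pages.
            doc = read_part(zf, doc_target)
            result[doc_target] = [
                page.get("Source") for page in doc.iter(f"{{{XPS_NS}}}PageContent")
            ]
        return result


print(list_pages(build_toy_package()))
```

A real package may use relative references and would also carry a content-types part and per-page resources, none of which this sketch handles.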
Is XPS the next big thing?
The XML Paper Specification (XPS) has been billed as a strong alternative to PDF. This paper discusses XPS in the context of alternative electronic document and print formats.
Given that XPS is the default way a Vista application will render output and Microsoft have published the format free of royalties, should Adobe be worried?
If the question is "Do we need a new document format?", the answer is yes! Trying to find a long-term electronic archive format, or even working out how to email a document to a friend, is not as simple as it should be. When the CEO's new digital camera overwhelms the sales force's BlackBerries and you have spent ten minutes trying to explain about color depth, you will probably get the response "Shall I turn it into a PDF?"
In fact, the phrase "A Paperless Office" carries the promise of technology not fit for the purpose. Do you remember how long it took to persuade a colleague to upgrade to a modern version of Word before sending any more documents that only an archaeologist could read? PDF is a lot less trouble, isn't it? We all get a free PDF viewer and when a page unexpectedly comes out completely black, Adobe sometimes rushes out a quick fix. The lack of backwards compatibility in Adobe Acrobat, proprietary formats and patent issues are a good incentive to find a new standards-based open format.
Safe with IBM?
The financial industry would quite like electronic documents to remain legible for at least a couple of years so that they can print a clear copy when the time comes to repossess your home. Now, lots of these planners have been sacked for choosing IBM AFP format when it was discovered that buying suitable printers locally wiped out all the savings made by relocating the telephone call centers to Mumby, India. Although IBM AFP is well documented, it carries the large penalty of converting AFPDS to the IPDS printer format. You can't just send an IPDS file to a printer; a local computer has to conduct a two-way conversation with the very picky IBM firmware. The price of IBM INFOPRINT software and the computer to support it add a lot to the cost of ownership.
Ironically, the native format of the original Xerox fast cut-sheet laser printers (legacy formats LCDS, DJDE, Metacode) became a popular archive format with the "it's expensive so it must be good" community. Hampered by bitmapped fonts and very little color, LCDS is almost impossible to re-create without the resources of the printer's disk. This often means you have to find a suitable magnetic tape transport [Probably unfair but you get the idea - Ed] or risk losing the odd font, logo, officer's signature or barcode. Using an image format (like TIFF) does avoid the resources issue and, after all, incoming paper like letters from customers can only be scanned to an image format. However, if you want to add text searching to an image archive you will need some sort of supporting database to provide the indexing and retrieval features.
PCL as a document format?
Choosing a print format to archive, re-print or view is an intrinsically good idea as long as you can encapsulate the document. If you get a driver (or write the Page Description Language directly) to download all the required resources then Hewlett Packard PCL is a good choice. It is available on nearly every desktop printer and there are good tools to view and transform PCL - like RedTitan EscapeE. Sadly, PCL is not a single language so you have to take care in your choice of viewer.
- PCL/HPGL is a plotter language.
- PCL3GUI is a clever compression technology for inkjet printers.
- PCL6 (or PCL/XL) is a whole new language that bears little relation to the original PCL.
In fact, PCL6 looks like an early attempt to convince Microsoft not to do XPS. PCL6 defines primitives that correspond to Windows Graphics Device Interface calls (the GDI drives the PC screen amongst other things) wrapped up in a laborious binary format that pretends to be stack based. However, in Vista, Microsoft has changed the GDI to exploit acceleration features without the programmer having to access the hardware directly. This means the average program gets smoother animation (not just dancing paper-clips but window moves and drag and drop) and is one reason for early Vista adoption. The static version of the Vista GDI is embodied in the XML Paper Specification. Now, HP dare not walk away from HPGL and traditional PCL in case they force the user community into using standard PostScript (HP still sell printers) so PCL/XL contains the rest of PCL as a subset - HP just backed the wrong GDI!
PCL is a handy way for an application to drive a printer directly because it has a very simple concept of context and the command format is succinct. PCLXL is just a cheap way of writing an HP printer driver using the Microsoft SDK, but no application programmer would write the HP PCLXL PDL directly.
The objection to using PostScript, that it takes an age for the printer to RIP, has now largely gone away. Faster processors and cheap memory mean that there is sufficient power in a typical printer to implement a complete programming language. Herein lies the problem: PostScript IS a general purpose programming language. Creating a suitable environment to print the "hello world" example at the correct place on a page in the correct font is very hard work. PostScript printer diagnostics are usually restricted to a meaningless ...
%%[Error: <type>; OffendingCommand: <offending command> ]%%
to avoid accidentally unwinding the stack all over the contents of TRAY 1.
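Terse as these messages are, they are at least regular enough to pull out of a printer log mechanically. The following Python sketch extracts the error name and offending command from messages in the form shown above. The sample log lines are invented for illustration, and the whitespace handling is an assumption, since spacing inside the brackets can vary between devices.

```python
import re

# Matches messages of the form
#   %%[ Error: <type>; OffendingCommand: <offending command> ]%%
# Whitespace around the fields is treated as optional (an assumption).
PS_ERROR = re.compile(
    r"%%\[\s*Error:\s*(?P<error>[^;]+?)\s*;"
    r"\s*OffendingCommand:\s*(?P<command>.+?)\s*\]%%"
)


def parse_ps_errors(log: str) -> list[tuple[str, str]]:
    """Return (error, offending command) pairs found anywhere in the log text."""
    return [(m["error"], m["command"]) for m in PS_ERROR.finditer(log)]


# Invented sample log; the error names are standard PostScript error names.
sample_log = (
    "job 17 started\n"
    "%%[ Error: undefined; OffendingCommand: foo ]%%\n"
    "%%[ Error: typecheck; OffendingCommand: show ]%%\n"
)

for error, command in parse_ps_errors(sample_log):
    print(f"{error} raised by {command}")
```

This only tells you which operator failed, not where in the job it happened, which is exactly the limitation complained about above.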
Why do we need yet another format? [Besides it being a good living - Ed]
Well, we have all been betrayed by the existing formats in some way or another.
Anybody notice GIF being re-adopted over the technically superior PNG now that the LZW patents that dogged CompuServe's GIF format have lapsed?
Adobe was quick to embrace clever new formats like LZW and JBIG without checking the patent issues. Now there are some PDF files that you are not allowed to view (for free).
Language and format issues
If a language is well defined (like PostScript) you can at least implement the language and hope to render a document in much the same way as it was intended. However, since there are an infinite number of ways for PostScript to display text, it is often just as hard to write a program to reliably find some text in a PostScript file (so indexing is still hard). PDF fails to address these sorts of issues, so Adobe and the user community have responded with formats like PDF/X-1a, i.e. a well-known way of writing a PDF that everybody understands. See also ISO standard PDF/A and AIIM standard PDF/E (phew).
In general, if you are creating a new format it is a good idea to give the implementor as little choice as possible. All TIFFs are not created equal! In PCL6 you can even change from "little endian" to "big endian" number notation on the fly. It might have seemed like a good idea at the time but checking every command, in case there is a surprise, consumes many cycles.
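To see why a switchable byte order costs the reader something, consider the toy stream below. It is NOT the real PCL6 encoding: the three tag values are hypothetical, invented purely to show that once the byte order can change mid-stream, the parser has to carry that state and consult it for every number it decodes.

```python
import struct

# Hypothetical tags for a toy stream; these are not PCL XL's actual values.
SET_LITTLE, SET_BIG, UINT16 = 0x00, 0x01, 0x02


def read_numbers(stream: bytes) -> list[int]:
    """Decode every 16-bit value, honouring byte-order switches along the way."""
    order = "<"  # assume the stream starts little-endian
    values = []
    pos = 0
    while pos < len(stream):
        tag = stream[pos]
        pos += 1
        if tag == SET_LITTLE:
            order = "<"
        elif tag == SET_BIG:
            order = ">"
        elif tag == UINT16:
            # The same two bytes mean different numbers depending on `order`.
            (value,) = struct.unpack_from(order + "H", stream, pos)
            values.append(value)
            pos += 2
        else:
            raise ValueError(f"unknown tag {tag:#04x} at offset {pos - 1}")
    return values


# The bytes 01 02 appear twice, before and after a switch to big-endian.
stream = bytes([UINT16, 0x01, 0x02, SET_BIG, UINT16, 0x01, 0x02])
print(read_numbers(stream))  # [513, 258]
```

With a single fixed byte order, the `order` variable and the two switch branches disappear, and the decoder can no longer be surprised.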
Adobe Acrobat has always found using Windows fonts distasteful. Now, part of the excitement is never quite knowing what glyph you are going to get. It is difficult to verify that a TrueType font is valid and sadly, Vista XPS (RC1 Build 5600) still seems to ignore Digital Signatures on fonts.
Obscure and binary formats
Part of the problem with electronic documents is knowing what is in them. Sometimes too much information is published. Checking the Microsoft Word DOC format file revision history and embedded comments can be very revealing. If we are to trust a document, we have to be able to audit the contents. This is especially a problem if you intend to digitally sign a document. Audit is often impossible in a proprietary format. Trouble-shooting a font or print problem is very tedious when faced with arcane compression algorithms, variable format tables or vague and misleading documentation. PDF is and has always been a proprietary format that Adobe can change as it sees fit; documentation has always been late and wrong.
Print integrity
In general, the real problem with using a print driver to do printing is that the driver loses all concept of the document intent and, in spite of being tailored to work on a specific printer, bugs can go unnoticed until stimulated by new data. Typical problems run from how a particular printer manufacturer cares to treat bugs in the PDL, to context errors and just plain old font problems. Business users contort the workflow with "print from archive" to try to spot WYSIWYG issues early or bypass the driver using bespoke variable data merging software.
PostScript/PDF owns this playing field and is nearly the only thing trusted by Graphic Arts users. Windows itself does not know much about wide gamut color profiles and it's a lot easier to get a calibrated screen to work on a Mac.
Does XPS solve these problems?
Grudgingly - At face value it does! Microsoft have obviously studied the issue and come up with a working solution.
- XPS is published in XML format. This means it would be easy to extend. e.g. In the future a video clip could be added to the document view and be safely ignored by the printer. It is clear how a simple program can add to an XPS document. The content can be easily inspected or digitally signed. All that is required to view or print a document is encapsulated in the XML package and much more of the intended structure of the document is preserved in the XML.
- It is an open format that is documented. It is patented but royalty free. Given the chaos surrounding spurious claims to other formats, Microsoft's assertion that it was patented to protect its use "for all" can almost be believed. (I think the Intellectual Property Department just went a little too far before Marketing restrained them)
- XPS uses ZIP compression. WinZip will happily open an XPS file. Compression conforms to RFC1950 and RFC1951.
- XPS has free viewers on XP and Vista - just like Acrobat but not yet as good.
- XPS can be created by any application using the free XPS writer driver.
- XPS is a container for other formats. Unlike PDF, image formats can be included without unnatural conversion. This means you can use real world JPEG, PNG and TIFF images. TrueType fonts (or subsets) are included verbatim (mildly obfuscated to protect the copyright holder).
- Microsoft insist XPS is an interchange format, the embodiment of the underlying GDI AND the native PDL for a new range of printers. (It might be early days but I have yet to see any specific printer manufacturer announcements)
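The claim that the content "can be easily inspected" is easy to check for the compression layer. The Python sketch below builds a one-part ZIP in memory, cuts the compressed bytes of that part straight out of the archive, and inflates them with zlib alone. The part name Pages/1.page and its contents are invented for the example. Entries inside a ZIP hold raw DEFLATE data (RFC 1951), so zlib is told via a negative window size not to expect the RFC 1950 wrapper.

```python
import io
import struct
import zipfile
import zlib


def make_zip(name: str, data: bytes) -> bytes:
    """Create a one-entry, deflate-compressed ZIP archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, data)
    return buf.getvalue()


def compressed_bytes(package: bytes, name: str) -> bytes:
    """Slice the still-compressed payload of one entry out of the archive."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        info = zf.getinfo(name)
    # A local file header is 30 fixed bytes; the last four give the lengths
    # of the file name and extra field that precede the data.
    fixed = package[info.header_offset:info.header_offset + 30]
    name_len, extra_len = struct.unpack("<HH", fixed[26:30])
    start = info.header_offset + 30 + name_len + extra_len
    return package[start:start + info.compress_size]


# Hypothetical part name and content, for illustration only.
page = b"<FixedPage>" + b"<Glyphs/>" * 50 + b"</FixedPage>"
package = make_zip("Pages/1.page", page)

payload = compressed_bytes(package, "Pages/1.page")
restored = zlib.decompress(payload, -15)  # -15: raw DEFLATE, no zlib header

print(len(page), "bytes stored as", len(payload), "compressed bytes")
```

In practice you would simply call `ZipFile.read`; the manual slice is here only to show that nothing proprietary sits between the archive and the XML.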
Just about all of the above could also be said of Open Document Format (ODF) from the OASIS OpenDocument Committee but XPS has a couple of advantages that cannot be ignored.
- Microsoft has the power to make printer manufacturers support XPS.
- ODF does not know much about color beyond RGB and doesn't do much about WYSIWYG printing. XPS does CMYK, RGB, N-Channel, Gray, and ICC. XPS is a print format. If XPS color works, it wins!
The Microsoft master stroke of submitting the Office Open XML File Formats to the European standards group ECMA leaves the European Union confused on competition issues and ODF isolated. XPS is set for "de facto standard" status when early adopters have forced their parents to roll out Vista at the office.
Coming from the Microsoft stables [I wonder what Microsoft did with the other 50 million lines of code - Ed] it is bound to have a few problems, e.g. implementors prefer detailed and useful error messages even if they do frighten the horses. The XPS viewer could do with a debug mode - I would rather have been told that bitmapped fonts are not supported than just have them replaced with a substitute scalable font - it might have been important!
Are Adobe worried?
Sure they are! They have just (12/12/2006) rushed out a "me too" XML based offering called
and the withdrawal of Adobe installation support for Office 2007 Word
"SAVE AS PDF" is no coincidence.
We at RedTitan think XPS is here to stay. If you want to migrate your PCL and PDF or create XPS from scratch, check out our new XPS export format from EscapeE.