VA Linux Printing Summit Notes
The first VA Printing Summit was held on July 27-28, 2000. I attended, courtesy of VA itself, and took these day-by-day notes of the conference:
I'm here in Sunnyvale, California at the Open Source Printing Summit being hosted by VA Linux Systems. The summit itself is an offshoot of the HP/VA "fix printing" partnership. A group of developers at VA is doing some work to make printing better, but any solution will obviously be beyond what they can do, so they've invited basically everyone involved with free software printing, plus an assortment of printer vendors, to gather here in Sunnyvale for two days of gabbing, arguing, head knocking, etc.
Day 0: A Three Hour Tour
Wed at 11am I drove to the airport in Providence, RI. I got there at 12:30, with just enough time to make a 1pm United flight for San Jose. That plane had loose screws on the pilot's window, necessitating a torque wrench and a mechanic trained in the operation of said wrench.
Three hours later, the mechanic had begun adjusting the second screw, and all options United offered had expired, so I had VA's travel service put me on a Delta flight. We got nearly 100m from the gate before the pilot discovered that the right engine wouldn't start. After a quick call to VA's travel folks and exhausting an entire cellphone battery while in line at the ticket counter, I was just able to buy a burrito and make a Northwest flight via Detroit, which put me in San Francisco at 1am.
Fortunately the rumors of bad California traffic appear to be greatly exaggerated; I saw no other cars and no roads less than four lanes wide. ;)
Day 1: Spoolers, Vendors, and APIHell
The first day was mostly dedicated to structured presentations to everyone by various of the larger projects. These included 20 minute background descriptions:
- L Peter on Ghostscript. A loadable driver project is in the planning stages for 7.0, and work for PDF 1.4 (which finally surpasses Postscript's facilities in a number of useful areas) is underway.
- on CUPS. CUPS is all-singing for many applications, and if my eavesdropping is up to par, this was news to a few of the vendors.
- on LPRng. Patrick appears to be the retired EE professor he is.
- on VA's work. They're now focused on printer configuration stuff? As usual, they'll overlap horribly with a dozen other projects. No matter; all's fair in love and code theft ;)
- Chema and Raph on Gnome. Gnome has big plans and quite a bit of code to show. The gnome canvas widget can or will soon also generate Postscript, PDF, PCL, or a Gnome Metafile. Preview is therefore trivial and efficient. Corba is a good thing.
- formerly of Corel, on sysAPS. They use this library in WordPerfect etc; it is a good basis for spooler-independent future work (and is of course built from my database; I need to talk with Brian about the new export format).
- of KDE. They basically sort of just use Qt's printing support, which is essentially identical to their canvas widget and makes Postscript.
- of IETF on IPP.
- HP, seemingly to provide a representative vendor viewpoint. He was asked - told, really - that we really want useful developer documentation more than we want all the higher-level initiatives they're doing now. HP offered numbers from IDC for Linux client (ie nonserver, deskjet space) market share: 4.14% new installations in 1999, and 5.54% new installations in 2004. Installed base in those years is 2.4% and 5.74%. Obviously the 2004 year numbers are completely made up, and undoubtedly wrong. Several folks pointed out various things indicating strong growth in the desktop area. (I, for one, don't care about numbers; as Robert later pointed out in the vendor session, one is a viable market size to sustain a free software development project).
Lunch was out on the patio; Jacob of PDQ and I ate with two Lexmark folks; one Lloyd from the evil disposable jetprinter division, and one Don from the delightful Optra division. We discussed various things, but the point came out that the bulk of the intellectual property they wish to keep secret is in fact above and beyond the minimum necessary information to produce good output with free software; we need only to know how to put dots in the proper places on the page, while they had rather thought we wanted a whole flock of dithering, halftoning, color, and other algorithms/tips/support to go with it. We pointed out then and later in a breakout session that free software has in fact got a perfectly good selection of algorithms and implementations for all of these things; it is mainly the inability to control the printhead that vendors can help us with. Other information is merely icing on the cake.
Back in session, the last few presentations finished up, and a flagrantly disorganized discussion of what to do started up. This came to a head with an HP fellow suggesting a model and Ben rather wanting to lock out the non-programmers and get something done already. Eventually it was decided to have two child sessions: one for API/model/etc stuff, and one for vendor relations. I attended the vendor relations one.
This discussion was attended by some free software developers and most or all vendors at the conference.
The discussion consisted of equal parts:
- Explaining the way free software development works to the vendors. Many are totally unfamiliar with it; the idea of programming a printer merely for the fun of solving a problem, for peer acclaim, or whatever does understandably seem strange to someone who does mainstream printer work as a traditional day job.
- Explaining what we need and discussing the ways vendors can provide this. The lines around all that top-secret dithering magic were drawn, and a straightforward plea for documentation about the squirting of ink drops was made. Various arrangements like "partial NDA" techniques were discussed, and the many disadvantages of binary-only vendor drivers were enumerated.
Epson won special congratulations for providing a sufficient level of documentation to allow the straightforward implementation of decent drivers for the Stylus printers; as a look at my color inkjet summary page will show, Epsons are currently the best thing going for Linux inkjet shoppers.
All the vendor-bashing completed, we moved to VA's building up the road for a parking lot BBQ featuring an irritatingly loud band that made conversation difficult. I talked variously with Jacob Langford (PDQ), Mike Sweet (CUPS), Nancy Chen (Okidata), Patrick Powell (LPRng), Mark VanderWiele (IBM), and Ray Hsu (Epson). Of those in the API session (as opposed to the vendor session) the consensus seemed to be that it had rapidly devolved into a nonproductive heated discussion between various factions. Perhaps the morning startup "summary" presentation of the API discussion will reveal more useful progress...
After the BBQ, Patrick, Jacob, and I retreated to the hotel's gazebo and gabbed for a while. Much discussion of BIOSes, bootstrap code, RTC chips, university politics, youthful indiscretions, and motorcycles ensued.
Day 2: Drivers, Fonts, and APIs
Day two began with breakfast and mingling. I ended up in a discussion with L Peter (Ghostscript) about licensing; it became clear to us that he had not fully realized the chilling effect that gs's dual license has had on outside contribution to Ghostscript. Then Robert (stp/gimp-print) wandered over and provided the perfect case in point.
Things then settled down into a set of presentations about the previous day's two discussion groups (vendor relations and API). Robert presented the results of our vendor relations group. Besides what I remembered above, a few other points were highlighted and discussed:
- IP in protocol
- Some of the inkjets embed some color or other IP into the printer and protocol. This makes a simple "how to lay down dots" protocol rather difficult for some vendors to offer.
- Two-way communication info is important, too.
- Unified driver API?
- The vendors would love to see a single API to write drivers to. L Peter volunteered that gs's driver interface is stable and reasonably battle-tested, but there are other choices (see omni, below, or gimp-print's eventually-to-be-modular structure).
Mark VanderWiele then presented his project, which frankly took most of us by surprise. IBM has over the years written printer drivers for essentially all printers to support OS/2. They are porting this project to Linux and releasing it as free software: probably GPL or perhaps LGPL. The IBM version supports some 600 printers pretty well; the free software version will of course be rather more limited, but does provide a large stable of drivers, printer data, and a rather interesting modular raster driver API structure which seemed quite appealing to the printer vendors, who could plug in dither code, and some Gnome desktop folks, who could skip Ghostscript entirely.
The IBM system is remarkably similar to the structure of my own foomatic database-driven scheme; so much so, in fact, that it may be appropriate for one to absorb the other, or at least for us to collaborate on data exchange. They especially have good data on the more subtle printer characteristics like page sizes, printable regions, color adjustment data, trays, etc. The system provides for completely data-driven "universal" drivers defined via a subclassing mechanism in terms of one another, and for the use of dynamically loadable object code to implement printer or vendor-specific algorithms not already available in the standard codebase.
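The "defined via a subclassing mechanism in terms of one another" idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record layout, the printer names, and the `resolve` helper are hypothetical and do not come from IBM's code or data files. The point is only that a specific printer needs to state just what differs from its parent.

```python
# Hypothetical printer records; none of these names or values come from
# IBM's actual data files.
PRINTERS = {
    "generic-raster": {
        "parent": None,
        "resolutions": [300],
        "page_sizes": ["Letter", "A4"],
        "dither": "builtin-ordered",
    },
    "example-inkjet": {
        "parent": "generic-raster",
        "resolutions": [360, 720],
        "trays": ["sheet-feeder"],
    },
    "example-inkjet-photo": {
        "parent": "example-inkjet",
        "dither": "vendor-plugin",  # stands in for dynamically loaded vendor code
    },
}


def resolve(name, table=PRINTERS):
    """Merge a printer's record with its ancestors' records, child values winning."""
    chain = []
    seen = set()
    while name is not None:
        if name in seen:
            raise ValueError(f"inheritance cycle at {name!r}")
        seen.add(name)
        record = table[name]
        chain.append(record)
        name = record.get("parent")
    merged = {}
    for record in reversed(chain):  # apply the most generic record first
        merged.update(record)
    merged.pop("parent", None)
    return merged


if __name__ == "__main__":
    print(resolve("example-inkjet-photo"))
```

In this sketch the photo model inherits page sizes from the generic record and resolutions and trays from its immediate parent, and overrides only the dither entry.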
The system appeared to be limited to raster device support. The system uses Ghostscript as a rendering engine. The system's Linux incarnation has a few bugs yet, but does run. It will be published on IBM's developerworks site next week.
IBM owns one of each printer for testing. This amounts to over 500 printers for which they even stock supplies; it seems to me that access to this might be a handy thing for free driver developers, although a printer "compile farm" would be a trick and a half to operate - what on earth do you do with the output?
As the grapevine had reported, the API discussion started off with a bang but promptly catapulted itself off a cliff. An extremely basic model of printing was agreed to; IMHO the model is so simple as to be mostly useless, but opinions may vary...
The main problem from the API standpoint is that applications just sort of throw data at lpr and pray. App writers would prefer to be able to do such things as these in a spooler/platform/etc independent way:
- Easily enumerate printers/queues/whatever.
- Easily know and offer to the user printer/driver/etc options.
- Easily know various interesting printer characteristics, like paper sizes, color gamut/space/etc, memory, font capabilities, etc.
- Generate print data (no, not just use sprintf with ad-hoc Postscript, thank you). Both Gnome and KDE have a good story here already: in each the canvas widget's API can be used to construct Postscript.
- Print the data, already.
- Get status back about it.
- Etc, etc, etc.
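To make the wish list above concrete, here is a minimal sketch of what a spooler-independent interface could look like. It is purely hypothetical: `PrintSystem`, `PrinterInfo`, and the in-memory backend are invented for illustration and are not the sysAPS API or anything agreed at the summit.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from itertools import count


@dataclass
class PrinterInfo:
    """Characteristics an application might want to show the user."""
    name: str
    paper_sizes: list
    color: bool
    options: dict = field(default_factory=dict)  # option name -> allowed values


class PrintSystem(ABC):
    """Hypothetical spooler-neutral interface; a real backend would wrap a spooler."""

    @abstractmethod
    def list_printers(self):
        """Return the names of all available queues."""

    @abstractmethod
    def describe(self, printer):
        """Return a PrinterInfo for one queue."""

    @abstractmethod
    def submit(self, printer, data, options=None):
        """Queue print data and return a job id."""

    @abstractmethod
    def status(self, job_id):
        """Return a short status string for a job."""


class InMemoryPrintSystem(PrintSystem):
    """Toy backend that only records jobs, so the interface can be exercised."""

    def __init__(self, printers):
        self._printers = {p.name: p for p in printers}
        self._jobs = {}
        self._ids = count(1)

    def list_printers(self):
        return sorted(self._printers)

    def describe(self, printer):
        return self._printers[printer]

    def submit(self, printer, data, options=None):
        info = self.describe(printer)
        # Reject option values the queue does not advertise.
        for key, value in (options or {}).items():
            if value not in info.options.get(key, ()):
                raise ValueError(f"{printer}: unsupported {key}={value!r}")
        job_id = next(self._ids)
        self._jobs[job_id] = {"printer": printer, "bytes": len(data), "state": "queued"}
        return job_id

    def status(self, job_id):
        return self._jobs[job_id]["state"]
```

An application written against `PrintSystem` would enumerate queues, present `describe()` results in its print dialog, and call `submit()` and `status()` without knowing which spooler sits underneath.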
At this point I asked the obvious: Corel's sysAPS exists now and does much of this stuff, or at least provides a somewhat incomplete quasi-modular implementation of a well defined API which meets many of these needs, so why was it not being considered? The ever-subtle Ben promptly asserted that the sysAPS code sucks.
Much debate ensued over this point throughout the rest of the day. Investigation into the use or adaptation of sysAPS will almost assuredly proceed; something is bound to be useful from it, if we don't settle upon a descendant of the thing.
L Peter then gave a presentation of the issues surrounding Ghostscript drivers. There are four biggies:
- Dynamic Loading
- This is partly a technical issue, since gs is almost absurdly portable. Settling upon a good dynamic loading scheme for so many platforms is a trick. Keith Packard (XFree86) volunteered the MetroLink-contributed OS-independent dynamic loader used in X11 as a choice; there was some discussion of the details, from which it emerged as a fairly suitable choice. There will be discussion etc of loadable drivers on SourceForge soon; volunteer your opinions.
- Licensing
- L Peter has gotten the message that the AFPL is a large deterrent to significant third-party participation in the project. He explained his three-codebase release system (AFPL, Commercial, and GPL with a one rev lag). He explained why; in the past Ghostscript was GPL, and a number of fax vendors made a quick buck off his hard work with only token adherence to the GPL. L Peter also feels, quite sensibly, that developers deserve the bulk of the money for software, while middlemen like distributions, marketing, etc, deserve rather less. He finds that the current free software business models do very poorly at this, and is not convinced that the common support-subsidises-development model is really viable.
Anywho, providing for loadable drivers may amplify the AFPL issue, and L Peter stated that he is willing to make some sort of change to the AFPL to improve the situation.
- Bidirectional communication
- Ghostscript is currently somewhat weak in offering support for bidirectional devices. In theory, driver writers can probably mostly do it, or do it with only a few twiddles to things.
- Error handling
- Ghostscript's error handling is somewhat awful from a Unix filter point of view; the errors get mixed in with the data stream, and the error messages tend to be almost ludicrously cryptic. Mike Sweet (CUPS) volunteered to send in a few easy changes to make them go to stderr, which at least fixes part of the problem.
Mixed in with the licensing discussions were some free software project theory discussions. L Peter postulated that most successful free software projects appear to have someone with a strong "gatekeeper" role, acting as benevolent dictator or the like. In Ghostscript's early years, he had felt obligated to perform that role, and felt that his license structure was necessary to his ability to do that. He did so for over a decade, and his close adherence to portability and the Adobe specs is the legacy of his dedication. Unfortunately, very few outside developers have done significant work on Ghostscript; the gs developer community is not really viable. Now, he wishes to retire, and is faced with the problem of quickly constructing a viable non-L Peter Ghostscript development community. Various discussion about how to do this ensued. Netscape/Mozilla was offered as a failed model; X11 going under the X Consortium was offered as a failed model, and no easy fix was decided.
Lunch was out on the patio again. I ate with Mark Fasheh, an intern at VA, Jose (Chema) Celorio of Gnome, Ben Woodard of VA, an unknown female apparently associated with Ben, Waldo Bastian of KDE, and Robert Krawitz of gimp-print. Mark led in by assuring me that my database is indeed a useful thing, which was nice to hear. We then discussed what sort of spooler interfaces were useful; Chema would be quite happy to toss Ghostscript and spool raw printer bits, while the rest of us pointed out that this had disadvantages: there are an awful lot of those bits in a color photo printjob, and the bits are totally opaque to the spooler, which is the sensible place for an admin to establish policy. Establishing pagecount policies or any sort of policy-based transform or observation on printjob data is essentially impossible in the render-on-the-desktop scheme. Thus there is perhaps a place for both, and certainly no need to eliminate backend filtering.
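The pagecount point can be shown with a small sketch. It assumes jobs arrive as PostScript that follows Adobe's Document Structuring Conventions, where each page is introduced by a `%%Page:` comment; the function names and the quota check are hypothetical. A spooler can count such comments cheaply, while a job of pre-rendered printer bits gives it nothing to look at.

```python
def count_pages(job: bytes):
    """Return the page count of a DSC-commented PostScript job, or None if opaque."""
    if not job.startswith(b"%!PS"):
        return None  # raw printer bits: nothing a generic spooler can inspect
    # Each page in a DSC-conforming file starts with a "%%Page: label number" line.
    return sum(1 for line in job.splitlines() if line.startswith(b"%%Page:"))


def within_limit(job: bytes, max_pages: int) -> bool:
    """Hypothetical admin policy: accept only jobs known to fit the page limit."""
    pages = count_pages(job)
    return pages is not None and pages <= max_pages


if __name__ == "__main__":
    postscript_job = (
        b"%!PS-Adobe-3.0\n%%Pages: 2\n"
        b"%%Page: 1 1\nshowpage\n"
        b"%%Page: 2 2\nshowpage\n%%EOF\n"
    )
    raster_job = b"\x1b@\x00\x01\x02\x03"  # stand-in for device-specific data
    print(count_pages(postscript_job), count_pages(raster_job))
```

A real filter would have to cope with jobs that omit or misuse the comments, so this is only the easy case.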
We also discussed the rather unbaked UPDF specification (my take: PPDs written in XML) and some shortcomings of PPDs (basically no arbitrary user values of any sort).
After lunch, Raph leapt in with a PDF 1.4 vs SVG comparison. PDF is basically a swell thing, with a number of new features beyond any other sensible alternative. There is, however, some alarming patent verbiage in the Adobe spec for PDF 1.4 which is rather offputting to free software development. SVG (an XML-based vector graphics format from the W3C intended for the web) may therefore be the most appropriate alternative, although it is missing a disturbing number of features necessary for decent print work, and is rather late in the specification phase, so the needed changes may not be possible for the impending version.
Someone should have an action item to pester Adobe about those patents, but I don't think anything so sensible happened. Presumably Raph will work out the details as part of his Ghostscript PDF 1.4 implementation project.
Owen Taylor of Red Hat gave a presentation on the operation of the Pango library. Pango is an I18N, or internationalization, library. It basically allows apps to draw text in the appropriate way using appropriate fonts. Pango provides APIs and code for reordering (ie, left vs right, up vs down, although not for the more exotic schemes like the allegedly spiral (!) forms of Korean) and for shaping (ie, ligatures in English, but more for the complex position-dependent glyph shapes and inter-relationships of other languages like the completely cursive Arabic or Hindi).
Fonts, of course, popped up at various times in the discussion; OpenType appears to be a fine font format for use in the future, but it seems no one actually uses these now. Synchronization between printer fonts and display fonts is a fine trick. A whole font presentation came later...
Hideki Hiura of Sun presented the work of the Li18nux project. The project aims to provide a standard and/or implementation components for good internationalization support on Linux. He drew a fine table which illustrates the solution space for internationalization:
|Singular Context||Multi Context|
|Empty Container Model|| POSIX
|Unicode C. S. normalization|| Java
In this space, a singular context is a scheme which supports one language at a time. Ie, you can draw French or Arabic. A multicontext system supports multiple languages all at once, ie in one string. Current versions of such systems tend to use Unicode character encodings; Unicode is clearly the way to do this. I didn't quite catch the full semantics of the Y axis, but in the Empty Container Model, the application apparently handles many of the details of drawing internationalized text. In the normalization schemes, the system libraries handle much more of the text placement and layout.
The mention of Unicode brought on a summary and discussion of how it works; in short, it's hardly ASCII. Unicode uses a variable-length encoding: western characters tend to be fewer bytes in some encodings, and various bizarrely complex multi-char sequences produce such things as overstrikes, annotations, and other things which seem like they don't belong in a character set. [ Dan Kegel mailed me after the summit to point out that it's not really stateful, and that one can be "Level 1" compliant with Unicode by simply using only precomposed characters. He's afraid that people will be scared away from Unicode; this seems unlikely, even if it is complex, since Unicode is really the only game in town. ]
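A short sketch using Python's standard `unicodedata` module illustrates both halves of that paragraph: byte lengths that vary with the encoding, and the difference between a combining sequence and a precomposed character. The helper names are invented for this example.

```python
import unicodedata


def utf8_lengths(text):
    """Number of UTF-8 bytes used by each character of text."""
    return [len(ch.encode("utf-8")) for ch in text]


def utf16_lengths(text):
    """Number of UTF-16 bytes used by each character of text."""
    return [len(ch.encode("utf-16-le")) for ch in text]


def precompose(text):
    """Fold combining sequences into precomposed characters where they exist (NFC)."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    sample = "A\u00e9\u20ac"  # LATIN CAPITAL A, e with acute, euro sign
    print(utf8_lengths(sample))   # one, two, and three bytes respectively
    print(utf16_lengths(sample))  # two bytes each

    combining = "e\u0301"  # plain "e" followed by COMBINING ACUTE ACCENT
    print(len(combining), len(precompose(combining)))
    print(unicodedata.name(precompose(combining)))
```

The combining form is two code points and the precomposed form is one, which is the kind of restriction the precomposed-only approach relies on.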
Hideki's project is part of the "Linux Foundation", a body or company of some sort formed from the ashes of the Linux Standards Base and another group. Their goal is to provide for portability across various Linux distributions.
An initial i18n spec for Linux is due out at the impending Linux World. The project webpage is at www.li18nux.net.
Keith Packard of the XFree86 project offered a presentation on fonts. The problems are many, but essentially, you need all the same fonts, or a good substitution, available in all of:
Keith instantly solved one facet of the problem by asserting that the X server will in the future no longer handle fonts itself - ie, end of problem. Instead, the application will handle fonts, and send selected glyphs up to the server. The overhead is evidently minor for western languages, and it's a wash for Asian languages. Since this eliminated the age-old problem of X fonts being unavailable on particular servers, no one had any particular objection. As always in X, client library APIs will be extended and compatibility code written, so the world should really be a better place for this change.
As for the other facets of the problem, portions are nearly unsolvable; there is no way to get fonts out of a printer for use on a display, for example. There is no chance of just eliminating printer fonts; they can provide for a twelve times speedup over printing raw bitmaps all the time (a stat from IBM, who would certainly know). There is a seemingly straightforward problem of placing fonts where X, applications, and Ghostscript or other renderers can get to them. And there was some inconclusive discussion of font naming, font formats, and font character-to-glyph mapping (which, derangedly, is totally unstandardized in most common formats!).
Mark Fasheh and Ben of VA gave a short demo of GPR (by Thomas Hubbell). GPR is built atop PPD parsing and Postscript mangling code from CUPS, and provides a cute GUI for fiddling Postscript printer options and selecting the standard transforms from CUPS (n-up, gamma, etc), which are conveniently implemented in the same place.
I felt obligated to point out that this is but one of several equivalent projects; there are XPP, KUPS, and apparently others. Later in the hallway, Waldo agreed -- there are cute little front-ends all over the place -- what we need is more app-to-spooler-to-printer interface (ie plumbing) work.
In this vein, I asked Ben for more details on the notification scheme he mentioned working on. It turns out to be a fairly suitable multicast-based announcement protocol. Interested clients obtain the proper multicast group/port from the spooler, and then listen to job status updates from job monitoring code on the other end of the chain; either in the backend filters, or running as a daemon poking at snmp data on the printer to find out job status. Besides the "obtain the proper multicast group from the spooler" step, the design seems mostly spooler independent, so may be a good thing to fold into other systems.
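For readers unfamiliar with multicast announcements, here is a minimal sketch of the general shape of such a scheme using Python's standard `socket` module. It is not Ben's protocol: the group address, port, and JSON message layout are all assumptions made for illustration, and in the design described above the client would obtain the group and port from the spooler instead of hard-coding them.

```python
import json
import socket
import struct

GROUP = "239.255.42.99"  # hypothetical administratively scoped multicast group
PORT = 9631              # hypothetical port


def encode_status(job_id, state, printer):
    """Serialize one job status update (the JSON layout is an assumption)."""
    return json.dumps({"job": job_id, "state": state, "printer": printer}).encode("utf-8")


def decode_status(datagram):
    """Parse a datagram produced by encode_status."""
    return json.loads(datagram.decode("utf-8"))


def membership_request(group):
    """Build the ip_mreq structure: group address plus 'any' local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))


def open_listener(group=GROUP, port=PORT):
    """Client side: join the group and return a socket ready for recvfrom()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group))
    return sock


def announce(status, group=GROUP, port=PORT, ttl=1):
    """Monitor side: send one status datagram to every listener in the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    try:
        sock.sendto(status, (group, port))
    finally:
        sock.close()
```

A backend filter or SNMP-polling daemon would call `announce(encode_status(...))` as a job progresses, and any number of clients could call `open_listener()` and decode what arrives without the sender knowing who is listening.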
Assorted small wrapup discussions took place at 5ish.
Raph observed that many of the troubles of the standardized driver API concept could be avoided by simply limiting the API to the actual problem domain: the raster inkjets. No nifty vector or font operations need be represented in the API. This somewhat strengthens the case for IBM's OMNI system, although of course much poking and prodding of that code will be needed to see if we use it as is or just steal.
Nick was volunteered as summit mailing list nagbot. He will nag people to do things.
A next meeting was not formally scheduled, although the idea of a vendorless gathering at the October (?) Atlanta Linux Showcase was proposed. The "opposite" idea of a Linux presence at the upcoming IETF IPP bake-off was shot down; vendors will not have Linux-concerned people there, so it would not be useful.
After the official end, we gabbed in clusters in the hallway and eventually I hooked up with Hideki, Waldo, Robert, Raph, Alex, and Owen for a dinner of seafood and discussion. Topics ranged from printing (of course) to the usual selection of geek toys: laptops, monitors, computers, old computers, window managers, etc.
With any luck at all, tomorrow's flight home will be totally uneventful. I had quite enough events getting here :)
Day 3: United Fails Again
After my flight out, I had rather expected, statistically speaking at least, to have a flawless trip back. Not so. The initial flight from SJ to Chicago was slightly delayed when an oxygen mask deployed for no reason. Once on the ground in Chicago, we parked for an hour before a gate opened up; once we got to a gate a jetway driver could not be located. Fortunately (?) my connecting flight was delayed because a pilot could not be located, so after running across O'Hare like a madman I just made it.
On the bright side, before setting out on that adventure, I chatted for a bit with Mark Pruett (the newly entitled "printing discipline leader" or somesuch in VA's services division) about life at VA, printing, and the Alpha Linux-based distributed shared memory system driving Virginia Power's grid monitoring system.