PDF as Standard Print Job Format

Introduction

One of the decisions made at the OSDL Printing Summit in Atlanta in 2006, and widely accepted by all participants, was to switch the standard print job transfer format from PostScript to PDF. PDF has many important advantages, in particular:

  • PDF is the common platform-independent web format for printable documents
  • Portable
  • Easy post-processing (N-up, booklets, scaling, ...)
  • Easy Color management support
  • Easy High color depth support (> 8bit/channel)
  • Easy Transparency support
  • Smaller files
  • Linux workflow gets closer to Mac OS X

Most important here is the post-processing. In contrast to PostScript, every PDF file makes it easy to tell which part of the data belongs to which page. One can therefore take the pages apart and do things like printing selected pages, 2, 4, ... pages per sheet, even/odd pages for manual duplex, scaling, ... PostScript files must be strictly DSC-conforming to allow this kind of page management. By using PDF we ensure that page management always works.
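To illustrate what page management depends on in a PostScript file, here is a minimal sketch that looks for the DSC comments marking page boundaries. It is a heuristic only and does not validate full DSC conformance. The function name dsc_page_count is made up for this example.

```shell
#!/bin/sh
# Heuristic sketch (hypothetical helper, not part of CUPS):
# print the number of "%%Page:" comments in a PostScript file, or
# "no-dsc" if the file does not even start with a DSC header line.
dsc_page_count() {
    ps_file=$1
    # A DSC-conforming file starts with a "%!PS-Adobe-" header line.
    if ! head -n 1 "$ps_file" | grep -q '^%!PS-Adobe-'; then
        echo "no-dsc"
        return 1
    fi
    # Each page is introduced by a "%%Page: <label> <ordinal>" comment.
    # Note: grep -c exits non-zero when the count is 0.
    grep -c '^%%Page:' "$ps_file"
}
```

A file reported as "no-dsc", or with a page count of 0, cannot be split into pages reliably, which is exactly the situation the PDF workflow avoids.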

Development

The implementation of PDF as standard print job format is completed.

All important upstream desktop application projects (GTK/GNOME, Qt/KDE, Firefox/Thunderbird, LibreOffice/OpenOffice.org) have switched to emitting print jobs in PDF format. The few applications still sending PostScript are less important programs which do not use standard libraries for print job generation. In such a case, a PDF-enabled CUPS setup converts the incoming PostScript to PDF as a first step.

The OpenPrinting CUPS Filters package provides the filter set for the PDF-based workflow. On CUPS 1.5.x or earlier it switches CUPS from the original PostScript-based workflow to the PDF-based one. From CUPS 1.6.x on, where there is no PostScript-centric workflow any more, it is needed to print with filters and drivers at all. Install it after installing CUPS (on CUPS 1.5.x and earlier it replaces some of CUPS' original filters) and the PDF-based workflow is up and running.

Note that from CUPS 1.6.x on, CUPS no longer ships the filters for non-Mac-OS-X use, so the OpenPrinting CUPS Filters package is required. This will finally make the PDF-based printing workflow standard in all Linux distributions.

If you want to test the PDF printing workflow without changing your system, try a live CD of Ubuntu Intrepid or newer.

Debian has also switched to the PDF printing workflow, as most of its printing-related packages are the same as their Ubuntu counterparts or at least similar.

How to switch a system to use PDF as standard print job format

NOTE: Debian and Ubuntu have already switched to the PDF printing workflow.

Add the new PDF filters to CUPS

CUPS needs the following additional filters to process print jobs in a PDF printing workflow, with page management done by pdftopdf instead of pstops:

  • imagetopdf
  • texttopdf
  • pstopdf
  • pdftopdf
  • pdftoraster
  • pdftoijs
  • pdftoopvp
  • pdftops (enhanced)

All these filters (and also the CUPS filters for non-Mac-OS-X systems which Apple stopped maintaining) are now in the OpenPrinting CUPS Filters package.

To add the filters to CUPS, download the current version of the package, uncompress it, and compile and install it as described in its INSTALL.txt file. You will need Poppler (0.18.0 or newer), libjpeg, libpng, libtiff, and libijs to build it. If you use the libraries of your Linux distribution, also install the appropriate "-dev" or "-devel" packages with the C header files. The installation will replace some files of CUPS, but this will not remove any features; printing will simply be PDF-based afterwards.
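Before building, it can help to check that the installed Poppler meets the 0.18.0 minimum. The following is a small sketch of a version comparison. The helper name version_at_least is made up for this example, and it assumes a sort command with the -V (version sort) option, as in GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is at least version $2.
# Assumes "sort -V" (version sort) is available.
version_at_least() {
    have=$1
    need=$2
    # The lower of the two versions sorts first; the requirement is met
    # exactly when that lower version is the required one.
    lowest=$(printf '%s\n%s\n' "$have" "$need" | sort -V | head -n 1)
    [ "$lowest" = "$need" ]
}
```

If your distribution's Poppler development package ships a pkg-config file named "poppler" (an assumption about your system), the check could be run as: version_at_least "$(pkg-config --modversion poppler)" 0.18.0 && echo "Poppler is new enough"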

The filters in OpenPrinting CUPS Filters are Poppler-based, due to Poppler's more flexible API, and require Poppler 0.18.0 or newer.

pdftoraster is available in two flavors: a Poppler-based one in OpenPrinting CUPS Filters and a Ghostscript-based one shipped as part of Ghostscript from version 8.64 on (in newer Ghostscript versions pstoraster and pdftoraster are unified into gstoraster). Currently the Ghostscript-based version is the recommended one, as Ghostscript is more optimized for printing and probably also has a more complete implementation of color management. This filter gets installed together with Ghostscript.

Please report bugs in the OpenPrinting CUPS Filters package on our bug tracker, assign them to the product "OpenPrinting" and the component "cups-filters".

The development can be followed on our BZR repository; see the contributor instructions for how to contribute.

Modify the cost factors of already existing file conversion rules in CUPS

Every file format conversion rule in CUPS (in the /etc/cups/*.convs and /usr/share/cups/mime/*.convs files, and also the *cupsFilter lines in the PPDs) has not only an input format, an output format and a filter name, but also a numerical cost factor. CUPS sums up the cost factors of each filter chain it computes, and if more than one filter chain is possible for a job, it takes the "cheapest" one.
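The cost summation can be sketched as follows. This is only an illustration of the idea, not how CUPS itself is implemented. The function name chain_cost is made up, and it assumes rules in the layout "source-type destination-type cost filter", taking the first rule that names a given filter.

```shell
#!/bin/sh
# Hypothetical helper: sum the cost factors of a filter chain.
# $1: file with conversion rules, one per line:
#     <source type> <destination type> <cost> <filter>
# remaining arguments: the filter names making up the chain
chain_cost() {
    rules=$1
    shift
    total=0
    for filter in "$@"; do
        # Take the cost of the first non-comment rule using this filter.
        cost=$(awk -v f="$filter" \
            '$1 !~ /^#/ && $4 == f { print $3; exit }' "$rules")
        if [ -z "$cost" ]; then
            echo "unknown filter: $filter" >&2
            return 1
        fi
        total=$((total + cost))
    done
    echo "$total"
}
```

For example, with a rules file in which pstopdf costs 100 and pdftopdf costs 66 (illustrative values, not necessarily those on your system), chain_cost rules.convs pstopdf pdftopdf prints 166. Comparing that sum for each candidate chain shows which one counts as the cheapest.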

To make sure that the PDF-based way is always preferred, we raise the cost factor of the pstops filter from 66 to 100 in /etc/cups/mime.convs (or /usr/share/cups/mime/mime.convs):

sed -i -r -e '/\spstops$/ { s/66/100/ }' /usr/share/cups/mime/mime.convs

All other relevant conversion rules are in the conversion rule files coming with the new CUPS filters and their cost factors are already low enough.

If you want to use the PDF workflow in general but make an exception when the input is PostScript and the printer is a native PostScript printer, so that a PS -> PDF -> PS conversion is avoided, set the cost factor of pstops to 65. This back-and-forth conversion sometimes produces PostScript files which are too big for the printer's memory, so that they do not get printed. Ubuntu and Debian use this exception.
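To see what a cost change would do before editing the real file, the substitution can be previewed on standard output. The following sketch uses a made-up function name, preview_pstops_cost. It uses the portable [[:space:]] class instead of \s, and like the command above it assumes the current pstops cost is 66.

```shell
#!/bin/sh
# Hypothetical helper: print the rules file with the pstops cost factor
# changed from 66 to the given value, without modifying the file.
# $1: conversion rules file (e.g. a copy of mime.convs)
# $2: new cost factor for pstops (100 or 65 in the text above)
preview_pstops_cost() {
    sed -e "/[[:space:]]pstops\$/ s/66/$2/" "$1"
}
```

For example, preview_pstops_cost /usr/share/cups/mime/mime.convs 65 | grep pstops shows only the affected rule. The file itself is left untouched.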

Update Foomatic to version 4.0.x

foomatic-rip 3.x does not understand PDF as input format. PDF input support was added as a principal feature of foomatic-rip 4.0. foomatic-rip 4.0 feeds PDF directly into the Ghostscript process which, together with the driver, renders the input into the printer's format, at least if Ghostscript is called directly at the beginning of the rendering command line and no option requires PostScript commands to be inserted into the data stream for execution. Otherwise, foomatic-rip 4.0 first converts the input into PostScript.

The new foomatic-db-engine has extensions for the PDF workflow in its PPD generator. In particular, PPDs are generated with the lines

*cupsFilter:    "application/vnd.cups-postscript 100 foomatic-rip"
*cupsFilter:    "application/vnd.cups-pdf 0 foomatic-rip"

instead of

*cupsFilter:    "application/vnd.cups-postscript 0 foomatic-rip"

now, so that foomatic-rip accepts PDF as input format with this PPD. The PPD generator also accepts the new "<prototype_pdf>...</prototype_pdf>" tag in the "<execution>" section of Foomatic driver XML files, which allows specifying a separate command line prototype for PDF input. From this tag the PPD generator creates the new "*FoomaticRIPCommandLinePDF" keyword in the PPD files.

The new foomatic-db contains several optimizations for the PDF workflow. In particular, many options which worked by inserting PostScript code into the data stream were converted to Ghostscript command line options, so Ghostscript can take PDF as input with the same drivers.

Download the foomatic-filters, foomatic-db-engine, foomatic-db and (optionally) foomatic-db-nonfree packages from our download area, as released tarballs or daily snapshots, or from our BZR repositories. See our Foomatic page for more information.

Patch ready-made PPDs using foomatic-rip to accept PDF as input format

A typical system has many PPD files which are not generated by Foomatic but also use foomatic-rip, so they (and their corresponding drivers) can be used with PDF input data. These PPDs come with driver packages or are manufacturer-supplied.

To convert them all to accept PDF, run the following commands:

cd /usr/share/ppd  # (or cd /usr/share/cups/model)
# Uncompressed PPDs: patch in place
find . -name '*.ppd' -print0 | xargs -0 perl -p -i -e \
's,^\*cupsFilter:\s*\"application/vnd.cups-postscript\s+0\s+foomatic-rip\",*cupsFilter: "application/vnd.cups-postscript 100 foomatic-rip"\n*cupsFilter: "application/vnd.cups-pdf 0 foomatic-rip",'
# Compressed PPDs: decompress, patch, recompress, then replace the original
for f in $(find . -name '*.ppd.gz'); do gzip -cd "$f" | perl -p -e \
's,^\*cupsFilter:\s*\"application/vnd.cups-postscript\s+0\s+foomatic-rip\",*cupsFilter: "application/vnd.cups-postscript 100 foomatic-rip"\n*cupsFilter: "application/vnd.cups-pdf 0 foomatic-rip",' | \
gzip -9 > "$f.tmp" && mv -f "$f.tmp" "$f"; done

All PPD files on OpenPrinting are already appropriately changed.
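To check whether a given PPD already accepts PDF, one can look for the corresponding *cupsFilter line. The following is a minimal sketch with a made-up function name, ppd_accepts_pdf. It only tests for the presence of the line and does not validate the PPD otherwise.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the PPD (plain or gzip-compressed)
# contains a *cupsFilter line that routes application/vnd.cups-pdf
# to foomatic-rip.
ppd_accepts_pdf() {
    case $1 in
        *.gz) gzip -cd "$1" ;;
        *)    cat "$1" ;;
    esac | grep -q \
        '^\*cupsFilter:.*"application/vnd\.cups-pdf[[:space:]].*foomatic-rip"'
}
```

Run in a loop over the PPD directory, for example for f in *.ppd *.ppd.gz; do ppd_accepts_pdf "$f" || echo "$f"; done, this lists the files that do not contain the PDF filter line.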

Make desktop applications emit PDF when printing

From Ubuntu Oneiric on, all standard applications emit PDF when printing.

For KDE and Qt applications you only need a reasonably recent Qt (4.x) and CUPS 1.2.x or newer. Qt then emits print jobs in PDF format, so for these applications the PDF printing workflow is already upstream reality.

GTK and GNOME have also switched over to PDF output. If you still have an old version, this patch to GTK can at least switch some GTK/GNOME applications to emit print jobs in PDF.

OpenOffice.org/LibreOffice, Firefox and Thunderbird have also switched over. Update if you still have old versions.

If not all applications are sending PDF to CUPS, you will still get printouts and you will already get the advantages of doing page management on a PDF data stream. CUPS simply converts incoming PostScript using the pstopdf filter before doing any further step.
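To find out which format an application actually emits, one can inspect the first bytes of a captured job file. The sketch below is a rough heuristic with a made-up function name, job_format. CUPS itself identifies formats through its own MIME type rules, not through this code.

```shell
#!/bin/sh
# Hypothetical helper: guess whether a captured print job is PDF or
# PostScript from its first bytes. Heuristic only.
job_format() {
    magic=$(head -c 5 "$1")
    case $magic in
        %PDF-) echo "pdf" ;;         # PDF files start with "%PDF-"
        %!*)   echo "postscript" ;;  # PostScript files start with "%!"
        *)     echo "unknown" ;;
    esac
}
```

A result of "postscript" means that the application has not switched yet and CUPS will run the job through pstopdf first.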

Related bugs and feature requests

The reports listed here are, on the one hand, feature requests to several free software projects for implementing the PDF printing workflow and, on the other hand, bug reports about problems concerning the PDF printing workflow. Note that many issues marked as fixed are only fixed in the development branches of the project's source code and only become available with the next release.

They are all done now. The PDF-based printing workflow is implemented.

  • CUPS STR #3930: Moving of CUPS filters not used by Mac OS X to OpenPrinting (FIXED)
  • CUPS STR #2897: Request for adding PDF filters to CUPS. This has not been done; instead, OpenPrinting now hosts the PDF filters together with the other CUPS filters.
  • Ghostscript bug #690032: bjc600/bjc800 drivers do not work with PDF input data (FIXED)
  • Ghostscript bug #690101: CUPS Raster ("cups") output device does not work with PDF input (FIXED)
  • Ghostscript bug #688036: ICC V4 support request for Ghostscript (FIXED)
  • Poppler bug #17499: Color Management feature request for Poppler (FIXED)
  • Ubuntu bug #258421: Patch to make GNOME/GTK applications send print jobs in PDF (FIXED)
  • GNOME bug #560177: Feature request to switch the print job output of GTK Print to PDF (FIXED)
  • GNOME bug #585442: Feature request to switch Evince's print job output to PDF (FIXED)
  • Mozilla bug #462872: Feature request to switch the print job output of Firefox/Thunderbird to PDF (FIXED)
  • OpenOffice.org issue #94173: Feature request to switch the print job output of OpenOffice.org to PDF (FIXED)
  • OpenPrinting bugs: Most of these bug reports and feature requests are about the new Foomatic 4.0

History

Most of the filters for the PDF-based printing workflow were originally hosted in the Subversion repositories of the OpenPrinting Japan SourceForge site http://sourceforge.jp/projects/opfc and made available as an add-on package for CUPS. This add-on package is now replaced by OpenPrinting CUPS Filters which is the official upstream development place for all these filters.

The foomatic-rip filter (not in OpenPrinting CUPS Filters) was made PDF-ready by Lars Uebernickel during an internship at OpenPrinting, mentored by Till Kamppeter and funded by the Linux Foundation.

All these filters got integrated in Ubuntu from Intrepid on. This way the PDF workflow got implemented on the printing system/server side.

The imagetopdf, pdftopdf, pdftoraster (Poppler-based), and pdftoopvp filters are written by Koji Otani (BBR Inc., Japan, sho at bbr dot jp) and hosted at the OPFC project at SourceForge Japan.

The texttopdf and pdftoijs filters are written by Tobias Hoffmann (th55 at gmx dot de) as a Google Summer of Code project, mentored by Hin-Tak Leung (hintak_leung at yahoo dot co dot uk). They are also hosted at the OPFC project at SourceForge Japan.

The pstopdf filter was originally written by Robert Sander (robert dot sander at epigenomics dot com) and improved for the PDF printing workflow by Till Kamppeter and Johan Kiviniemi (debian at johan dot kiviniemi dot name).

The former cpdftocps filter converted CUPS PDF (PDF which was passed through the pdftopdf filter) to CUPS PostScript (PostScript which was passed through pstops, expected as input by native PostScript PPDs without *cupsFilter lines or legacy PPDs). It was a script which called the Poppler filter pdftops to convert to PostScript and then pstops to insert the PostScript code of the PPD options. As the page management options of CUPS were already applied by pdftopdf, these options were filtered out of the command line when pstops was called from within this filter. This filter can be considered a "PostScript printer driver", as it generated the PostScript needed by the PostScript printers. The cpdftocps filter was written by Johan Kiviniemi (debian at johan dot kiviniemi dot name) and improved by Till Kamppeter. In the OpenPrinting CUPS Filters package this functionality is merged into the pdftops CUPS filter, which uses Poppler or Ghostscript depending on how it got compiled. The merge was also done by Till Kamppeter.

The Ghostscript-based pdftoraster filter was written by Till Kamppeter, who also took over maintenance of Ghostscript's CUPS Raster output device and, in particular, fixed the bugs which came up when feeding Ghostscript with PDF as input format. Note that, in contrast to pstoraster, this filter is written in C. This is needed to make use of the CUPS libraries to read out the PostScript code defined in the PPD file. With PostScript input, this code is supposed to get inserted into the input data stream so that Ghostscript generates the correct CUPS Raster data according to the option settings supplied by the user. This is not possible with PDF input data. The filter therefore converts the PostScript code into equivalent command line options for the Ghostscript call and then calls Ghostscript. pstoraster only needs to call Ghostscript, as pstops has already embedded the PostScript code in the PostScript input stream by then. Therefore pstoraster can be a simple shell script. As pdftoraster does not modify the input data stream, it works just as well with a PostScript input data stream.

Richard Hughes merged Ghostscript's pstoraster and pdftoraster filters into one gstoraster filter written in C and added colord-based color management to it.
