A PDF Solution for All Programming Platforms

Let's face it: Printing with the Windows WIN32 SDK can be nightmarish at best, and MFC adds only a little bit of help for this problem. Programming guru Jonathan de Halleux described printing as "the most mysterious feature in MFC."

Traditionally, programmers have worked around the printing issue in a couple of clever ways. The first is to generate plain ASCII text or HTML files and let the Web browser handle the printing. The second, for more complex layouts, is to use Crystal Reports or another third-party report add-in. However, many borderline cases don't fit neatly into either category. For example, you may need a more complicated layout than HTML generally allows (such as drawing a line to connect two objects on the page), but without the sort of data repetition that is the forte of a report writer.

For these reasons and many more, programmers are attracted to the Adobe Portable Document Format (PDF) like moths to a porch light. Customers love PDF because the printed output always looks the same, it works well as an e-mail attachment, and so on. However, to begin writing raw PDF files, you need a pretty solid working knowledge of PostScript, and that's not pretty. In reality, you need a complete PDF library, such as PDFlib (from PDFlib GmbH), that handles all your document-creation needs.
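To get a feel for the bookkeeping a raw PDF demands, here is a rough sketch that assembles a one-page file by hand. It is only an illustration and has nothing to do with PDFlib's internals: the function name make_minimal_pdf is made up for this example, the page size and font are arbitrary choices, and the text is assumed to contain no parentheses or backslashes. Note how every object's byte offset has to be tracked so that the cross-reference table at the end is exact.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: assembles a one-page PDF by hand.
// Assumes text has no parentheses or backslashes (they would need escaping).
std::string make_minimal_pdf(const std::string& text)
{
    // Page content stream: select font F1 at 24 pt, move to (72, 700),
    // and show the text. These operators are close cousins of PostScript.
    const std::string content =
        "BT /F1 24 Tf 72 700 Td (" + text + ") Tj ET";

    // Objects 1..5: catalog, page tree, page, content stream, font.
    std::vector<std::string> objects;
    objects.push_back("<< /Type /Catalog /Pages 2 0 R >>");
    objects.push_back("<< /Type /Pages /Kids [3 0 R] /Count 1 >>");
    objects.push_back("<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
                      "/Contents 4 0 R "
                      "/Resources << /Font << /F1 5 0 R >> >> >>");
    objects.push_back("<< /Length " + std::to_string(content.size()) +
                      " >>\nstream\n" + content + "\nendstream");
    objects.push_back("<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>");

    // Write the header and the objects, remembering where each one starts.
    std::string pdf = "%PDF-1.4\n";
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < objects.size(); ++i) {
        offsets.push_back(pdf.size());
        pdf += std::to_string(i + 1) + " 0 obj\n" + objects[i] + "\nendobj\n";
    }

    // Cross-reference table: each entry must be exactly 20 bytes long.
    const std::size_t xref_pos = pdf.size();
    pdf += "xref\n0 " + std::to_string(objects.size() + 1) + "\n";
    pdf += "0000000000 65535 f \n";
    for (std::size_t i = 0; i < offsets.size(); ++i) {
        char entry[32];
        std::snprintf(entry, sizeof entry, "%010lu 00000 n \n",
                      static_cast<unsigned long>(offsets[i]));
        pdf += entry;
    }

    // Trailer: points at the catalog and at the start of the xref table.
    pdf += "trailer\n<< /Size " + std::to_string(objects.size() + 1) +
           " /Root 1 0 R >>\nstartxref\n" + std::to_string(xref_pos) +
           "\n%%EOF\n";
    return pdf;
}
```

If you write the returned string to disk, open the file in binary mode; the offsets assume that each "\n" occupies exactly one byte. Even this toy version ignores fonts beyond the built-in Helvetica, images, compression, and string escaping, which is where a library earns its keep.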

PDFlib: A Solution for All Programming Platforms

The PDFlib main package for Windows works natively with C, C++, Java, and PHP applications. You also can reach other programming platforms such as Perl AS, Python, Tcl, Ruby, REALbasic, and Borland C++ Builder through optional add-on packages. Furthermore, you can choose the Windows COM/.NET Installer and make PDFlib available throughout your .NET regime (for example, C#, VB.NET, ASP.NET, and related language solutions).

PDFlib also has an impressive array of support available for other operating system platforms, including Linux (x86, AMD64, PPC, and eServer), Mac OS 9 and OS X, and every type of Unix you can think of (FreeBSD, OpenBSD, HP-UX, Solaris, and AIX, just to name a few).

Additionally, PDFlib supports all the extras that you probably had given up hope of incorporating in your own PDFs:

  • Layers: As found in CAD documents but also useful for multilingual documents and interactive control of document display
  • Unicode: Complete Unicode support from end to end, including file names, page content, hyperlinks, and form fields
  • Text formatting: Built-in textflow formatter for simple arrangement of ragged or justified text, font changes, multi-line body text, and very large tables
  • Images: Support for all types of TIFF, including JPEG compressed TIFFs, JPEG 2000, and 48-bit color PDF generation
  • Multimedia tags: Allows "read aloud", page reflow, and simpler export of data
  • PDF/X and OPI: For anyone producing documents to be printed in "book quality"
  • Linearized PDF: Optimized for page-at-a-time Web viewing

Getting Started

Perhaps because PDFlib has been ported so widely, integration is about as simple as you can imagine. I started off by downloading a free unlicensed evaluation version of PDFlib for Windows (6.0.3). A bare-bones version called PDFlib Lite also is available; it is free for non-profit personal use. The Lite version can generate basic PDFs but does not edit them or include advanced font techniques (see this comparison chart for a feature breakdown).

After extracting the ZIP file, you simply include the pdflib.h header file in your C/C++ application. Next, you add the supplied pdflib.lib import library to your link step, and then make pdflib.dll available in your application's path at runtime. The DLL itself is just under 2 MB and is completely self-contained (it calls only native Win32 SDK DLLs). If you dislike import libraries or you want to make PDFlib an optional component, you can use the helper macros in pdflibdl.h to dynamically load the library entry points.

Writing Your First PDF

This section demonstrates a simple C++ application that reads a template PDF file ("boilerplate.pdf"), fills in some data, and writes a new PDF file as its output. Specifically, it fills in the empty fields of a conventional business card, turning the blank template into a finished card.

All the logic is enclosed in a try...catch construct to watch for PDFlib::Exception exceptions:

 1 // $Id: businesscard.cpp,v 1.16 2004/05/23 11:22:06 tm Exp $
 2 #include <iostream>
 3 #include "pdflib.hpp"
 4
 5 int main(void)
 6 {
 7    try {
 8       PDFlib p;
 9       int          i, blockcontainer, page;
10       const string infile = "boilerplate.pdf";
11       /* This is where font/image/PDF input files live.
          * Adjust as necessary.
12        *
13        * Note that this directory must also contain the
14        * LuciduxSans font outline and metrics files.
15        */
16       const string searchpath = "../data";
17       struct blockdata {
18          blockdata(string n, string v): name(n), value(v){}
19          string name;
20          string value;
21       };
22
23       blockdata data[] = {
24          blockdata("name",                     "Victor Kraxi"),
25          blockdata("business.title",           "Chief Paper Officer"),
26          blockdata("business.address.line1",   "17, Aviation Road"),
27          blockdata("business.address.city",    "Paperfield"),
28          blockdata("business.telephone.voice", "phone +1 234 567-89"),
29          blockdata("business.telephone.fax",   "fax +1 234 567-98"),
30          blockdata("business.email",           "victor@kraxi.com"),
31          blockdata("business.homepage",        "www.kraxi.com"),
32       };
33
34 #define BLOCKCOUNT (sizeof(data)/sizeof(data[0]))
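The BLOCKCOUNT macro uses the classic sizeof idiom, which silently gives a wrong answer if it is ever applied to a pointer instead of an array. As a sketch of an alternative (not part of the original sample), a small function template can deduce the count from the array type. The name block_count and the shortened sample_data array are made up for this illustration, and the struct is repeated here only so the snippet stands alone.

```cpp
#include <cstddef>
#include <string>

// Same shape as the struct in the sample, repeated so this compiles alone.
struct blockdata {
    blockdata(const std::string& n, const std::string& v)
        : name(n), value(v) {}
    std::string name;
    std::string value;
};

// Hypothetical helper: deduces the element count from the array type.
// Passing a pointer instead of an array is a compile-time error.
template <typename T, std::size_t N>
constexpr std::size_t block_count(const T (&)[N])
{
    return N;
}

// Shortened illustrative data set (three of the eight blocks).
static const blockdata sample_data[] = {
    blockdata("name",           "Victor Kraxi"),
    blockdata("business.title", "Chief Paper Officer"),
    blockdata("business.email", "victor@kraxi.com"),
};
```

A loop bound then reads block_count(sample_data) and needs no cast to int beyond what the loop variable's type requires.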

The guts of the program really start on Line 36, where it creates a new PDF document session with the begin_document() method call. As you might expect, the scope of a document is from begin_document() to end_document(). The begin_document() call can specify a host of options, including passwords, permissions (noprint, nomodify, nocopy, and so forth), output file (for example, "businesscard.pdf"), and PDF version, just to name a few.

You also can create PDF documents in memory by passing an empty filename (""). The sample writes to a file instead:

35
36       if (p.begin_document("businesscard.pdf", "") == -1) {
37          cerr << "Error: " << p.get_errmsg() << endl;
38          return(2);
39       }
40
41       // Set the search path for fonts and PDF files
42       p.set_parameter("SearchPath", searchpath);
43
44       // This line is required to avoid problems on Japanese systems
45       p.set_parameter("hypertextencoding", "host");
46

Next, you set standard PDF info items that you can view from inside Adobe Acrobat, for example:

47       p.set_info("Creator", "businesscard.cpp");
48       p.set_info("Author", "Thomas Merz");
49       p.set_info("Title", "PDFlib block processing sample (C++)");
50

Now you are ready to use a PDF Import ("PDI") function to import your template file. Note that this code won't link against PDFlib Lite, because PDI is available only in PDFlib+PDI and PPS:

51       blockcontainer = p.open_pdi(infile, "", 0);
52       if (blockcontainer == -1) {
53          cerr << "Error: " << p.get_errmsg() << endl;
54          return(2);
55       }
56

Your stuff belongs on the first page, so fetch that page and make it active for writing:

57       page = p.open_pdi_page(blockcontainer, 1, "");
58       if (page == -1) {
59          cerr << "Error: " << p.get_errmsg() << endl;
60          return(2);
61       }
62
63       p.begin_page_ext(20, 20, "");    // dummy page size
64
65       // This will adjust the page size to the block container's size.
66       p.fit_pdi_page(page, 0, 0, "adjustpage");
67
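By this point the same check (compare the result with -1, print the error, return) has appeared three times. One possible way to cut the repetition, shown here only as a sketch, is a small helper that converts the "-1 means failure" convention into an exception. The name require_handle is hypothetical and is not part of PDFlib; the helper takes a plain label for the failed step so that it does not depend on the library.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of PDFlib): passes a valid handle through
// unchanged and throws when the call reported failure with -1.
inline int require_handle(int handle, const std::string& step)
{
    if (handle == -1) {
        throw std::runtime_error("PDF step failed: " + step);
    }
    return handle;
}
```

In the sample, a call might then look like page = require_handle(p.open_pdi_page(blockcontainer, 1, ""), "open_pdi_page");. The program would need an extra catch clause for std::runtime_error, and that handler could still query p.get_errmsg() for the detailed message.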

Now you use the block-filling functions. These are part of the PDFlib Personalization Server (PPS), which lets you fill variable data blocks of type Text, Image, and PDF. Each data block is addressed by name (for example, "business.telephone.fax"). In this simplistic scheme, all the data just sits in a static array (go back to Line 23 for details):

68       // Fill all text blocks with dynamic data
69       for (i = 0; i < (int) BLOCKCOUNT; i++) {
70          if (p.fill_textblock(page, data[i].name, data[i].value,
71                "embedding encoding=host") == -1) {
72             cerr << "Error: " << p.get_errmsg() << endl;
73          }
74       }
75

Because you are now fairly deep into the document, you must remember to finish the page, close the imported document page, finalize your own document, and then close the imported file handle:

76       p.end_page_ext("");
77       p.close_pdi_page(page);
78
79       p.end_document("");
80       p.close_pdi(blockcontainer);
81    }
82
83    catch (PDFlib::Exception &ex) {
84       cerr << "PDFlib exception occurred in businesscard sample: "
            << endl;
85       cerr << "[" << ex.get_errnum() << "] " << ex.get_apiname()
86            << ": " << ex.get_errmsg() << endl;
87       return 99;
88    }
89
90    return 0;
91 }

Now, you can open or print your output file (businesscard.pdf) in Adobe Acrobat, e-mail it to a customer, burn it to a CD, or upload it to an online printer.

For Further Reading

Beginning PDF Programming with PHP and PDFlib
by Ron Goff
Foreword by Thomas Merz

ISBN 0973589841

Although written with PHP developers in mind, this book covers all the basics of the PDFlib API, which is essentially platform-independent. I highly recommend it for anyone who is starting off with PDFlib development or curious about what it can really do.

Meet the Whole Family

You truly have only scratched the surface of what PDFlib offers. The programming API is also part of a larger product family:

  • PDFlib contains all functions required to create PDF output containing text, vector graphics and images, plus hypertext elements.
  • PDFlib+PDI includes all PDFlib functions, plus the PDF Import Library (PDI) for including pages from existing PDF documents in the generated output.
  • PDFlib Personalization Server (PPS) includes PDFlib+PDI, plus additional functions for automatically filling PDFlib blocks. Blocks are placeholders on the page that can be filled with text, images, or PDF pages. They can be created interactively with the PDFlib Block Plugin for Adobe Acrobat (Mac or Windows) and will be filled automatically with PPS. The plugin is included in PPS.

About the Author

Victor Volkman has been writing for C/C++ Users Journal and other programming journals since the late 1980s. He is a graduate of Michigan Tech and a faculty advisor board member for Washtenaw Community College CIS department. Volkman is the editor of numerous books, including C/C++ Treasure Chest and is the owner of Loving Healing Press. He can help you in your quest for open source tools and libraries; just drop an e-mail to sysop@HAL9K.com.


