PDF files have become an everyday part of working productively on a computer. PDF (Portable Document Format) is a file format used to present documents independently of application software, hardware, and operating systems. A PDF file contains a complete description of a fixed-layout flat document, including its text, fonts, and graphics. Today, I will show you how to convert images to a PDF document and how to combine two PDF documents into one.
PDFsharp
PDFsharp is an Open Source library that creates PDF documents from any .NET language. PDFsharp can use either GDI+ or WPF, and it includes support for Unicode in PDF files. Download PDFsharp here; you can find a few samples here to get you started.
Our Project
Create a new Visual Basic Windows Forms project. Feel free to name it anything you like, but keep in mind that my objects may be named differently. All you need to add to the form is a BackgroundWorker object; the code below refers to it as worker. Make sure that you have downloaded PDFsharp, and add a reference to it through the Add Reference dialog box on the Project menu, as shown in Figure 1:
Figure 1: References
Add the following namespaces to your code:
Imports System.ComponentModel ' BackgroundWorker and DoWorkEventArgs
Imports System.Configuration
Imports System.IO
Imports System.Threading
Imports System.Windows.Forms
Imports PdfSharp.Drawing
Imports PdfSharp.Pdf
Imports PdfSharp.Pdf.IO
The PdfSharp imports are necessary because we will use the document, drawing, and file-reading classes from each of these namespaces in our project. Add the following private variable:
Private _sync As SynchronizationContext = SynchronizationContext.Current
The SynchronizationContext object allows one thread to communicate with another, typically so that a background thread can hand work back to the UI thread. Have a look here for a better explanation.
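As a rough illustration of that idea, the sketch below captures the context on the UI thread and later uses Post to update a label from a thread-pool thread. The StatusForm class, its _status label, and the DoBackgroundWork method are hypothetical names made up for this example; they are not part of the project built in this article.

```vb
Imports System
Imports System.Threading
Imports System.Windows.Forms

' Hypothetical form used only to illustrate SynchronizationContext.Post.
Public Class StatusForm
    Inherits Form

    Private ReadOnly _status As New Label() With {.Dock = DockStyle.Fill}
    Private _sync As SynchronizationContext

    Public Sub New()
        Controls.Add(_status)
    End Sub

    Protected Overrides Sub OnLoad(e As EventArgs)
        MyBase.OnLoad(e)
        ' Captured on the UI thread, so Post() queues callbacks back onto it.
        _sync = SynchronizationContext.Current
        ThreadPool.QueueUserWorkItem(AddressOf DoBackgroundWork)
    End Sub

    ' Runs on a thread-pool thread, so it must not touch controls directly.
    Private Sub DoBackgroundWork(state As Object)
        Thread.Sleep(500) ' stand-in for real work
        _sync.Post(Sub(s) _status.Text = CStr(s), "Background work finished")
    End Sub
End Class
```

To try it, show the form with Application.Run(New StatusForm()). Post is asynchronous; Send would block the worker until the UI thread has run the callback.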
Add the following code into your Form’s Load event:
Private Sub frmPOD_Load(sender As Object, e As EventArgs) Handles MyBase.Load
    ' Capture the UI thread's context, then start the background work.
    _sync = System.Threading.SynchronizationContext.Current
    worker.RunWorkerAsync()
End Sub
The Form's Load event starts the BackgroundWorker object. Add the following DoWork event handler:
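Private Sub worker_DoWork(sender As Object, e As DoWorkEventArgs) Handles worker.DoWork
    Try
        For Each imagePath As String In Directory.GetFiles("C:\PDF Files To Import")
            ' ConvertPDF appends ".pdf" to the name and saves next to the source image.
            ConvertPDF(imagePath, Path.GetFileNameWithoutExtension(imagePath))
            ' Append the PDF just created to "PDF File2.pdf", which must already exist.
            CombinePDF(Path.ChangeExtension(imagePath, ".pdf"), "PDF File2.pdf")
        Next
    Catch ex As System.Exception
        LogError(ex)
    End Try
End Sub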
The DoWork event handler loops through the files in a supplied directory (C:\PDF Files To Import) and calls the ConvertPDF and CombinePDF subs for each one. Add the ConvertPDF sub procedure now:
Private Sub ConvertPDF(_srcFile As String, _destFile As String)
    Try
        Dim srcFile = _srcFile
        ' Save next to the source image; ".pdf" is appended to the supplied name.
        Dim destFile = Path.Combine(Path.GetDirectoryName(srcFile), _destFile) & ".pdf"
        Dim doc As New PdfDocument()
        doc.Pages.Add(New PdfPage())
        Dim xgr As XGraphics = XGraphics.FromPdfPage(doc.Pages(0))
        Dim img As XImage = XImage.FromFile(srcFile)
        ' Draw the image at its native size from the top-left corner of the page.
        xgr.DrawImage(img, 0, 0)
        img.Dispose()
        doc.Save(destFile)
        doc.Close()
        LogError(Nothing, "PDF Created: " & Path.GetFileName(destFile))
    Catch ex As Exception
        LogError(ex)
    End Try
End Sub
The ConvertPDF sub procedure takes two arguments: the source file that will be converted to a PDF document, and the name of the resulting file, which is saved in the same folder as the source file. A new PdfDocument is created and a page is added to it. An XGraphics object for that page then draws an XImage object, which holds the supplied image file, onto the PDF page. Lastly, the PDF document is saved.
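Because the image is drawn at its native size on a default-sized page, a large image can run past the page edges. One possible variation, sketched below, resizes the page to match the image before drawing. ConvertImageToFittedPdf and its parameter names are hypothetical, and the sketch assumes the classic PDFsharp API in which page dimensions are XUnit values; check it against the PDFsharp version you reference.

```vb
Imports PdfSharp.Drawing
Imports PdfSharp.Pdf

Public Module ImageToPdfSketch
    ' Hypothetical helper: writes a one-page PDF whose page matches the image size.
    Public Sub ConvertImageToFittedPdf(imagePath As String, pdfPath As String)
        Using doc As New PdfDocument()
            Using img As XImage = XImage.FromFile(imagePath)
                Dim page As PdfPage = doc.AddPage()
                ' PointWidth and PointHeight give the image size in PDF points (1/72 inch).
                page.Width = XUnit.FromPoint(img.PointWidth)
                page.Height = XUnit.FromPoint(img.PointHeight)
                Using xgr As XGraphics = XGraphics.FromPdfPage(page)
                    xgr.DrawImage(img, 0, 0, img.PointWidth, img.PointHeight)
                End Using
                doc.Save(pdfPath)
            End Using
        End Using
    End Sub
End Module
```

The Using blocks release the image and graphics objects even when drawing or saving throws, so the caller only needs to handle the exception itself.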
Now, add the CombinePDF sub:
Private Sub CombinePDF(ExistingPDF As String, NewPDF As String)
    Try
        ' The output document must already exist; it is opened for modification.
        Dim outputDocument As PdfDocument = PdfReader.Open(NewPDF)
        ' Import mode is required for documents whose pages are copied elsewhere.
        Dim inputDocument As PdfDocument = PdfReader.Open(ExistingPDF, PdfDocumentOpenMode.Import)
        Dim count As Integer = inputDocument.PageCount
        For idx As Integer = 0 To count - 1
            Dim page As PdfPage = inputDocument.Pages(idx)
            outputDocument.AddPage(page)
        Next
        outputDocument.Save(NewPDF)
        LogError(Nothing, "PDF Created: " & Path.GetFileName(NewPDF))
    Catch ex As Exception
        LogError(ex)
    End Try
End Sub
The CombinePDF sub takes two arguments as well: one for the input document and one for the output document. First, the output document, which must already exist, is opened in memory. The input document is then opened in Import mode, which marks it as a source of pages for another document. A loop runs through all the pages in the input document and adds them one by one to the output document. Lastly, the resulting document is saved to disk under the supplied name.
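The same page-by-page approach extends to any number of input files. The sketch below is one way to do it, under the assumption that you want a brand-new output file rather than appending to an existing one. MergePdfs and its parameter names are hypothetical; the PDFsharp calls are the same ones used in CombinePDF.

```vb
Imports System.Collections.Generic
Imports PdfSharp.Pdf
Imports PdfSharp.Pdf.IO

Public Module MergeSketch
    ' Hypothetical helper: concatenates the given PDFs, in order, into a new file.
    Public Sub MergePdfs(inputPaths As IEnumerable(Of String), outputPath As String)
        Using outputDocument As New PdfDocument()
            For Each inputPath As String In inputPaths
                ' Import mode is required for pages that are added to another document.
                Dim inputDocument As PdfDocument = PdfReader.Open(inputPath, PdfDocumentOpenMode.Import)
                For idx As Integer = 0 To inputDocument.PageCount - 1
                    outputDocument.AddPage(inputDocument.Pages(idx))
                Next
            Next
            ' Saving fails if no pages were added, so pass at least one non-empty PDF.
            outputDocument.Save(outputPath)
        End Using
    End Sub
End Module
```

A call such as MergePdfs({"first.pdf", "second.pdf"}, "combined.pdf") would write the pages of both placeholder files, in that order, to combined.pdf.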
You will notice that, inside the Catch block, a sub procedure named LogError gets called. Let’s add that now:
Private Sub LogError(ex As Exception, Optional msg As String = "")
    ' ErrorLogLocation is defined elsewhere; it holds the folder that receives the log file.
    Dim logPath As String = Path.Combine(ErrorLogLocation, "PDF_ErrorLog.txt")
    Dim message As String = String.Format("Time: {0}", DateTime.Now.ToString("dd/MM/yyyy hh:mm:ss tt"))
    message &= Environment.NewLine
    If ex Is Nothing Then
        ' No exception: log the supplied informational message only.
        message &= String.Format("Message: {0}", msg)
    Else
        message &= "-----------------------------------------------------------"
        message &= Environment.NewLine
        message &= String.Format("Message: {0}", ex.Message)
        message &= Environment.NewLine
        message &= String.Format("StackTrace: {0}", ex.StackTrace)
        message &= Environment.NewLine
        message &= String.Format("Source: {0}", ex.Source)
        message &= Environment.NewLine
        ' TargetSite can be Nothing, so pass it directly instead of calling ToString() on it.
        message &= String.Format("TargetSite: {0}", ex.TargetSite)
        message &= Environment.NewLine
        message &= "-----------------------------------------------------------"
        message &= Environment.NewLine
    End If
    ' Append to the log; the Using block closes the writer.
    Using writer As New StreamWriter(logPath, True)
        writer.WriteLine(message)
    End Using
End Sub
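This sub serves a dual purpose. When an exception is supplied, it logs the exception in detail; when Nothing is passed instead, it simply writes the supplied message to the same log file. Note that ErrorLogLocation is not shown in this article; it needs to be defined in your form as a String that holds the folder for the log file.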
Conclusion
PDFsharp is very powerful and easy to use with the .NET Framework, as you can see. It makes converting images to PDF and combining PDF documents quick and easy.