Full Text Search: The Key to Better Natural Language Queries for NoSQL in Node.js
Microsoft Office documents not only offer an easy and effective way for users to communicate and collaborate, but they also get smarter with each new release and increase connectivity with the remainder of the software universe. In some cases, though, Office applications can't replace the flexibility of custom code. Hence, the need to create and edit Office documents (mostly Word and Excel documents) programmatically exists—particularly from .NET applications.
Depending on your skills, this task can be approached from two diverse directions. You either use .NET to write GUI or console applications that manipulate documents, or leverage the true power of the .NET Framework to embed managed functions inside existing Office documents in a way that vaguely resembles a VBA macro. This article tackles both these techniques, beginning with the second approach.
Thanks to the Visual Studio Tools for Office (VSTO), building a Word document or an Excel spreadsheet that is bound to managed code is relatively easy. Once you have installed VSTO, a new node appears in the Visual Studio project window to let you create a new type of output. In particular, you can choose to create Excel or Word documents and Word templates.
Using VSTO has many advantages. First off, you can use all the features of Visual Studio .NET and write Office-based applications in Visual Basic .NET or C#. Needless to say, both languages are much more powerful than VBA and they open you up to the whole .NET Framework of classes.
A VSTO application also introduces a strong security concept to the world of Office—far beyond what you can get through VBA. A VSTO solution won't execute unless all the involved assemblies are explicitly full-trusted.
Creating a project with VSTO configures the working environment properly as well, adding any needed reference to the appropriate Office-related executables. What you have to do is simply open the code-behind class that's there and hook up handlers for the document's core events such as open and close.
Basics of a VSTO Application
A VSTO application is a piece of .NET code that runs inside a particular Office document. Users typically open the document and find some UI widgets to play with: dialogs or graphics. They work, update the document in a WYSIWYG fashion, and then save the results to another document (if you choose).
When you open the Office document, you find references to external .NET assemblies, the assemblies are loaded, and the _Startup function in your custom code is called. The _Startup function is defined in the auto-generated code section that VSTO inserts in the code-behind class for the document.
The code you write handles just events raised by the Word and Excel objects (such as an application or document). It does that through the standard COM event marshaling and doesn't even interfere with VBA events. (This means that VBA-equipped documents can be further extended to .NET.)
An event handler can access any .NET subsystem (for example run a query, get some data, and create a chart or perhaps populate a Word template with data coming out of a Web service).
Creating Documents Manually
What if you have a .NET application and need to automate the creation of Word or Excel documents? You need to incorporate the Office model into your existing assemblies. The code in this section illustrates how to create a Word document programmatically from within a console application.
As the first step, you add a couple of specific references to the Office and Word (or Excel) COM object model. The interop infrastructure of .NET wraps those COM libraries in a .NET assembly for you:
using Office = Microsoft.Office.Core; using Word = Microsoft.Office.Interop.Word;
When working with Word or Excel object models, almost everything must be an object and you must take optional parameters into account. The .NET Framework doesn't support optional parameters—there's method overloading in .NET—but defines a value that should be used in lieu of parameters you don't want to specify when making calls to COM objects. The following code creates a new instance of Word and a new document object:
object Missing = System.Reflection.Missing.Value; object fileToOpen = (object) @"c:\sample.doc"; object fileToSave = (object) @"c:\sample1.doc"; Word.Application app = new Word.ApplicationClass(); Word.Document doc = new Word.Document;
The next step is opening the document. You can proceed as follows, limiting the non-null parameters to the file name:
doc = app.Documents.Open(ref fileToOpen, ref Missing, ref Missing, ref Missing, ref Missing, ref Missing, ref Missing, ref Missing, ref Missing, ref Missing, ref Missing, ref Missing, ref Missing, ref Missing, ref Missing, ref Missing);
What you do next depends on what you want to obtain. For this example, add some text to existing bookmarks. A bookmark is a well-known place in the document whose content can be changed programmatically. In a Word document, you create a bookmark by selecting the placeholder text and clicking on Insert|Bookmark. Imagine your sample document contains a bookmark named LetterDate that you expect to assign with the current date. You modify the content of a bookmark through a Range object:
object bookmarkName = null; bookmarkName = (object) "LetterDate"; Word.Range rng = doc.Bookmarks.get_Item(ref bookmarkName).Range;
You store the new text in the Range object and re-add the bookmark to the Bookmarks collection. If the bookmark already exists, it will be replaced:
rng.Text = DateTime.Now.ToShortDateString(); rng.Font.Bold = 1; rng.Font.Color = Word.WdColor.wdColorBlue; object oRng = rng; doc.Bookmarks.Add("LetterDate", ref oRng);
The current date will be added in bold and blue. Following this approach, you can create a Word template, add as many bookmarks as needed, and factor out a .NET class to let you prepare and save Word documents automatically.