Creating Word Documents with PowerShell

Background

For the last few months I have been working on a course called “Introduction to Windows PowerShell” for Pragmatic Works. This has been a challenging and rewarding task as I have found that you learn something best when you teach it to someone else.

Part of the deliverables for this course was providing labs for students to walk through on their own. Also for each lab, there is a solution that I provide in case they get stuck. On the first pass I had placed all of these labs and solutions in a separate .ps1 file, but after further review it was going to be better if they were all in a single Word Document. This however meant combining roughly 20 files into 2 Documents.

Being somewhat crunched for time, I decided to see if I could automate feeding these .ps1 files into a Word Document using PowerShell.

Methodology

My gut response for the last few years has been to automate  as many things as possible using PowerShell. If this could not be done entirely with PowerShell, at least I would try and do as much as possible.

In order to do this I had to learn a little bit:

  • Are there any CLI options for Word?
  • If not, are there any .NET Classes exposed for Word?
  • And if either of these first two were true could  I produce a new script?

.NET Classes for Microsoft Word

Upon doing some digging I found a couple forum posts on manipulating files using PowerShell (StackOverflow, MSDN ). What caught my eye from these posts is that they both used ‘Word.Application’ as a COM object. Looking into this, led me to information about Word’s VBA interface for scripting (MSDN – VBA). The VBA scripting  wasn’t exactly what I was looking for, but I figured I would keep digging. Finally I found the gold at the end of the rainbow. The entire .NET namespace documentation for Office 2013+  (MSDN – Namespace). There is a lot that this documentation covers, but I will just walk through the basics of creating a file, formatting the text, adding text from other files, and then finally saving the document.

Creating a File

In order to create a file you must have an instance of Word in your PowerShell Session. In order to do this you can use the New-Object cmdlet and use the -ComObject parameter as shown below:

PS C:\WINDOWS\system32> $WordApplication = New-Object -ComObject "word.application"

This starts a instance of Word, which you can view in processes, but does not open a visual window. If you want to see the window you can do the following:

PS C:\WINDOWS\system32> $WordApplication.Visible = $true

If you look at Word, it is now visible, but no document is open. We must now create the document we want to use. We could also open up a document if we had an existing one.

Microsoft Word

Off of the $WordApplication variable we can create a new document. It is best to save this new Document into another variable for future use.

 PS C:\WINDOWS\system32> $Document = $WordApplication.Documents.Add()

At this point you now have a document to work with.

Writing the Document

By this point I spent a lot of time trying and failing. Since there was both a variable for the Application ( $WordApplication ) and the Document ( $Document ), it was difficult determining which one I needed to writing from my sources. I first assumed it was the Document variable because that is what I was editing, but all of the interfaces were through the ‘Selection’ interface on the Application variable.

PS C:\WINDOWS\system32> $Writer = $WordApplication.Selection

There are several things I wanted to do inside of this document. First I wanted to bring content in from various files, but I also wanted to create a heading for each file that was written. The first part seemed easy, because there exists a method called .TypeText() . I found that you can insert text by writing a string or calling another cmdlet like Get-Content inside of this method. Also I found that you could set the style of the document before you ‘typed’ and then change it back to the normal format. The most confusing part was the inconsistency in the methods. When calling $Writer.TypeText((Get-Content -path $blah)) you have parentheses around the value, but when changing the style you assigned a value like this $Writer.Style = 'Heading 1'.

In the end, I wrote a foreach loop in the following way to import all of the text while setting the style on import.

 PS C:\WINDOWS\system32> 1..10 | foreach -Process {
>> $labpath = "C:\PragmaticWorks\Introduction to Windows PowerShell\Module$_\Lab-Solution.ps1"
>> $Writer.Style = 'Heading 1'
>> $Writer.TypeText("Module $_ Lab Solution")
>> $Writer.TypeParagraph()
>> $Writer.Style = 'Normal'
>> $Writer.TypeText((Get-Content $labpath))
>> $Writer.TypeParagraph()
>> }

note: .TypeParagraph() inserts a new line so that I would have a break between text.

When I finished inserting all of my files I saved the file and closed the document.

PS C:\WINDOWS\system32> $Document.SaveAs($filepath)
PS C:\WINDOWS\system32> $Document.Close()

Conclusion

This was a fun conceptual project to see if it was possible, and it did end up saving me the pain of copy and pasting all 20 documents into 2 separate documents. I did have some issues with formatting, which I would have loved to explore further, but I am not sure I could have automated this formatting as I had code examples in my source documents. I ended up manually aligning these code examples once in word. I did notice a .Find() interface on the Application variable and this would be neat to explore deeper for gathering information from Word Documents on distributed systems.

An MVP I follow suggested that I make this into a module. What are your thoughts? Would you like to see a PowerShell module for interfacing with documents? Feel free to tweet me @joshCorr for more info or if you benefited from this blog.