Monday, April 14, 2008

Breaking Apart Excel Files with Beyond Compare

I'm working on a SharePoint project using Excel Services to host an Excel file our client's sales reps use to create a total cost of ownership sales report for large equipment.  Excel Services is perfect for this except for one detail: You can't have images in the workbook you're serving up with ES. 

Think sales reps and sales reports.  They need images.  Bummer.

No worries, because the new XML format of Office documents lets us crack open those files and programmatically inject new content or edit existing content.  The changes aren't overly simple, but they're workable using a combination of the classes in the System.IO.Packaging namespace and some manual edits of XML files.
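To get a feel for what's inside one of these files before touching any .NET code, here's a minimal sketch in Python. It's a stand-in for System.IO.Packaging, not the code from this project: it just treats the workbook as the ZIP archive it is and lists the parts. The function name and the file name in the usage note are my own placeholders.

```python
import zipfile


def list_package_parts(workbook_path):
    """Return the sorted names of every part stored in an Office Open XML file.

    An .xlsx file is a ZIP archive, so the standard zipfile module can read it
    directly. (Hypothetical helper for illustration only.)
    """
    with zipfile.ZipFile(workbook_path) as package:
        return sorted(package.namelist())
```

Calling list_package_parts("report.xlsx") on a real workbook should return entries such as the content types part and the files under xl/, which is the same list you see after renaming the file to .zip and opening it.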

Before you start cutting code, you need to figure out what you'll need to edit.  Two tools come in handy for this: Beyond Compare 2 and HTML Tidy.  Beyond Compare does an amazing job of directory comparison, and Tidy lets you quickly format XML files into an indented, readable view.

You'll also need before and after versions of the Excel file.  We cleaned the images out of the workbook we were hosting in ES, uploaded it, ran it through the system we've built, and finally saved a snapshot of it.  Then we opened that imageless snapshot, added our images, and saved the result as a separate file.  Rename both files with a .zip extension, extract them to separate folders, open those folders with Beyond Compare, and Poof! you've got an exact view of what you need to update in your files.  (Note the blue and red ".bak" files on the left are my additions as I'm working on coding up the changes.)
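If you don't have Beyond Compare handy, the extract-and-compare step can be roughly approximated with a script. This is only a sketch using Python's standard library (zipfile and filecmp), not what Beyond Compare does internally, and the function names are placeholders I made up.

```python
import filecmp
import zipfile


def extract_package(workbook_path, target_dir):
    """Unzip a workbook into target_dir (no rename to .zip required)."""
    with zipfile.ZipFile(workbook_path) as package:
        package.extractall(target_dir)


def compare_trees(before_dir, after_dir):
    """Return (only_before, only_after, changed) as sorted relative paths.

    A folder that exists on only one side is reported as a single entry,
    without listing the files inside it.
    """
    only_before, only_after, changed = [], [], []

    def walk(cmp, prefix):
        only_before.extend(prefix + name for name in cmp.left_only)
        only_after.extend(prefix + name for name in cmp.right_only)
        # Compare file contents byte for byte instead of trusting size/mtime.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        changed.extend(prefix + name for name in mismatch + errors)
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix + name + "/")

    walk(filecmp.dircmp(before_dir, after_dir), "")
    return sorted(only_before), sorted(only_after), sorted(changed)
```

Extract the imageless snapshot and the version with images into two folders with extract_package, then hand both folders to compare_trees. The "only after" list is what you'll need to add, and the "changed" list is what you'll need to edit.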

You can use Tidy to clean up files on both sides of the viewer for easy readability.  Just run "tidy -im -xml <sourcefile>" to get a nicely formatted file.  "-xml" treats the input file as XML, "-i" indents the output, and "-m" modifies the source file in place rather than writing to a separate file (which is also an option).
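The same kind of cleanup can be scripted if you'd rather not shell out to Tidy. Here's a rough equivalent using Python's xml.dom.minidom; it is not Tidy, and its output won't match Tidy's exactly. The function name is hypothetical. One assumption to flag: this works best on parts saved as a single unindented line, since minidom adds blank lines to XML that's already indented. It also rewrites the XML declaration, so run it only on the extracted comparison copies, never on files headed back into a workbook.

```python
from pathlib import Path
from xml.dom import minidom


def indent_xml_in_place(xml_path):
    """Re-indent an XML file and overwrite it, roughly like "tidy -im -xml".

    Intended for extracted copies used for comparison only.
    """
    path = Path(xml_path)
    document = minidom.parseString(path.read_bytes())
    # With an encoding argument, toprettyxml returns bytes ready to write back.
    path.write_bytes(document.toprettyxml(indent="  ", encoding="UTF-8"))
```

Run it over the .xml and .rels files in both extracted folders before comparing so the differences line up element by element.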

Now when you do a comparison on files you'll get something like this showing the exact changes:
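For a text-only version of that file comparison, a unified diff of the two formatted files gets you most of the way there. This is a sketch with Python's difflib, assuming both files are UTF-8 and have already been indented; the function name is a placeholder of mine.

```python
import difflib
from pathlib import Path


def diff_parts(before_path, after_path):
    """Return unified-diff lines between two text files (empty list if identical)."""
    before = Path(before_path).read_text(encoding="utf-8").splitlines()
    after = Path(after_path).read_text(encoding="utf-8").splitlines()
    return list(
        difflib.unified_diff(
            before,
            after,
            fromfile=str(before_path),
            tofile=str(after_path),
            lineterm="",
        )
    )
```

Lines starting with "+" are what the version with images gained, and lines starting with "-" are what it lost.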

Now it's just a simple <koff, koff> matter of coding up the edits and additions to those XML files.  More on that later.

Now Playing: Mike Farris -- Goodnight Sun.  I've been all over Farris's "Salvation In Lights" for its great vocals and music.  This earlier album by him is full of amazing tracks, to the point where I'm starting to get concerned I'm wearing out the tracks and sectors where it's located on my iPod...
