CodeSnippet: Scanning with WIA & OCR with Office
Scanners are perhaps becoming less and less used as we all move towards a paperless environment with digital bills and digital photos, but a lot of "official" documentation must still be presented as the original physical copy. My wife and I have a huge collection of such documents, horribly archived in several filing cabinets; finding the right document is often a nightmare, let alone using any of the information in them. To combat this, I’ve started on a document manager which aims to make scanning, archiving, searching, tagging, etc. one smooth process.
I thought I’d share the scanning code, because while scanning wasn’t the easiest thing to dig up, it still wasn’t half as daunting as I thought it would be. Further down you’ll also see code to do optical character recognition (OCR), that is, converting a scanned text document into usable text.
TWAIN or WIA?
In years past, TWAIN (jokingly expanded as "Technology Without An Interesting Name") was the way to perform scanning, so that was my starting point when I went digging for scanning code. However, I came across a mention of Windows Image Acquisition (WIA) in my scanner’s drivers.
"In comparison to TWAIN, WIA is said to be more flexible, because it is a standardized interface that doesn’t require a tight bundling of scanner software and driver (TWAIN-only scanners are often limited to functions that are enabled in its driver-software-bundle). Most recent scanners support WIA."
(source: Wikipedia)
Lift the image from a physical file into memory
To access WIA in .NET, you’ll need to add a reference to the COM library, "Microsoft Windows Image Acquisition Library v2.0" (wiaaut.dll).
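    using WIA;

    // GUID that WIA uses to identify the JPEG format.
    const string wiaFormatJPEG = "{B96B3CAE-0728-11D3-9D7B-0000F81EF32E}";

    // If "Embed Interop Types" is enabled on the reference, use new CommonDialog() instead.
    CommonDialogClass wiaDiag = new CommonDialogClass();
    WIA.ImageFile wiaImage = wiaDiag.ShowAcquireImage(
        WiaDeviceType.UnspecifiedDeviceType,
        WiaImageIntent.GrayscaleIntent,
        WiaImageBias.MaximizeQuality,
        wiaFormatJPEG,
        true,    // AlwaysSelectDevice
        true,    // UseCommonUI
        false);  // CancelError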
This isn’t the only way to perform a WIA scan, but it is a very quick and easy way to do it. A standard dialog will appear asking you to select the scanner, and then another dialog will give you a few options (such as previewing, image quality, image mode, and which source to use, by which I mean a document feeder or flatbed depending on your scanner’s capabilities).
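If you would rather skip the dialogs, WIA can also be driven directly through its DeviceManager. The following is only a minimal sketch that I have not tried against a range of drivers: it assumes the first scanner-type device that WIA reports is the one you want, and that the device's first item is the scan surface. The class and method names (DirectScan, ScanFromFirstScanner) are made up for illustration.

```csharp
using WIA; // COM reference: Microsoft Windows Image Acquisition Library v2.0

static class DirectScan
{
    // Same JPEG format GUID that is passed to ShowAcquireImage.
    const string WiaFormatJpeg = "{B96B3CAE-0728-11D3-9D7B-0000F81EF32E}";

    // Hypothetical helper: scans from the first scanner WIA reports.
    // Returns null when no scanner-type device is found.
    public static ImageFile ScanFromFirstScanner()
    {
        DeviceManager manager = new DeviceManager();
        foreach (DeviceInfo info in manager.DeviceInfos)
        {
            if (info.Type != WiaDeviceType.ScannerDeviceType)
                continue; // skip cameras and video devices

            Device scanner = info.Connect();

            // Assumption: the first item is the scan surface (WIA collections are 1-based).
            Item item = scanner.Items[1];

            // Transfer returns object; for a scan it is an ImageFile.
            return (ImageFile)item.Transfer(WiaFormatJpeg);
        }
        return null;
    }
}
```

With no dialog, nothing asks the user for resolution or colour mode, so the scan uses whatever the driver defaults to unless you set the item's properties yourself.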
Doing something with the scan
Once the document is scanned in, you have to figure out what you want to do with it. If it is a photo, for example, simply saving it as a JPEG may suffice, but if it is a document you may want to perform OCR to extract the text first.
Saving to JPEG
Saving to JPEG is very easy:
    wiaImage.SaveFile("temp.jpg");
If you wanted to do more with the JPEG (resize, colour change, filtering, or simply just display in your app) but still wish to treat it as an image, you need to get the image data into either a System.Drawing.Image (WinForms) or System.Windows.Controls.Image (WPF).
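    // WPF version: Image here is System.Windows.Controls.Image,
    // BitmapFrame is in System.Windows.Media.Imaging, MemoryStream in System.IO.
    WIA.Vector vector = wiaImage.FileData;
    byte[] bytes = (byte[])vector.get_BinaryData();
    Image image = new Image();
    image.Source = BitmapFrame.Create(new MemoryStream(bytes));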
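For WinForms, the same bytes can be decoded into a System.Drawing.Image instead. This is a small sketch rather than anything from my document manager; the ScanImages class and FromBytes method are names I made up, and the input is assumed to be the encoded image bytes from wiaImage.FileData.

```csharp
using System.Drawing;
using System.IO;

static class ScanImages
{
    // Hypothetical helper: decodes encoded image bytes (JPEG, PNG, BMP, ...)
    // into a GDI+ image that can be shown in a PictureBox.
    public static Image FromBytes(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        using (var decoded = Image.FromStream(stream))
        {
            // Image.FromStream needs its stream kept open for the image's lifetime,
            // so copy into a new Bitmap that no longer depends on the stream.
            return new Bitmap(decoded);
        }
    }
}
```

Usage would look something like pictureBox1.Image = ScanImages.FromBytes((byte[])wiaImage.FileData.get_BinaryData()); where pictureBox1 is whatever PictureBox your form has.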
OCR
OCR is by no means a small task to undertake by yourself, but luckily Office 2003 and 2007 include an API named Microsoft Office Document Imaging (MODI) that you can program against to let Office process the image for you (MODI was dropped from Office 2010). Unfortunately, MODI isn’t installed by default with Office 2007, but it is easy enough to add by running the Office setup again, selecting "Add or Remove Features", and then selecting "Scanning, OCR and Indexing Service Filter".
All Office OCR operations must go through the COM library, "Microsoft Office Document Imaging 12.0 Type Library" (that’s for Office 2007) (MDIVWCTL.DLL) so add a reference to it.
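    MODI.Document mDoc = new MODI.Document();
    mDoc.Create("temp.jpg");  // load the saved scan from disk
    mDoc.OCR(MODI.MiLANGUAGES.miLANG_SYSDEFAULT, true, true);
    MODI.Image mImage = (MODI.Image)mDoc.Images[0];  // first page
    MessageBox.Show(mImage.Layout.Text);
    mDoc.Close(false);  // close the document without saving changes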
The key thing to note is that MODI.Document only accepts its input as a string containing a file path, so you have to save the scanned document to disk (as either JPEG or TIFF) rather than reading it from memory.
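Since a file on disk is unavoidable, one option is to write the scan to a uniquely named temporary file and delete it once OCR is finished. A rough sketch, assuming the scan is available as encoded bytes (for example from wiaImage.FileData); TempScanFile and Write are hypothetical names, not part of WIA or MODI.

```csharp
using System;
using System.IO;

static class TempScanFile
{
    // Hypothetical helper: writes encoded image bytes to a uniquely named file
    // in the user's temp folder and returns the full path.
    public static string Write(byte[] imageBytes, string extension)
    {
        string name = Guid.NewGuid().ToString("N") + extension;
        string path = Path.Combine(Path.GetTempPath(), name);
        File.WriteAllBytes(path, imageBytes);
        return path;
    }
}

// Intended use with the OCR code in this post (not compiled here):
//   string path = TempScanFile.Write((byte[])wiaImage.FileData.get_BinaryData(), ".jpg");
//   mDoc.Create(path);
//   ... run OCR and read the text ...
//   File.Delete(path); // may fail if MODI still has the file open
```

A unique name means a new scan never collides with a file left over from an earlier one, which a fixed name like temp.jpg would.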
As far as I can tell, this doesn’t retain any layout data (i.e. that block A of text was situated at X,Y), so it doesn’t appear to be the perfect solution for my document manager, but it is free (well, so long as you’ve got Office) so I’ll overlook that for now.