I've written an OCR wrapper library around the Microsoft Office Document Imaging COM API, and in a Console App running locally, it works flawlessly, with every test.
Sadly, things start going badly when we attempt to integrate it with a WCF service running as an ASP.NET Web Application under IIS6. We had issues freeing the MODI COM objects, and plenty of examples on the web helped us with that.
However, problems still remain. If I restart IIS, and do a fresh deployment of the web app, the first few OCR attempts work great. If I leave it for 30 minutes or so, and then do another request, I get server failure errors like this:
The server threw an exception. (Exception from HRESULT: 0x80010105 (RPC_E_SERVERFAULT)): at MODI.DocumentClass.Create(String FileOpen)
From this point on, every request will fail to do the OCR, until I reset IIS, and the cycle begins again.
We run this application in its own App Pool, and it runs under an identity with Local Admin rights.
UPDATE: This issue can be solved by doing the OCR stuff out of process. It appears as though the MODI library doesn't play well with managed code, when it comes to cleaning up after itself, so spawning new processes for each OCR request worked well in my situation.
Here is the function that performs the OCR:
public class ImageReader : IDisposable
{
    private MODI.Document _document;
    private MODI.Images _images;
    private MODI.Image _image;
    private MODI.Layout _layout;
    private ManualResetEvent _completedOCR = new ManualResetEvent(false);

    // SNIP - Code removed for clarity

    private string PerformMODI(string fileName)
    {
        _document = new MODI.Document();
        _document.OnOCRProgress += new MODI._IDocumentEvents_OnOCRProgressEventHandler(_document_OnOCRProgress);
        _document.Create(fileName);
        _document.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
        _completedOCR.WaitOne(5000);
        _document.Save();
        _images = _document.Images;
        _image = (MODI.Image)_images[0];
        _layout = _image.Layout;
        string text = _layout.Text;
        _document.Close(false);
        return text;
    }

    void _document_OnOCRProgress(int Progress, ref bool Cancel)
    {
        if (Progress == 100)
        {
            _completedOCR.Set();
        }
    }

    private static void SetComObjectToNull(params object[] objects)
    {
        for (int i = 0; i < objects.Length; i++)
        {
            object o = objects[i];
            if (o != null)
            {
                Marshal.FinalReleaseComObject(o);
                o = null;
            }
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public void Dispose()
    {
        SetComObjectToNull(_layout, _image, _images, _document);
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}
I then create an ImageReader inside a using block (which calls IDisposable.Dispose on exit).
Calling Marshal.FinalReleaseComObject should instruct the CLR to release the COM objects, and so I'm at a loss to figure out what would be causing the symptoms we have.
For what it's worth, when I run this code outside of IIS, in say a Console App, everything seems bulletproof. It works every time.
Any tips that help me diagnose and solve this issue would be an immense help and I'll upvote like crazy! ;-)
Thanks!
Have you thought of hosting the OCR portion of your app out-of-process?
Having a service can give you tons of flexibility:
Personally, I have found in the past that COM interop + IIS = grief.
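To make the out-of-process idea concrete, here is a minimal sketch of launching one worker process per OCR request. It assumes a hypothetical console executable (passed in as workerPath) that takes the image path as its only argument, runs the MODI calls, writes the recognized text to standard output, and exits with code 0 on success. The class name OutOfProcessOcr and that worker contract are illustrations only, not something from the original post.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class OutOfProcessOcr
{
    // workerPath: path to a HYPOTHETICAL console app that OCRs one file,
    // prints the text to stdout, and exits. All COM objects die with it.
    public static string Run(string workerPath, string imagePath, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = workerPath,
            Arguments = "\"" + imagePath + "\"",
            UseShellExecute = false,        // required to redirect stdout
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        using (Process worker = Process.Start(startInfo))
        {
            // Collect output asynchronously so a hung worker cannot block us.
            var output = new StringBuilder();
            worker.OutputDataReceived += (sender, e) =>
            {
                if (e.Data != null) output.AppendLine(e.Data);
            };
            worker.BeginOutputReadLine();

            if (!worker.WaitForExit(timeoutMs))
            {
                worker.Kill();
                throw new TimeoutException("OCR worker did not finish in time.");
            }

            // The parameterless overload waits for the async output handlers to drain.
            worker.WaitForExit();

            if (worker.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "OCR worker failed with exit code " + worker.ExitCode + ".");
            }
            return output.ToString();
        }
    }
}
```

The point of the design is that whatever MODI fails to release is reclaimed by the operating system when the worker exits, so the IIS worker process never loads the COM library at all. The cost is process start-up time on every request.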
MODI is incredibly wonky when it comes to getting rid of itself, especially running in IIS. In my experience, I've found that although it slows everything down, the only way to get rid of these errors is to add a GC.WaitForPendingFinalizers() after your GC.Collect() call. If you're interested, I wrote an article about this.
Can you replicate the problem in a small console application? Perhaps by leaving it to sleep for 30 minutes and then coming back to it?
The best way to solve things like this is to isolate the problem completely. I'd be interested to see how that works.
I had to deal with this error a week ago, and after testing some of the solutions given here, I finally resolved the problem. I'll explain here how I did it.
In my case I have a Windows service running and processing documents from a folder. The problem occurs when there are more than 20 documents, throwing the error: Exception from HRESULT: 0x80010105 (RPC_E_SERVERFAULT).
In my code I was calling a method each time I detected a document in the folder. That method created a new instance of the MODI document (MODI.Document _document = new MODI.Document();) and processed the file, and that was what caused the error!
The solution was to have just one global instance of MODI.Document and process all documents with it. This way I have only one instance running for my service at all times.
I hope that will help those who are facing the same problem.
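A rough sketch of the single-instance approach described above could look like the following. It assumes the MODI interop assembly is referenced, uses only the MODI calls that already appear in the question, and takes on trust that one Document can be reused across Create/Close cycles as this answer reports. The class name SharedModiReader and the lock are my own additions for illustration, so treat this as a starting point rather than a verified fix.

```csharp
using System;

public static class SharedModiReader
{
    // One Document for the lifetime of the service, instead of one per file.
    private static readonly MODI.Document Document = new MODI.Document();

    // Assumption: access is serialized so two files never share the
    // Document at the same time.
    private static readonly object Gate = new object();

    public static string Read(string fileName)
    {
        lock (Gate)
        {
            Document.Create(fileName);
            try
            {
                Document.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
                MODI.Images images = Document.Images;
                MODI.Image image = (MODI.Image)images[0];
                return image.Layout.Text;
            }
            finally
            {
                // Close the file without saving, but keep the Document itself alive.
                Document.Close(false);
            }
        }
    }
}
```

Serializing every request through one lock limits throughput, which may be acceptable for a folder-watching service but is worth measuring before using it behind a web endpoint.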