Writing many worksheets in an Excel spreadsheet can take a while, so parallelizing it would be helpful.
This code works well: it makes an Excel spreadsheet pop up on the screen with four worksheets named Sheet1, 1, 2, and 3.
open Microsoft.Office.Interop.Excel
open FSharp.Collections.ParallelSeq

let backtestWorksheets = [1..3]
let app = new ApplicationClass(Visible = true)
let workbook = app.Workbooks.Add(XlWBATemplate.xlWBATWorksheet)

let writeInfoSheet (worksheet: Worksheet) : unit =
    let foo i =
        let si = string i
        worksheet.Range("A" + si, "A" + si).Value2 <- "Hello " + si
    List.iter foo [1..10]

let wfm =
    [1, writeInfoSheet; 2, writeInfoSheet; 3, writeInfoSheet]
    |> Map.ofList

let adder (workbook : Workbook)
          (i : int)
          : unit =
    let sheet = workbook.Worksheets.Add() :?> Worksheet
    sheet.Name <- string i
    wfm.[i] sheet

List.iter (adder workbook) backtestWorksheets
//PSeq.iter (adder workbook) backtestWorksheets

[<EntryPoint>]
let main argv =
    printfn "%A" argv
    0 // return an integer exit code
However, replacing the line starting with List.iter with the commented-out line just below it makes a spreadsheet with the same four worksheets pop up, but all of the worksheets are blank.
So my question is: Why can't code parallelized with PSeq write to Excel?
Remark:
Originally I had a different problem. Maybe because the worksheets in my application are heavier, when I try to run code similar to the above with PSeq, there is an exception that says:
Unhandled Exception: System.TypeInitializationException: The type initializer for '<StartupCode$Fractal13>.$Program' threw an exception. ---> System.AggregateException: One or more errors occurred. ---> System.Runtime.InteropServices.COMException: The message filter indicated that the application is busy. (Exception from HRESULT: 0x8001010A (RPC_E_SERVERCALL_RETRYLATER))
This does not happen when List.iter replaces PSeq.iter.
I was not able to replicate this exception in a simple enough context to be a proper SO question, but I would still be interested in any suggestions for dealing with it.
It looks like the Microsoft.Office.Interop.Excel code was never designed to be called from multiple threads at once. Here's a question someone asked in the MS Office forums about doing an update in multiple threads (in C#). I'll quote the relevant parts of that answer here:
Using multi-threading to search in multiple worksheets ends up with using the heart of Excel – the Excel.Application object, which means threads need to be queued to run one-at a time, depriving you of the desired performance improvement for the application.
[...]
All of this is because the Office object model isn't thread safe.
It looks like you're stuck with using a non-parallel design if you're calling anything in the Microsoft.Office.Interop namespace.
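One way to live with that constraint is to keep every Interop call on a single thread and parallelize only the work that does not touch Excel. The following is a minimal sketch of that split, not a drop-in fix: computeCells and writeCell are hypothetical names, and writeCell appends to an in-memory list as a stand-in for setting worksheet.Range(...).Value2, so the snippet runs without Excel installed.

```fsharp
// Hypothetical stand-in for an expensive computation that never touches Excel.
let computeCells (sheetId: int) : (string * string) list =
    [ for row in 1 .. 10 ->
        ("A" + string row, sprintf "Hello %d from sheet %d" row sheetId) ]

// Hypothetical stand-in for the Interop write. In the real program this is
// where worksheet.Range(address, address).Value2 would be set.
let written = System.Collections.Generic.List<int * string * string>()

let writeCell (sheetId: int) (address: string) (value: string) : unit =
    written.Add((sheetId, address, value))

// Step 1: the pure computation can safely run in parallel.
let results =
    [| 1 .. 3 |]
    |> Array.Parallel.map (fun sheetId -> sheetId, computeCells sheetId)

// Step 2: every write happens sequentially on the calling thread.
for (sheetId, cells) in results do
    for (address, value) in cells do
        writeCell sheetId address value

printfn "wrote %d cells" written.Count
```

This only pays off if computing the cell values is the expensive part. If the time goes into the Excel calls themselves, the writes remain the bottleneck.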
Edit: Aaron M. Eshbach had a great suggestion in the comments: do all the background work on multiple threads, and use a MailboxProcessor to do the actual updates to the spreadsheet. The MailboxProcessor's message queue will automatically serialize the update operations for you, with no extra work required on your part.
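A rough sketch of that idea follows. The WriterMessage type and the writer agent are illustrative names, and the agent appends to an in-memory list where the real program would make its Interop calls, so the example runs without Excel. It shows only that the agent handles one message at a time. A MailboxProcessor does not pin its body to one particular thread, so whether Excel needs anything beyond serialization is left open here.

```fsharp
// Messages the writer agent understands (names are illustrative).
type WriterMessage =
    | WriteCell of sheet: int * address: string * value: string
    | Flush of AsyncReplyChannel<int>

// Stand-in for the workbook. In the real program the agent body would be
// the only place that touches Interop objects.
let written = System.Collections.Generic.List<int * string * string>()

let writer =
    MailboxProcessor<WriterMessage>.Start(fun inbox ->
        let rec loop () =
            async {
                // Messages are dequeued and handled one at a time.
                let! msg = inbox.Receive()
                match msg with
                | WriteCell (sheet, address, value) ->
                    written.Add((sheet, address, value))
                | Flush reply ->
                    reply.Reply(written.Count)
                return! loop ()
            }
        loop ())

// Background work runs in parallel; each result is posted to the agent.
[| 1 .. 3 |]
|> Array.Parallel.iter (fun sheet ->
    for row in 1 .. 10 do
        let value = sprintf "Hello %d" row // pretend this is expensive
        writer.Post(WriteCell(sheet, "A" + string row, value)))

// Flush is queued behind every earlier Post, so the count is complete.
let total = writer.PostAndReply(fun channel -> Flush channel)
printfn "agent wrote %d cells" total
```

Post returns immediately, so the worker threads never wait on the writer. The Flush message is just a convenient way to block until everything queued before it has been processed.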