Why won't parallelized code write to an Excel spreadsheet?


Writing many worksheets in an Excel spreadsheet can take a while. Parallelizing it would be helpful.

This code works well: it makes an Excel spreadsheet pop up on the screen with four worksheets named Sheet1, 1, 2, and 3.

open Microsoft.Office.Interop.Excel
open FSharp.Collections.ParallelSeq

let backtestWorksheets = [1..3]

let app = new ApplicationClass(Visible = true) 

let workbook = app.Workbooks.Add(XlWBATemplate.xlWBATWorksheet)

let writeInfoSheet (worksheet: Worksheet) : unit =

    let foo i =
        let si = string i
        worksheet.Range("A" + si, "A" + si).Value2 <- "Hello " + si
    List.iter foo [1..10]

let wfm = [1, writeInfoSheet; 2, writeInfoSheet; 3, writeInfoSheet]
          |> Map.ofList

let adder (workbook : Workbook)
          (i        : int)
                    : unit =

    let sheet = workbook.Worksheets.Add() :?> Worksheet
    sheet.Name <- string i
    wfm.[i] sheet

List.iter (adder workbook) backtestWorksheets
//PSeq.iter (adder workbook) backtestWorksheets

[<EntryPoint>]
let main argv = 
    printfn "%A" argv
    0 // return an integer exit code

However, if I replace the line starting with List.iter with the commented-out PSeq.iter line just below it, a spreadsheet with the same four worksheets still pops up, but all the worksheets are blank.

So my question is: Why can't code parallelized with PSeq write to Excel?

Remark:

Originally I had a different problem. Perhaps because the worksheets in my application are heavier, running code similar to the above with PSeq throws an exception that says:

Unhandled Exception: System.TypeInitializationException: The type initializer for '<StartupCode$Fractal13>.$Program' threw an exception. ---> System.AggregateException: One or more errors occurred. ---> System.Runtime.InteropServices.COMException: The message filter indicated that the application is busy. (Exception from HRESULT: 0x8001010A (RPC_E_SERVERCALL_RETRYLATER))

This does not happen when List.iter is used in place of PSeq.iter.

I was not able to replicate this exception in a simple enough context to be a proper SO question, but I would still be interested in any suggestions for dealing with it.

excel
parallel-processing
f#
excel-interop
asked on Stack Overflow May 19, 2018 by Soldalma • edited May 19, 2018 by Soldalma

1 Answer


It looks like the Microsoft.Office.Interop.Excel code was never designed to be called from multiple threads at once. Someone asked a question in the MS Office forums about doing an update from multiple threads (in C#). I'll quote the relevant parts of the answer to it here:

Using multi-threading to search in multiple worksheets ends up with using the heart of Excel – the Excel.Application object, which means threads need to be queued to run one-at a time, depriving you of the desired performance improvement for the application.

[...]

All of this is because the Office object model isn't thread safe.

It looks like you're stuck with using a non-parallel design if you're calling anything in the Microsoft.Office.Interop namespace.

Edit: Aaron M. Eshbach had a great suggestion in the comments: do all the background work on multiple threads, and use a MailboxProcessor to do the actual updates to the spreadsheet. The MailboxProcessor's message queue will automatically serialize the update operations for you, with no extra work required on your part.
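A minimal sketch of that suggestion follows, under some assumptions. The names CellWrite, WriterMsg, startWriter and writeCell are hypothetical and only for illustration. To keep the sketch runnable without Excel, writeCell is a stand-in that appends to an in-memory list; in the question's code it would be the place for the worksheet.Range(...).Value2 assignment. Note that the agent guarantees the writes run one at a time, but it does not pin them to one particular thread, so whether that alone is enough for Excel's COM threading is something to verify rather than assume.

```fsharp
// Hypothetical message types: a cell write described as plain data,
// plus a Flush message so callers can wait until the queue is drained.
type CellWrite = { Sheet: string; Row: int; Text: string }

type WriterMsg =
    | Write of CellWrite
    | Flush of AsyncReplyChannel<unit>

// The agent processes one message at a time, in the order posted,
// so calls to writeCell never overlap.
let startWriter (writeCell: CellWrite -> unit) =
    MailboxProcessor<WriterMsg>.Start(fun inbox ->
        let rec loop () = async {
            let! msg = inbox.Receive()
            match msg with
            | Write w -> writeCell w
            | Flush reply -> reply.Reply()
            return! loop () }
        loop ())

// Stand-in for the Excel call (assumption): record each write in a list.
let written = System.Collections.Generic.List<CellWrite>()
let writer = startWriter (fun w -> written.Add w)

// Background work runs in parallel; each task only posts messages.
let produceSheet (sheet: int) = async {
    for row in 1..10 do
        writer.Post(Write { Sheet = string sheet; Row = row; Text = "Hello " + string row }) }

[1..3]
|> List.map produceSheet
|> Async.Parallel
|> Async.RunSynchronously
|> ignore

// Wait until every write posted so far has been processed.
writer.PostAndReply(fun reply -> Flush reply)
printfn "%d cells written" written.Count
```

The parallel tasks never touch the shared target directly; they only describe what should be written. Any speedup therefore comes from the background computation, not from the writes themselves, which is consistent with the quoted point that calls into Excel end up queued anyway.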

answered on Stack Overflow May 19, 2018 by rmunn • edited Jun 20, 2020 by Community

User contributions licensed under CC BY-SA 3.0