I am moving to a new Azure VM and am suddenly getting crashes and errors in places where I never had them before. (The new VM is a switch from Windows Server 2016 to 2019, but that may be a complete red herring.) I've tracked down one spot where I can reproduce the problem with the following code:
    # load packages
    library(foreach)
    library(randomForest)
    library(iterators)
    library(parallel)
    library(doParallel)

    numCores <- detectCores() - 1
    ntrees <- 8000
    treeSubs <- ntrees/numCores

    # initialize
    cl <- makeCluster(numCores)
    registerDoParallel(cl)

    # dummy datasets
    x <- as.data.frame(matrix(runif(100000), 20000))
    y <- gl(2, 10000)

    parRf <- foreach(ntree = rep(treeSubs, numCores),
                     .combine = randomForest::combine,
                     .packages = 'randomForest',
                     .multicombine = TRUE) %dopar%
      randomForest(x = x, y = y, importance = TRUE, mtry = 2,
                   ntree = ntree, replace = TRUE)

    z <- matrix(runif(1000), 200)
    pred <- predict(parRf, z, type = "prob")
Notice it is the predict step that causes the failure, but when I make the randomForest call not in parallel, the predict step works fine. Or if I make the data sets smaller, it also works. In RStudio I get the grey "bomb" and in RGui it just disappears.
Here are some details of the crash report from the Windows Event Log:
    Faulting application name: rsession.exe, version: 1.1.463.0, time stamp: 0x5bd11fb5
    Faulting module name: randomForest.dll, version: 0.0.0.0, time stamp: 0x609f54bd
    Exception code: 0xc0000005
    Fault offset: 0x0000000000001b42
    Faulting process id: 0x1e48
    Faulting application start time: 0x01d752f21b6d7a79
    Faulting application path: C:\Program Files\RStudio\bin\x64\rsession.exe
I wonder if this is possibly related to the question "R Crashes when training using caret and method = gamLoess", but I don't see any solution there.
Here's session info:
    > sessionInfo()
    R version 4.0.5 (2021-03-31)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows Server >= 2012 x64 (build 9200)

    Matrix products: default

    locale:
    LC_COLLATE=English_United States.1252
    LC_CTYPE=English_United States.1252
    LC_MONETARY=English_United States.1252
    LC_NUMERIC=C
    LC_TIME=English_United States.1252

    attached base packages:
    parallel stats graphics grDevices utils datasets methods base

    other attached packages:
    doParallel_1.0.16 iterators_1.0.13 randomForest_4.6-14 foreach_1.5.1

    loaded via a namespace (and not attached):
    compiler_4.0.5 tools_4.0.5 codetools_0.2-18
Thanks in advance for any tips.
The code works in parallel for me. Try running it inside a project: create a new project in RStudio, run the code from within it, and check whether it still crashes. (I have seen this kind of error with other memory-intensive code when it was run outside a project.)
    head(pred)
             1        2
    1 0.553750 0.446250
    2 0.533750 0.466250
    3 0.367750 0.632250
    4 0.578625 0.421375
    5 0.487125 0.512875
    6 0.423375 0.576625
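One more thing that might be worth ruling out, as a guess rather than a confirmed cause: in the question's code, `treeSubs <- ntrees/numCores` is only a whole number when `numCores` happens to divide 8000, and `numCores` will differ between machines. I have not verified how `randomForest` or `combine` treats a fractional `ntree`, so the sketch below only shows one way to hand every worker a whole number of trees. The helper name `split_trees` and the worker count of 7 are made up for illustration.

```r
# Hypothetical helper (not part of any package): split `total` trees into
# `workers` whole-number chunks whose sizes differ by at most one.
split_trees <- function(total, workers) {
  base  <- total %/% workers   # trees every worker gets
  extra <- total %% workers    # leftover trees, one each for the first workers
  rep(base, workers) + c(rep(1L, extra), rep(0L, workers - extra))
}

# Example with an assumed worker count of 7, which does not divide 8000.
chunks <- split_trees(8000L, 7L)

# Every chunk is a whole number and the chunks add up to the requested total.
stopifnot(all(chunks == round(chunks)), sum(chunks) == 8000L)
```

If this turns out to matter, the `foreach` call in the question could iterate over `ntree = split_trees(ntrees, numCores)` instead of `rep(treeSubs, numCores)`.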