I am moving to a new Azure VM and have suddenly started getting crashes and errors in places where I never saw them before. (The new VM also switches from Windows Server 2016 to 2019, but that may be a complete red herring.) I've tracked down one spot where I can reproduce the problem with the following code:
# load packages
library(foreach)
library(randomForest)
library(iterators)
library(parallel)
library(doParallel)
numCores <- detectCores() - 1
ntrees <- 8000
treeSubs <- ntrees/numCores
# initialize
cl <- makeCluster(numCores)
registerDoParallel(cl)
# dummy datasets
x <- as.data.frame(matrix(runif(100000), 20000))
y <- gl(2, 10000)
parRf <- foreach(ntree = rep(treeSubs, numCores),
                 .combine = randomForest::combine,
                 .packages = 'randomForest',
                 .multicombine = TRUE) %dopar%
  randomForest(x = x, y = y,
               importance = TRUE, mtry = 2, ntree = ntree,
               replace = TRUE)
z <- matrix(runif(1000), 200)
pred <- predict(parRf, z, type = "prob")
Note that it is the predict step that fails. When I run the randomForest call without parallelism, the predict step works fine, and it also works if I make the data sets smaller. In RStudio I get the grey "bomb" dialog; in RGui the session just disappears.
Here are some details of the crash report from the Windows Event Log:
Faulting application name: rsession.exe, version: 1.1.463.0, time stamp: 0x5bd11fb5
Faulting module name: randomForest.dll, version: 0.0.0.0, time stamp: 0x609f54bd
Exception code: 0xc0000005
Fault offset: 0x0000000000001b42
Faulting process id: 0x1e48
Faulting application start time: 0x01d752f21b6d7a79
Faulting application path: C:\Program Files\RStudio\bin\x64\rsession.exe
I wonder whether this is related to the question "R Crashes when training using caret and method = gamLoess", but I don't see any solution there.
Here's session info:
> sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server >= 2012 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] doParallel_1.0.16 iterators_1.0.13 randomForest_4.6-14 foreach_1.5.1
loaded via a namespace (and not attached):
[1] compiler_4.0.5 tools_4.0.5 codetools_0.2-18
>
Thanks in advance for any tips.
The code works in parallel for me. Try running it inside a project (create a new project and run the code within it) and check whether it still crashes. (I have received this error with other memory-intensive code when it was run outside a project.)
head(pred)
         1        2
1 0.553750 0.446250
2 0.533750 0.466250
3 0.367750 0.632250
4 0.578625 0.421375
5 0.487125 0.512875
6 0.423375 0.576625
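One more thing that may be worth ruling out, although I have not confirmed it is related to the crash: in the question's code, `treeSubs <- ntrees/numCores` is only a whole number when `numCores` divides 8000 evenly. With 7 workers, for example, each worker is asked for about 1142.86 trees. I don't know for certain how `randomForest` handles a fractional `ntree`, so this is just a sketch of how to hand every worker a whole number of trees that still adds up to the total. The helper name `split_trees` is made up for illustration and is not part of any package.

```r
# Hypothetical helper (not part of randomForest or foreach): split a total
# tree count into whole-number chunks, one per worker, that sum exactly
# to the requested total.
split_trees <- function(ntrees, n_workers) {
  stopifnot(ntrees >= 1, n_workers >= 1)
  base  <- ntrees %/% n_workers   # trees that every worker gets
  extra <- ntrees %% n_workers    # leftover trees, handed out one per worker
  as.integer(base + (seq_len(n_workers) <= extra))
}

chunks <- split_trees(8000, 7)
print(chunks)       # per-worker tree counts, all whole numbers
print(sum(chunks))  # 8000
```

In the question's code this would mean passing `ntree = split_trees(ntrees, numCores)` to `foreach` instead of `rep(treeSubs, numCores)`. If the crash persists with whole-number tree counts, this can be crossed off the list.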