RStudio crashes when reading mzML file, R terminal doesn't

I've got a bunch of data in the mass-spec mzML file format. With the latest version of R (v3.3.2) and the latest daily build of RStudio (v1.1.47), reading in an mzML file crashes the R session in RStudio, but not R run in the terminal.

library(mzR)
library(msdata)

mzxml <- system.file("threonine/threonine_i2_e35_pH_tree.mzXML",
                     package = "msdata")
aa <- openMSfile(mzxml) # this works

mzml <- system.file("microtofq/MM8.mzML", package = "msdata")

bb <- openMSfile(mzml) # this crashes R, but only in RStudio

sessionInfo()

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] msdata_0.14.0 mzR_2.8.1     Rcpp_0.12.9  

loaded via a namespace (and not attached):
[1] ProtGenerics_1.6.0  parallel_3.3.2      Biobase_2.34.0     
[4] codetools_0.2-15    BiocGenerics_0.20.0

Update

Running with lldb attached (by the way, make sure to run it as root!) gives the following stack trace:

error: mzR.so 0x010c47ab: DW_TAG_member '_M_local_buf' refers to type 0x0110cd75 which extends beyond the bounds of 0x010c47a3
error: mzR.so 0x00efe9cd: DW_TAG_member '_M_local_buf' refers to type 0x00f369c9 which extends beyond the bounds of 0x00efe9c5
error: mzR.so 0x000000cc: DW_TAG_member '_M_local_buf' refers to type 0x0000a52f which extends beyond the bounds of 0x000000c4
* thread #1: tid = 3799, 0x0000000000da88cd rsession`boost::filesystem::path::filename() const + 189, name = 'rsession', stop reason = signal SIGSEGV: invalid address (fault address: 0x83e6de7)
  * frame #0: 0x0000000000da88cd rsession`boost::filesystem::path::filename() const + 189
    frame #1: 0x00007f2b01ed4aaf mzR.so`pwiz::msdata::IO::HandlerMSData::startElement(this=0x00007ffcdab7b9f0, name=<unavailable>, attributes=<unavailable>, position=<unavailable>) + 511 at IO.cpp:2666
    frame #2: 0x00007f2b01fd1593 mzR.so`pwiz::minimxml::SAXParser::(anonymous namespace)::HandlerWrangler::startElement(this=0x00007ffcdab7b5b0, name="mzML", attributes=0x00007ffcdab7b558, position=45) const + 147 at SAXParser.cpp:211
    frame #3: 0x00007f2b01fd2cfa mzR.so`pwiz::minimxml::SAXParser::parse(is=0x00000000066528c0, handler=0x00007ffcdab7b9f0) + 2810 at SAXParser.cpp:531
    frame #4: 0x00007f2b01ec1927 mzR.so`pwiz::msdata::IO::read(is=0x00000000066528c0, msd=0x0000000004acc510, spectrumListFlag=IgnoreSpectrumList) + 3671 at IO.cpp:2766
    frame #5: 0x00007f2b01e5747b mzR.so`pwiz::msdata::Serializer_mzML::Impl::read(this=0x000000000501ff30, is=shared_ptr<std::basic_istream<char, std::char_traits<char> > > @ 0x00007ffcdab7c8e0, msd=0x0000000004acc510) const + 107 at Serializer_mzML.cpp:223
    frame #6: 0x00007f2b01e57b2a mzR.so`pwiz::msdata::Serializer_mzML::read(this=<unavailable>, is=<unavailable>, msd=<unavailable>) const + 58 at Serializer_mzML.cpp:250
    frame #7: 0x00007f2b01e43964 mzR.so`pwiz::msdata::Reader_mzML::read(this=<unavailable>, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", head=<unavailable>, result=0x0000000004acc510, runIndex=<unavailable>, config=<unavailable>) const + 948 at DefaultReaderList.cpp:148
    frame #8: 0x00007f2b01e5d855 mzR.so`pwiz::msdata::ReaderList::read(this=0x00000000058e1170, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", head="<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\r\n<mzML xmlns=\"http://psi.hupo.org/ms/mzml\" xsi:schemaLocation=\"http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.0.xsd\" version=\"1.1.0\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\r\n<cvList count=\"2\">\r\n<cv id=\"MS\" fullName=\"Proteomics Standards Initiative Mass Spectrometry Vocabularies\" version=\"2.26.0\" URI=\"http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo\"/>\r\n<cv id=\"UO\" fullName=", result=0x0000000004acc510, sampleIndex=0, config=0x00007ffcdab7cb10) const + 181 at Reader.cpp:101
    frame #9: 0x00007f2b01eea60f mzR.so`pwiz::msdata::(anonymous namespace)::(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", msd=0x0000000004acc510, reader=0x00000000058e1170, head="<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\r\n<mzML xmlns=\"http://psi.hupo.org/ms/mzml\" xsi:schemaLocation=\"http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.0.xsd\" version=\"1.1.0\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\r\n<cvList count=\"2\">\r\n<cv id=\"MS\" fullName=\"Proteomics Standards Initiative Mass Spectrometry Vocabularies\" version=\"2.26.0\" URI=\"http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo\"/>\r\n<cv id=\"UO\" fullName=")(const string &const, pwiz::msdata::MSData &const, const pwiz::msdata::Reader &const, const string &const) + 127 at MSDataFile.cpp:61
    frame #10: 0x00007f2b01eec0ba mzR.so`pwiz::msdata::MSDataFile::MSDataFile(this=0x0000000004acc510, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", reader=<unavailable>, calculateSourceFileChecksum=<unavailable>) + 218 at MSDataFile.cpp:91
    frame #11: 0x00007f2b01ee434a mzR.so`pwiz::msdata::RAMPAdapter::RAMPAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [inlined] pwiz::msdata::RAMPAdapter::Impl::Impl(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", this=0x0000000004acc510) + 5 at RAMPAdapter.cpp:49
    frame #12: 0x00007f2b01ee4345 mzR.so`pwiz::msdata::RAMPAdapter::RAMPAdapter(this=0x00000000050b8110, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML") + 37 at RAMPAdapter.cpp:296
    frame #13: 0x00007f2b01d3df7e mzR.so`rampOpenFile(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML") + 542 at ramp.cpp:284
    frame #14: 0x00007f2b01d3cd15 mzR.so`cRamp::cRamp(this=0x00000000050e4ce0, fileName="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", declaredScansOnly=<unavailable>) + 149 at cramp.cpp:75
    frame #15: 0x00007f2b01d45af8 mzR.so`RcppRamp::open(this=0x0000000004fd0ec0, fileName="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", declaredScansOnly=<unavailable>) + 72 at RcppRamp.cpp:23
    frame #16: 0x00007f2b01d5a955 mzR.so`Rcpp::CppMethod2<RcppRamp, void, char const*, bool>::operator(this=0x00000000050c9990, object=0x0000000004fd0ec0, args=<unavailable>)(RcppRamp*, SEXPREC**) + 245 at Module_generated_CppMethod.h:215
    frame #17: 0x00007f2b01d570f0 mzR.so`Rcpp::class_<RcppRamp>::invoke_void(this=<unavailable>, method_xp=<unavailable>, object=0x0000000005cbe020, args=0x00007ffcdab7d2f0, nargs=<unavailable>) + 176 at class.h:212
    frame #18: 0x00007f2b027e3f41 Rcpp.so`CppMethod__invoke_void(args=<unavailable>) + 449 at Module.cpp:200
    frame #19: 0x00007f2b26a6e9b1 libR.so`do_External(call=0x000000000570f208, op=0x000000000333ac20, args=0x0000000005dd50b8, env=0x0000000005dd50f0) + 337 at dotcode.c:548
    frame #20: 0x00007f2b26aa86df libR.so`Rf_eval(e=0x000000000570f208, rho=0x0000000005dd50f0) + 1871 at eval.c:713
    frame #21: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x000000000570f198, op=0x00000000033254d8, args=0x000000000570f358, rho=0x0000000005dd50f0) + 344 at eval.c:1807
    frame #22: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x0000000005dd50f0) + 1345 at eval.c:685
    frame #23: 0x00007f2b26aa9d8d libR.so`Rf_applyClosure(call=<unavailable>, op=<unavailable>, arglist=<unavailable>, rho=<unavailable>, suppliedvars=0x000000000330a1c8) + 1309 at eval.c:1135
    frame #24: 0x00007f2b26aa82ad libR.so`Rf_eval(e=0x00000000062d0ad8, rho=0x000000000595b3e0) + 797 at eval.c:732
    frame #25: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x00000000062d0d08, op=0x00000000033254d8, args=0x00000000062d0b10, rho=0x000000000595b3e0) + 344 at eval.c:1807
    frame #26: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
    frame #27: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
    frame #28: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x00000000062d04a0, op=0x00000000033254d8, args=0x00000000062d0e90, rho=0x000000000595b3e0) + 344 at eval.c:1807
    frame #29: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
    frame #30: 0x00007f2b26aa9d8d libR.so`Rf_applyClosure(call=<unavailable>, op=<unavailable>, arglist=<unavailable>, rho=<unavailable>, suppliedvars=0x000000000330a1c8) + 1309 at eval.c:1135
    frame #31: 0x00007f2b26aa82ad libR.so`Rf_eval(e=0x000000000595a9f0, rho=0x000000000334ff58) + 797 at eval.c:732
    frame #32: 0x00007f2b26aabf66 libR.so`do_set(call=0x000000000595b680, op=0x000000000330c2d8, args=<unavailable>, rho=0x000000000334ff58) + 166 at eval.c:2197
    frame #33: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000334ff58) + 1345 at eval.c:685
    frame #34: 0x00007f2b26acf932 libR.so`Rf_ReplIteration(rho=0x000000000334ff58, savestack=<unavailable>, browselevel=<unavailable>, state=0x00007ffcdab7f020) + 546 at main.c:258
    frame #35: 0x00007f2b26acfca1 libR.so`R_ReplConsole(rho=0x000000000334ff58, savestack=0, browselevel=0) + 129 at main.c:308
    frame #36: 0x00007f2b26acfd58 libR.so`run_Rmainloop + 72 at main.c:1059
    frame #37: 0x0000000000d85dc2 rsession`rstudio::r::session::runEmbeddedR(rstudio::core::FilePath const&, rstudio::core::FilePath const&, bool, bool, SA_TYPE, rstudio::r::session::Callbacks const&, rstudio::r::session::InternalCallbacks*) + 434
    frame #38: 0x0000000000d6791d rsession`rstudio::r::session::run(rstudio::r::session::ROptions const&, rstudio::r::session::RCallbacks const&) + 9581
    frame #39: 0x00000000006a7299 rsession`main + 10809
    frame #40: 0x00007f2b2530e830 libc.so.6`__libc_start_main(main=(rsession`main), argc=11, argv=0x00007ffcdab81fd8, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffcdab81fc8) + 240 at libc-start.c:291
    frame #41: 0x00000000006b4fe1 rsession`_start + 41

This definitely looks like a memory problem on the ProteoWizard side that somehow just gets ignored in the base R process.

r
rstudio
asked on Stack Overflow Jan 19, 2017 by rmflight • edited Jan 20, 2017 by rmflight

1 Answer

When running this with a version of R compiled with sanitizers, I see:

> source("msdata.R", echo = TRUE)

> library(mzR)
Loading required package: Rcpp

> library(msdata)

> mzxml <- system.file("threonine/threonine_i2_e35_pH_tree.mzXML",
+                      package = "msdata")

> aa <- openMSfile(mzxml) # this works
ramp.cpp:1197:34: runtime error: index -1 out of bounds for type 'char [513]'
SUMMARY: AddressSanitizer: undefined-behavior ramp.cpp:1197:34 in

> mzml <- system.file("microtofq/MM8.mzML", package = "msdata")

> bb <- openMSfile(mzml) # this crashes R, but only in RStudio

In particular, this bit:

> aa <- openMSfile(mzxml) # this works
ramp.cpp:1197:34: runtime error: index -1 out of bounds for type 'char [513]'
SUMMARY: AddressSanitizer: undefined-behavior ramp.cpp:1197:34 in

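implies there may be something going wrong inside the openMSfile() function: ramp.cpp indexes a 513-element char buffer at position -1, so it appears to be attempting to read data at an invalid offset. Note that this is reported for the mzXML read, the one that appears to work. I would file this issue with the mzR maintainers.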
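If it helps when filing that report, here is a minimal sketch for collecting the basics in one place. The helper name bug_report_info is made up for illustration, and the check on the RSTUDIO environment variable assumes RStudio sets it to "1" in its sessions; everything else is base R / utils.

```r
# Hypothetical helper: gather details worth pasting into a bug report.
bug_report_info <- function(pkg) {
  desc <- utils::packageDescription(pkg)
  list(
    package     = pkg,
    version     = as.character(utils::packageVersion(pkg)),
    maintainer  = utils::maintainer(pkg),
    # Not every package declares a BugReports field in its DESCRIPTION.
    bug_reports = if (is.null(desc$BugReports)) NA_character_ else desc$BugReports,
    # Assumption: RStudio exports RSTUDIO=1 to the R session it hosts.
    in_rstudio  = identical(Sys.getenv("RSTUDIO"), "1")
  )
}

# Demonstrated on a package that is always installed; use "mzR" in practice.
str(bug_report_info("utils"))
```

Pasting this alongside sessionInfo() makes it clear whether the report comes from RStudio or from plain R.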


> sessionInfo()
R Under development (unstable) (2017-01-17 r72002)
Platform: x86_64-apple-darwin16.3.0 (64-bit)
Running under: macOS Sierra 10.12.2

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] msdata_0.15.0        mzR_2.9.3            Rcpp_0.12.9
[4] testthat_1.0.2       rmarkdown_1.3        knitr_1.15.1
[7] roxygen2_5.0.1       devtools_1.12.0.9000

loaded via a namespace (and not attached):
 [1] magrittr_1.5        BiocGenerics_0.21.3 pkgload_0.0.0.9000
 [4] R6_2.2.0            stringr_1.1.0       tools_3.4.0
 [7] pkgbuild_0.0.0.9000 parallel_3.4.0      Biobase_2.35.0
[10] withr_1.0.2         htmltools_0.3.5     ProtGenerics_1.7.0
[13] rprojroot_1.2       digest_0.6.11       crayon_1.3.2
[16] codetools_0.2-15    memoise_1.0.0       evaluate_0.10
[19] stringi_1.1.2       compiler_3.4.0      backports_1.0.5

EDIT: Here's the configure call used to build R; the same compiler + flags are used for all built packages as well. (Extracted from R.home("etc/Makeconf"))

# R was configured using the following call
# (not including env. vars and site configuration)
# configure  '--with-blas=-L/usr/local/opt/openblas/lib -lopenblas' '--with-lapack=-L/usr/local/opt/lapack/lib -llapack' '--with-cairo' '--disable-R-framework' '--enable-R-shlib' '--with-readline' '--enable-R-profiling' '--enable-memory-profiling' '--with-valgrind-instrumentation=2' '--without-internal-tzcode' '--prefix=/Users/kevin/r/r-devel-sanitizers' 'PKG_CONFIG_PATH=/opt/X11/lib/pkgconfig'

CC = clang-3.9 -std=gnu99 -fsanitize=address,undefined -fno-omit-frame-pointer -fno-sanitize=float-divide-by-zero
CXX = clang++-3.9 -fsanitize=address,undefined -fno-omit-frame-pointer -fno-sanitize=float-divide-by-zero 

EDIT V2: Given the lldb stack trace, it looks like the culprit may indeed be a clash between Boost versions (the version bundled with RStudio vs. the version used by mzR). In frame #0, boost::filesystem::path::filename() resolves to the copy inside the rsession executable, even though the caller in frame #1 is mzR.so. In other words, mzR is inadvertently calling the Boost routines in rsession when it likely intends to call its own bundled version.
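If the clash really is with symbols inside rsession, one possible stopgap (not a fix) is to do the read in a plain R child process and pass back only ordinary R objects. The sketch below is an assumption-laden illustration: run_in_plain_r is a made-up helper, it relies on Rscript being present under R.home("bin"), and it can only return things that survive saveRDS() (a data frame from header() should, the file handle from openMSfile() will not).

```r
# Hypothetical helper: evaluate lines of R code in a separate Rscript
# process (outside rsession) and return the value of the last expression.
run_in_plain_r <- function(code) {
  script <- tempfile(fileext = ".R")
  out <- tempfile(fileext = ".rds")
  on.exit(unlink(c(script, out)), add = TRUE)

  writeLines(c(
    # Reuse the parent's library paths so the same packages are found.
    sprintf(".libPaths(%s)", paste(deparse(.libPaths()), collapse = "")),
    "result <- local({",
    code,
    "})",
    sprintf("saveRDS(result, %s)", deparse(out))
  ), script)

  rscript <- file.path(R.home("bin"), "Rscript")
  status <- system2(rscript, shQuote(script))
  if (status != 0 || !file.exists(out)) {
    stop("child R process failed with exit status ", status)
  }
  readRDS(out)
}

# Self-check that needs no extra packages.
print(run_in_plain_r("1 + 1"))

# Intended use with the file from the question (requires mzR and msdata):
# hdr <- run_in_plain_r(c(
#   "library(mzR)",
#   "ms <- openMSfile(system.file('microtofq/MM8.mzML', package = 'msdata'))",
#   "header(ms)"
# ))
```

If the child process crashes as well, the helper stops with the exit status instead of taking the RStudio session down with it.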

answered on Stack Overflow Jan 19, 2017 by Kevin Ushey • edited Jan 21, 2017 by Kevin Ushey

User contributions licensed under CC BY-SA 3.0