I have a bunch of data in the mass-spec mzML file format. With the latest version of R (v3.3.2) and the latest daily build of RStudio (v1.1.47), reading an mzML file crashes R inside RStudio, but not R in the terminal.
library(mzR)
library(msdata)
mzxml <- system.file("threonine/threonine_i2_e35_pH_tree.mzXML",
package = "msdata")
aa <- openMSfile(mzxml) # this works
mzml <- system.file("microtofq/MM8.mzML", package = "msdata")
bb <- openMSfile(mzml) # this crashes R, but only in RStudio
sessionInfo()
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] msdata_0.14.0 mzR_2.8.1 Rcpp_0.12.9
loaded via a namespace (and not attached):
[1] ProtGenerics_1.6.0 parallel_3.3.2 Biobase_2.34.0
[4] codetools_0.2-15 BiocGenerics_0.20.0
Update
Running with lldb attached (by the way, make sure to run that as root!) gives the following stack trace:
error: mzR.so 0x010c47ab: DW_TAG_member '_M_local_buf' refers to type 0x0110cd75 which extends beyond the bounds of 0x010c47a3
error: mzR.so 0x00efe9cd: DW_TAG_member '_M_local_buf' refers to type 0x00f369c9 which extends beyond the bounds of 0x00efe9c5
error: mzR.so 0x000000cc: DW_TAG_member '_M_local_buf' refers to type 0x0000a52f which extends beyond the bounds of 0x000000c4
* thread #1: tid = 3799, 0x0000000000da88cd rsession`boost::filesystem::path::filename() const + 189, name = 'rsession', stop reason = signal SIGSEGV: invalid address (fault address: 0x83e6de7)
* frame #0: 0x0000000000da88cd rsession`boost::filesystem::path::filename() const + 189
frame #1: 0x00007f2b01ed4aaf mzR.so`pwiz::msdata::IO::HandlerMSData::startElement(this=0x00007ffcdab7b9f0, name=<unavailable>, attributes=<unavailable>, position=<unavailable>) + 511 at IO.cpp:2666
frame #2: 0x00007f2b01fd1593 mzR.so`pwiz::minimxml::SAXParser::(anonymous namespace)::HandlerWrangler::startElement(this=0x00007ffcdab7b5b0, name="mzML", attributes=0x00007ffcdab7b558, position=45) const + 147 at SAXParser.cpp:211
frame #3: 0x00007f2b01fd2cfa mzR.so`pwiz::minimxml::SAXParser::parse(is=0x00000000066528c0, handler=0x00007ffcdab7b9f0) + 2810 at SAXParser.cpp:531
frame #4: 0x00007f2b01ec1927 mzR.so`pwiz::msdata::IO::read(is=0x00000000066528c0, msd=0x0000000004acc510, spectrumListFlag=IgnoreSpectrumList) + 3671 at IO.cpp:2766
frame #5: 0x00007f2b01e5747b mzR.so`pwiz::msdata::Serializer_mzML::Impl::read(this=0x000000000501ff30, is=shared_ptr<std::basic_istream<char, std::char_traits<char> > > @ 0x00007ffcdab7c8e0, msd=0x0000000004acc510) const + 107 at Serializer_mzML.cpp:223
frame #6: 0x00007f2b01e57b2a mzR.so`pwiz::msdata::Serializer_mzML::read(this=<unavailable>, is=<unavailable>, msd=<unavailable>) const + 58 at Serializer_mzML.cpp:250
frame #7: 0x00007f2b01e43964 mzR.so`pwiz::msdata::Reader_mzML::read(this=<unavailable>, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", head=<unavailable>, result=0x0000000004acc510, runIndex=<unavailable>, config=<unavailable>) const + 948 at DefaultReaderList.cpp:148
frame #8: 0x00007f2b01e5d855 mzR.so`pwiz::msdata::ReaderList::read(this=0x00000000058e1170, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", head="<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\r\n<mzML xmlns=\"http://psi.hupo.org/ms/mzml\" xsi:schemaLocation=\"http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.0.xsd\" version=\"1.1.0\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\r\n<cvList count=\"2\">\r\n<cv id=\"MS\" fullName=\"Proteomics Standards Initiative Mass Spectrometry Vocabularies\" version=\"2.26.0\" URI=\"http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo\"/>\r\n<cv id=\"UO\" fullName=", result=0x0000000004acc510, sampleIndex=0, config=0x00007ffcdab7cb10) const + 181 at Reader.cpp:101
frame #9: 0x00007f2b01eea60f mzR.so`pwiz::msdata::(anonymous namespace)::(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", msd=0x0000000004acc510, reader=0x00000000058e1170, head="<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\r\n<mzML xmlns=\"http://psi.hupo.org/ms/mzml\" xsi:schemaLocation=\"http://psi.hupo.org/ms/mzml http://psidev.info/files/ms/mzML/xsd/mzML1.1.0.xsd\" version=\"1.1.0\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\r\n<cvList count=\"2\">\r\n<cv id=\"MS\" fullName=\"Proteomics Standards Initiative Mass Spectrometry Vocabularies\" version=\"2.26.0\" URI=\"http://psidev.cvs.sourceforge.net/*checkout*/psidev/psi/psi-ms/mzML/controlledVocabulary/psi-ms.obo\"/>\r\n<cv id=\"UO\" fullName=")(const string &const, pwiz::msdata::MSData &const, const pwiz::msdata::Reader &const, const string &const) + 127 at MSDataFile.cpp:61
frame #10: 0x00007f2b01eec0ba mzR.so`pwiz::msdata::MSDataFile::MSDataFile(this=0x0000000004acc510, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", reader=<unavailable>, calculateSourceFileChecksum=<unavailable>) + 218 at MSDataFile.cpp:91
frame #11: 0x00007f2b01ee434a mzR.so`pwiz::msdata::RAMPAdapter::RAMPAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [inlined] pwiz::msdata::RAMPAdapter::Impl::Impl(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", this=0x0000000004acc510) + 5 at RAMPAdapter.cpp:49
frame #12: 0x00007f2b01ee4345 mzR.so`pwiz::msdata::RAMPAdapter::RAMPAdapter(this=0x00000000050b8110, filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML") + 37 at RAMPAdapter.cpp:296
frame #13: 0x00007f2b01d3df7e mzR.so`rampOpenFile(filename="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML") + 542 at ramp.cpp:284
frame #14: 0x00007f2b01d3cd15 mzR.so`cRamp::cRamp(this=0x00000000050e4ce0, fileName="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", declaredScansOnly=<unavailable>) + 149 at cramp.cpp:75
frame #15: 0x00007f2b01d45af8 mzR.so`RcppRamp::open(this=0x0000000004fd0ec0, fileName="/software/R_libs/R332_bioc34/msdata/microtofq/MM8.mzML", declaredScansOnly=<unavailable>) + 72 at RcppRamp.cpp:23
frame #16: 0x00007f2b01d5a955 mzR.so`Rcpp::CppMethod2<RcppRamp, void, char const*, bool>::operator(this=0x00000000050c9990, object=0x0000000004fd0ec0, args=<unavailable>)(RcppRamp*, SEXPREC**) + 245 at Module_generated_CppMethod.h:215
frame #17: 0x00007f2b01d570f0 mzR.so`Rcpp::class_<RcppRamp>::invoke_void(this=<unavailable>, method_xp=<unavailable>, object=0x0000000005cbe020, args=0x00007ffcdab7d2f0, nargs=<unavailable>) + 176 at class.h:212
frame #18: 0x00007f2b027e3f41 Rcpp.so`CppMethod__invoke_void(args=<unavailable>) + 449 at Module.cpp:200
frame #19: 0x00007f2b26a6e9b1 libR.so`do_External(call=0x000000000570f208, op=0x000000000333ac20, args=0x0000000005dd50b8, env=0x0000000005dd50f0) + 337 at dotcode.c:548
frame #20: 0x00007f2b26aa86df libR.so`Rf_eval(e=0x000000000570f208, rho=0x0000000005dd50f0) + 1871 at eval.c:713
frame #21: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x000000000570f198, op=0x00000000033254d8, args=0x000000000570f358, rho=0x0000000005dd50f0) + 344 at eval.c:1807
frame #22: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x0000000005dd50f0) + 1345 at eval.c:685
frame #23: 0x00007f2b26aa9d8d libR.so`Rf_applyClosure(call=<unavailable>, op=<unavailable>, arglist=<unavailable>, rho=<unavailable>, suppliedvars=0x000000000330a1c8) + 1309 at eval.c:1135
frame #24: 0x00007f2b26aa82ad libR.so`Rf_eval(e=0x00000000062d0ad8, rho=0x000000000595b3e0) + 797 at eval.c:732
frame #25: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x00000000062d0d08, op=0x00000000033254d8, args=0x00000000062d0b10, rho=0x000000000595b3e0) + 344 at eval.c:1807
frame #26: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
frame #27: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
frame #28: 0x00007f2b26aaadf8 libR.so`do_begin(call=0x00000000062d04a0, op=0x00000000033254d8, args=0x00000000062d0e90, rho=0x000000000595b3e0) + 344 at eval.c:1807
frame #29: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000595b3e0) + 1345 at eval.c:685
frame #30: 0x00007f2b26aa9d8d libR.so`Rf_applyClosure(call=<unavailable>, op=<unavailable>, arglist=<unavailable>, rho=<unavailable>, suppliedvars=0x000000000330a1c8) + 1309 at eval.c:1135
frame #31: 0x00007f2b26aa82ad libR.so`Rf_eval(e=0x000000000595a9f0, rho=0x000000000334ff58) + 797 at eval.c:732
frame #32: 0x00007f2b26aabf66 libR.so`do_set(call=0x000000000595b680, op=0x000000000330c2d8, args=<unavailable>, rho=0x000000000334ff58) + 166 at eval.c:2197
frame #33: 0x00007f2b26aa84d1 libR.so`Rf_eval(e=<unavailable>, rho=0x000000000334ff58) + 1345 at eval.c:685
frame #34: 0x00007f2b26acf932 libR.so`Rf_ReplIteration(rho=0x000000000334ff58, savestack=<unavailable>, browselevel=<unavailable>, state=0x00007ffcdab7f020) + 546 at main.c:258
frame #35: 0x00007f2b26acfca1 libR.so`R_ReplConsole(rho=0x000000000334ff58, savestack=0, browselevel=0) + 129 at main.c:308
frame #36: 0x00007f2b26acfd58 libR.so`run_Rmainloop + 72 at main.c:1059
frame #37: 0x0000000000d85dc2 rsession`rstudio::r::session::runEmbeddedR(rstudio::core::FilePath const&, rstudio::core::FilePath const&, bool, bool, SA_TYPE, rstudio::r::session::Callbacks const&, rstudio::r::session::InternalCallbacks*) + 434
frame #38: 0x0000000000d6791d rsession`rstudio::r::session::run(rstudio::r::session::ROptions const&, rstudio::r::session::RCallbacks const&) + 9581
frame #39: 0x00000000006a7299 rsession`main + 10809
frame #40: 0x00007f2b2530e830 libc.so.6`__libc_start_main(main=(rsession`main), argc=11, argv=0x00007ffcdab81fd8, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007ffcdab81fc8) + 240 at libc-start.c:291
frame #41: 0x00000000006b4fe1 rsession`_start + 41
This definitely looks like a memory problem on the ProteoWizard side that somehow just gets ignored in the base R process.
When running this with a version of R compiled with sanitizers, I see:
> source("msdata.R", echo = TRUE)
> library(mzR)
Loading required package: Rcpp
> library(msdata)
> mzxml <- system.file("threonine/threonine_i2_e35_pH_tree.mzXML",
+ package = "msdata")
> aa <- openMSfile(mzxml) # this works
ramp.cpp:1197:34: runtime error: index -1 out of bounds for type 'char [513]'
SUMMARY: AddressSanitizer: undefined-behavior ramp.cpp:1197:34 in
> mzml <- system.file("microtofq/MM8.mzML", package = "msdata")
> bb <- openMSfile(mzml) # this crashes R, but only in RStudio
In particular, this bit:
> aa <- openMSfile(mzxml) # this works
ramp.cpp:1197:34: runtime error: index -1 out of bounds for type 'char [513]'
SUMMARY: AddressSanitizer: undefined-behavior ramp.cpp:1197:34 in
implies there may be something going wrong with the openMSfile() function -- it appears to be attempting to read data at an invalid offset. I would file this issue with the mzR maintainers.
> sessionInfo()
R Under development (unstable) (2017-01-17 r72002)
Platform: x86_64-apple-darwin16.3.0 (64-bit)
Running under: macOS Sierra 10.12.2
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] msdata_0.15.0 mzR_2.9.3 Rcpp_0.12.9
[4] testthat_1.0.2 rmarkdown_1.3 knitr_1.15.1
[7] roxygen2_5.0.1 devtools_1.12.0.9000
loaded via a namespace (and not attached):
[1] magrittr_1.5 BiocGenerics_0.21.3 pkgload_0.0.0.9000
[4] R6_2.2.0 stringr_1.1.0 tools_3.4.0
[7] pkgbuild_0.0.0.9000 parallel_3.4.0 Biobase_2.35.0
[10] withr_1.0.2 htmltools_0.3.5 ProtGenerics_1.7.0
[13] rprojroot_1.2 digest_0.6.11 crayon_1.3.2
[16] codetools_0.2-15 memoise_1.0.0 evaluate_0.10
[19] stringi_1.1.2 compiler_3.4.0 backports_1.0.5
EDIT: Here's the configure call used to build R; the same compiler and flags are used for all built packages as well. (Extracted from R.home("etc/Makeconf").)
# R was configured using the following call
# (not including env. vars and site configuration)
# configure '--with-blas=-L/usr/local/opt/openblas/lib -lopenblas' '--with-lapack=-L/usr/local/opt/lapack/lib -llapack' '--with-cairo' '--disable-R-framework' '--enable-R-shlib' '--with-readline' '--enable-R-profiling' '--enable-memory-profiling' '--with-valgrind-instrumentation=2' '--without-internal-tzcode' '--prefix=/Users/kevin/r/r-devel-sanitizers' 'PKG_CONFIG_PATH=/opt/X11/lib/pkgconfig'
CC = clang-3.9 -std=gnu99 -fsanitize=address,undefined -fno-omit-frame-pointer -fno-sanitize=float-divide-by-zero
CXX = clang++-3.9 -fsanitize=address,undefined -fno-omit-frame-pointer -fno-sanitize=float-divide-by-zero
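For anyone who wants to compare their own build against the flags above, here is a minimal sketch of pulling the same lines out of Makeconf from within R. It assumes a Unix-alike installation where Makeconf sits directly under R.home("etc"); the variable names are just illustrative.

```r
# Assumption: Unix-alike layout, with Makeconf directly in R.home("etc").
makeconf <- readLines(file.path(R.home("etc"), "Makeconf"))

# Keep the recorded configure call plus the C and C++ compiler definitions.
build_lines <- grep("^# *configure|^(CC|CXX) *=", makeconf, value = TRUE)
print(build_lines)
```

If the CC and CXX lines carry no -fsanitize flags, the sanitizer report shown earlier would not be expected from that build.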
EDIT V2: Given the lldb stack trace, it looks like the culprit may indeed be a clash between Boost versions (the bundled version used by RStudio vs. the version used by mzR). Frame #0 of the trace is boost::filesystem::path::filename() inside rsession, called from code in mzR.so. In other words, mzR is now inadvertently calling the Boost routines in the rsession executable, when it likely intends to call its own bundled version.
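If the clash with the Boost code in rsession is indeed the cause, one possible stopgap (a sketch, not a fix) is to do the file reading in a plain Rscript child process, where no rsession symbols are present, and hand the results back as an RDS file. The helper name run_in_plain_r and its result_file / args convention are made up for this illustration; only plain data can come back this way, since the mzR file handle itself is tied to the process that opened it.

```r
# Hypothetical helper (not part of mzR or RStudio): run R code in a plain
# Rscript child process and return whatever that code saves to `result_file`.
run_in_plain_r <- function(code, args = character()) {
  script <- tempfile(fileext = ".R")
  result <- tempfile(fileext = ".rds")
  on.exit(unlink(c(script, result)), add = TRUE)

  # Inside the child, `args` holds the user arguments and `result_file`
  # is the path the code is expected to saveRDS() its result to.
  writeLines(c(
    "all_args <- commandArgs(trailingOnly = TRUE)",
    "result_file <- all_args[1]",
    "args <- all_args[-1]",
    code
  ), script)

  rscript <- file.path(R.home("bin"), "Rscript")
  status <- system2(rscript, shQuote(c(script, result, args)))
  if (status != 0 || !file.exists(result)) {
    stop("child R process failed (exit status ", status, ")")
  }
  readRDS(result)
}

# Usage with the file from the question (left commented out, needs mzR + msdata):
# mzml <- system.file("microtofq/MM8.mzML", package = "msdata")
# ms <- run_in_plain_r(c(
#   "library(mzR)",
#   "f <- openMSfile(args[1])",
#   "saveRDS(list(header = header(f), peaks = peaks(f)), result_file)"
# ), args = mzml)
```

A crash in the child then surfaces as an ordinary R error in the RStudio session instead of taking the session down with it.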