`tiny_malloc_from_free_list` made my pointer `NULL`?

11

I am working on code that uses the bllipparser Python module, among other things. Fed the same dataset, it crashes intermittently (maybe once in three to ten runs). Going through lldb, I found that the public field weights of RerankerModel, which is apparently set only once (in the constructor), randomly becomes NULL. I only have one RerankerModel for the duration of my run, so there should be exactly one weights, and it should persist unchanged throughout. So I set up an ambush (I mean, a watchpoint: I stopped the code in the constructor and ran watchpoint set expression -w write -- &weights), and apparently the culprit that nulls the pointer is tiny_malloc_from_free_list from libsystem_malloc.dylib. Here's the relevant top of the backtrace:

* thread #1, queue = 'com.apple.main-thread', stop reason = watchpoint 4
  * frame #0: 0x00007fff61caf22a libsystem_malloc.dylib`tiny_malloc_from_free_list + 151
    frame #1: 0x00007fff61cae3bf libsystem_malloc.dylib`szone_malloc_should_clear + 422
    frame #2: 0x00007fff61cae1bd libsystem_malloc.dylib`malloc_zone_malloc + 103
    frame #3: 0x00007fff61cad4c7 libsystem_malloc.dylib`malloc + 24
    frame #4: 0x00007fff5faac628 libc++abi.dylib`operator new(unsigned long) + 40
    frame #5: 0x00000001133c904c _CharniakParser.cpython-36m-darwin.so`std::__1::__split_buffer<short, std::__1::allocator<short>&>::__split_buffer(unsigned long, unsigned long, std::__1::allocator<short>&) [inlined] std::__1::__allocate(__size=4) at new:226
    frame #6: 0x00000001133c9040 _CharniakParser.cpython-36m-darwin.so`std::__1::__split_buffer<short, std::__1::allocator<short>&>::__split_buffer(unsigned long, unsigned long, std::__1::allocator<short>&) [inlined] std::__1::allocator<short>::allocate(this=0x0000000135316448, __n=2, (null)=0x0000000000000000) at memory:1747
    frame #7: 0x00000001133c8f44 _CharniakParser.cpython-36m-darwin.so`std::__1::__split_buffer<short, std::__1::allocator<short>&>::__split_buffer(unsigned long, unsigned long, std::__1::allocator<short>&) [inlined] std::__1::allocator_traits<std::__1::allocator<short> >::allocate(__a=0x0000000135316448, __n=2) at memory:1502
    frame #8: 0x00000001133c8f16 _CharniakParser.cpython-36m-darwin.so`std::__1::__split_buffer<short, std::__1::allocator<short>&>::__split_buffer(this=0x00007ffeefbf3b48, __cap=2, __start=1, __a=0x0000000135316448) at __split_buffer:311
    frame #9: 0x00000001133c878d _CharniakParser.cpython-36m-darwin.so`std::__1::__split_buffer<short, std::__1::allocator<short>&>::__split_buffer(this=0x00007ffeefbf3b48, __cap=2, __start=1, __a=0x0000000135316448) at __split_buffer:310
    frame #10: 0x00000001133c869b _CharniakParser.cpython-36m-darwin.so`void std::__1::vector<short, std::__1::allocator<short> >::__push_back_slow_path<short const>(this=0x0000000135316438 size=1, __x=0x00007ffeefbf3caa) at vector:1567
    frame #11: 0x00000001133c4446 _CharniakParser.cpython-36m-darwin.so`Val::extendTrees(Bst&, int) [inlined] std::__1::vector<short, std::__1::allocator<short> >::push_back(this=0x0000000135316438 size=1, __x=0x00007ffeefbf3caa) at vector:1588

I am very much not an expert on C++, but... How is the allocator NULLing the pointer? How does the allocator even know where the pointer is? Why is the allocator NULLing the pointer? I can see how I might run out of memory, since what I was doing wasn't exactly memory-light, but I'd much sooner expect the allocator to fail to allocate than to randomly deallocate something, and I had no idea it could NULL a pointer. Can anyone explain to me exactly how this works, why it happened, how it happened, why it is always the same pointer in code that has lots and lots of other juicy pointers, and what I can do to make it not happen?

Addendum if needed: Here's where the actual NULLing takes place, the code of tiny_malloc_from_free_list, if anyone can make sense of it...

libsystem_malloc.dylib`tiny_malloc_from_free_list:
    0x7fff61caf193 <+0>:    pushq  %rbp
    0x7fff61caf194 <+1>:    movq   %rsp, %rbp
    0x7fff61caf197 <+4>:    pushq  %r15
    0x7fff61caf199 <+6>:    pushq  %r14
    0x7fff61caf19b <+8>:    pushq  %r13
    0x7fff61caf19d <+10>:   pushq  %r12
    0x7fff61caf19f <+12>:   pushq  %rbx
    0x7fff61caf1a0 <+13>:   pushq  %rax
    0x7fff61caf1a1 <+14>:   movl   %edx, %r15d
    0x7fff61caf1a4 <+17>:   movq   %rsi, %r14
    0x7fff61caf1a7 <+20>:   movq   %rdi, %r12
    0x7fff61caf1aa <+23>:   leal   -0x1(%r15), %ecx
    0x7fff61caf1ae <+27>:   movq   0x18(%r14,%rcx,8), %r13
    0x7fff61caf1b3 <+32>:   testq  %r13, %r13
    0x7fff61caf1b6 <+35>:   je     0x7fff61caf22f            ; <+156>
    0x7fff61caf1b8 <+37>:   movq   0x8(%r13), %rdx
    0x7fff61caf1bc <+41>:   movq   %rdx, %rax
    0x7fff61caf1bf <+44>:   shlq   $0x4, %rax
    0x7fff61caf1c3 <+48>:   shrq   $0x3c, %rdx
    0x7fff61caf1c7 <+52>:   movq   0x278(%r12), %rsi
    0x7fff61caf1cf <+60>:   xorq   %rax, %rsi
    0x7fff61caf1d2 <+63>:   movq   %rsi, %rdi
    0x7fff61caf1d5 <+66>:   shrq   $0x8, %rdi
    0x7fff61caf1d9 <+70>:   addl   %esi, %edi
    0x7fff61caf1db <+72>:   movq   %rsi, %rbx
    0x7fff61caf1de <+75>:   shrq   $0x10, %rbx
    0x7fff61caf1e2 <+79>:   addl   %edi, %ebx
    0x7fff61caf1e4 <+81>:   movq   %rsi, %rdi
    0x7fff61caf1e7 <+84>:   shrq   $0x18, %rdi
    0x7fff61caf1eb <+88>:   addl   %ebx, %edi
    0x7fff61caf1ed <+90>:   movq   %rsi, %rbx
    0x7fff61caf1f0 <+93>:   shrq   $0x20, %rbx
    0x7fff61caf1f4 <+97>:   addl   %edi, %ebx
    0x7fff61caf1f6 <+99>:   movq   %rsi, %rdi
    0x7fff61caf1f9 <+102>:  shrq   $0x28, %rdi
    0x7fff61caf1fd <+106>:  addl   %ebx, %edi
    0x7fff61caf1ff <+108>:  movq   %rsi, %rbx
    0x7fff61caf202 <+111>:  shrq   $0x30, %rbx
    0x7fff61caf206 <+115>:  addl   %edi, %ebx
    0x7fff61caf208 <+117>:  shrq   $0x38, %rsi
    0x7fff61caf20c <+121>:  addl   %ebx, %esi
    0x7fff61caf20e <+123>:  andl   $0xf, %esi
    0x7fff61caf211 <+126>:  cmpq   %rsi, %rdx
    0x7fff61caf214 <+129>:  jne    0x7fff61caf602            ; <+1135>
    0x7fff61caf21a <+135>:  testq  %rax, %rax
    0x7fff61caf21d <+138>:  je     0x7fff61caf2de            ; <+331>
    0x7fff61caf223 <+144>:  movq   (%r13), %rdx
    0x7fff61caf227 <+148>:  movq   %rdx, (%rax)
->  0x7fff61caf22a <+151>:  jmp    0x7fff61caf2f2            ; <+351>
    0x7fff61caf22f <+156>:  movq   $-0x1, %rax
    0x7fff61caf236 <+163>:  shlq   %cl, %rax
    0x7fff61caf239 <+166>:  andq   0x818(%r14), %rax
    0x7fff61caf240 <+173>:  je     0x7fff61caf4f9            ; <+870>
    0x7fff61caf246 <+179>:  bsfq   %rax, %rcx
    0x7fff61caf24a <+183>:  cmpq   $0x3f, %rcx
    0x7fff61caf24e <+187>:  je     0x7fff61caf39d            ; <+522>
    0x7fff61caf254 <+193>:  movq   0x18(%r14,%rcx,8), %r13
    0x7fff61caf259 <+198>:  testq  %r13, %r13
    0x7fff61caf25c <+201>:  je     0x7fff61caf39d            ; <+522>
    0x7fff61caf262 <+207>:  movq   0x8(%r13), %rdx
    0x7fff61caf266 <+211>:  movq   %rdx, %rax
    0x7fff61caf269 <+214>:  shlq   $0x4, %rax
    0x7fff61caf26d <+218>:  shrq   $0x3c, %rdx
    0x7fff61caf271 <+222>:  movq   0x278(%r12), %rsi
    0x7fff61caf279 <+230>:  xorq   %rax, %rsi
    0x7fff61caf27c <+233>:  movq   %rsi, %rdi
    0x7fff61caf27f <+236>:  shrq   $0x8, %rdi
    0x7fff61caf283 <+240>:  addl   %esi, %edi
    0x7fff61caf285 <+242>:  movq   %rsi, %rbx
    0x7fff61caf288 <+245>:  shrq   $0x10, %rbx
    0x7fff61caf28c <+249>:  addl   %edi, %ebx
    0x7fff61caf28e <+251>:  movq   %rsi, %rdi
    0x7fff61caf291 <+254>:  shrq   $0x18, %rdi
    0x7fff61caf295 <+258>:  addl   %ebx, %edi
    0x7fff61caf297 <+260>:  movq   %rsi, %rbx
    0x7fff61caf29a <+263>:  shrq   $0x20, %rbx
    0x7fff61caf29e <+267>:  addl   %edi, %ebx
    0x7fff61caf2a0 <+269>:  movq   %rsi, %rdi
    0x7fff61caf2a3 <+272>:  shrq   $0x28, %rdi
    0x7fff61caf2a7 <+276>:  addl   %ebx, %edi
    0x7fff61caf2a9 <+278>:  movq   %rsi, %rbx
    0x7fff61caf2ac <+281>:  shrq   $0x30, %rbx
    0x7fff61caf2b0 <+285>:  addl   %edi, %ebx
    0x7fff61caf2b2 <+287>:  shrq   $0x38, %rsi
    0x7fff61caf2b6 <+291>:  addl   %ebx, %esi
    0x7fff61caf2b8 <+293>:  andl   $0xf, %esi
    0x7fff61caf2bb <+296>:  cmpq   %rsi, %rdx
    0x7fff61caf2be <+299>:  jne    0x7fff61caf602            ; <+1135>
    0x7fff61caf2c4 <+305>:  movq   %rax, 0x18(%r14,%rcx,8)
    0x7fff61caf2c9 <+310>:  testq  %rax, %rax
    0x7fff61caf2cc <+313>:  je     0x7fff61caf59f            ; <+1036>
    0x7fff61caf2d2 <+319>:  movq   (%r13), %rcx
    0x7fff61caf2d6 <+323>:  movq   %rcx, (%rax)
    0x7fff61caf2d9 <+326>:  jmp    0x7fff61caf5b2            ; <+1055>
    0x7fff61caf2de <+331>:  movl   $0xfffffffe, %edx         ; imm = 0xFFFFFFFE 
    0x7fff61caf2e3 <+336>:  roll   %cl, %edx
    0x7fff61caf2e5 <+338>:  movl   %ecx, %esi
    0x7fff61caf2e7 <+340>:  shrl   $0x5, %esi
    0x7fff61caf2ea <+343>:  andl   %edx, 0x818(%r14,%rsi,4)
    0x7fff61caf2f2 <+351>:  movq   %rax, 0x18(%r14,%rcx,8)
    0x7fff61caf2f7 <+356>:  incl   0x850(%r14)
    0x7fff61caf2fe <+363>:  movzwl %r15w, %esi
    0x7fff61caf302 <+367>:  movl   %esi, %ecx
    0x7fff61caf304 <+369>:  shll   $0x4, %ecx
    0x7fff61caf307 <+372>:  addq   %rcx, 0x858(%r14)
    0x7fff61caf30e <+379>:  movq   %r13, %rax
    0x7fff61caf311 <+382>:  andq   $-0x100000, %rax          ; imm = 0xFFF00000 
    0x7fff61caf317 <+388>:  addl   0xfc098(%rax), %ecx
    0x7fff61caf31d <+394>:  movl   %ecx, 0xfc098(%rax)
    0x7fff61caf323 <+400>:  cmpl   $0xbd060, %ecx            ; imm = 0xBD060 
    0x7fff61caf329 <+406>:  jb     0x7fff61caf335            ; <+418>
    0x7fff61caf32b <+408>:  movl   $0x0, 0xfc090(%rax)
    0x7fff61caf335 <+418>:  cmpl   $0x2, %esi
    0x7fff61caf338 <+421>:  jb     0x7fff61caf344            ; <+433>
    0x7fff61caf33a <+423>:  movq   %r13, %rdi
    0x7fff61caf33d <+426>:  callq  0x7fff61cc5eaa            ; set_tiny_meta_header_in_use
    0x7fff61caf342 <+431>:  jmp    0x7fff61caf38b            ; <+504>
    0x7fff61caf344 <+433>:  movq   %r13, %rcx
    0x7fff61caf347 <+436>:  shrq   $0x4, %rcx
    0x7fff61caf34b <+440>:  movl   %r13d, %edx
    0x7fff61caf34e <+443>:  shrl   $0x8, %edx
    0x7fff61caf351 <+446>:  andl   $0xffe, %edx              ; imm = 0xFFE 
    0x7fff61caf357 <+452>:  movl   $0x1, %esi
    0x7fff61caf35c <+457>:  movl   $0x1, %edi
    0x7fff61caf361 <+462>:  shll   %cl, %edi
    0x7fff61caf363 <+464>:  orl    %edi, 0xfc0a0(%rax,%rdx,4)
    0x7fff61caf36a <+471>:  orl    $0x1, %edx
    0x7fff61caf36d <+474>:  orl    %edi, 0xfc0a0(%rax,%rdx,4)
    0x7fff61caf374 <+481>:  leal   0x1(%rcx), %ecx
    0x7fff61caf377 <+484>:  movl   %ecx, %edx
    0x7fff61caf379 <+486>:  shrl   $0x4, %edx
    0x7fff61caf37c <+489>:  andl   $0xffe, %edx              ; imm = 0xFFE 
    0x7fff61caf382 <+495>:  shll   %cl, %esi
    0x7fff61caf384 <+497>:  orl    %esi, 0xfc0a0(%rax,%rdx,4)
    0x7fff61caf38b <+504>:  movq   %r13, %rax
    0x7fff61caf38e <+507>:  addq   $0x8, %rsp
    0x7fff61caf392 <+511>:  popq   %rbx
    0x7fff61caf393 <+512>:  popq   %r12
    0x7fff61caf395 <+514>:  popq   %r13
    0x7fff61caf397 <+516>:  popq   %r14
    0x7fff61caf399 <+518>:  popq   %r15
    0x7fff61caf39b <+520>:  popq   %rbp
    0x7fff61caf39c <+521>:  retq   
    0x7fff61caf39d <+522>:  movq   0x210(%r14), %r13
    0x7fff61caf3a4 <+529>:  testq  %r13, %r13
    0x7fff61caf3a7 <+532>:  je     0x7fff61caf4f9            ; <+870>
    0x7fff61caf3ad <+538>:  movq   %r13, %rdi
    0x7fff61caf3b0 <+541>:  callq  0x7fff61cb05ab            ; get_tiny_free_size
    0x7fff61caf3b5 <+546>:  movq   0x8(%r13), %rcx
    0x7fff61caf3b9 <+550>:  movq   %rcx, %r8
    0x7fff61caf3bc <+553>:  shlq   $0x4, %r8
    0x7fff61caf3c0 <+557>:  shrq   $0x3c, %rcx
    0x7fff61caf3c4 <+561>:  movq   0x278(%r12), %rsi
    0x7fff61caf3cc <+569>:  movq   %r8, %rdi
    0x7fff61caf3cf <+572>:  xorq   %rsi, %rdi
    0x7fff61caf3d2 <+575>:  movq   %rdi, %rbx
    0x7fff61caf3d5 <+578>:  shrq   $0x8, %rbx
    0x7fff61caf3d9 <+582>:  addl   %edi, %ebx
    0x7fff61caf3db <+584>:  movq   %rdi, %rdx
    0x7fff61caf3de <+587>:  shrq   $0x10, %rdx
    0x7fff61caf3e2 <+591>:  addl   %ebx, %edx
    0x7fff61caf3e4 <+593>:  movq   %rdi, %rbx
    0x7fff61caf3e7 <+596>:  shrq   $0x18, %rbx
    0x7fff61caf3eb <+600>:  addl   %edx, %ebx
    0x7fff61caf3ed <+602>:  movq   %rdi, %rdx
    0x7fff61caf3f0 <+605>:  shrq   $0x20, %rdx
    0x7fff61caf3f4 <+609>:  addl   %ebx, %edx
    0x7fff61caf3f6 <+611>:  movq   %rdi, %rbx
    0x7fff61caf3f9 <+614>:  shrq   $0x28, %rbx
    0x7fff61caf3fd <+618>:  addl   %edx, %ebx
    0x7fff61caf3ff <+620>:  movq   %rdi, %rdx
    0x7fff61caf402 <+623>:  shrq   $0x30, %rdx
    0x7fff61caf406 <+627>:  addl   %ebx, %edx
    0x7fff61caf408 <+629>:  shrq   $0x38, %rdi
    0x7fff61caf40c <+633>:  addl   %edx, %edi
    0x7fff61caf40e <+635>:  andl   $0xf, %edi
    0x7fff61caf411 <+638>:  cmpq   %rdi, %rcx
    0x7fff61caf414 <+641>:  jne    0x7fff61caf602            ; <+1135>
    0x7fff61caf41a <+647>:  movzwl %ax, %edi
    0x7fff61caf41d <+650>:  subl   %r15d, %edi
    0x7fff61caf420 <+653>:  cmpl   $0x40, %edi
    0x7fff61caf423 <+656>:  jl     0x7fff61caf58a            ; <+1015>
    0x7fff61caf429 <+662>:  movl   %r15d, %eax
    0x7fff61caf42c <+665>:  shll   $0x4, %eax
    0x7fff61caf42f <+668>:  addq   %r13, %rax
    0x7fff61caf432 <+671>:  movq   %rax, 0x210(%r14)
    0x7fff61caf439 <+678>:  movq   %rax, %rcx
    0x7fff61caf43c <+681>:  shrq   $0x4, %rcx
    0x7fff61caf440 <+685>:  testq  %r8, %r8
    0x7fff61caf443 <+688>:  je     0x7fff61caf48e            ; <+763>
    0x7fff61caf445 <+690>:  xorq   %rax, %rsi
    0x7fff61caf448 <+693>:  movq   %rsi, %rdx
    0x7fff61caf44b <+696>:  shrq   $0x8, %rdx
    0x7fff61caf44f <+700>:  addl   %esi, %edx
    0x7fff61caf451 <+702>:  movq   %rsi, %rbx
    0x7fff61caf454 <+705>:  shrq   $0x10, %rbx
    0x7fff61caf458 <+709>:  addl   %edx, %ebx
    0x7fff61caf45a <+711>:  movq   %rsi, %rdx
    0x7fff61caf45d <+714>:  shrq   $0x18, %rdx
    0x7fff61caf461 <+718>:  addl   %ebx, %edx
    0x7fff61caf463 <+720>:  movq   %rsi, %rbx
    0x7fff61caf466 <+723>:  shrq   $0x20, %rbx
    0x7fff61caf46a <+727>:  addl   %edx, %ebx
    0x7fff61caf46c <+729>:  movq   %rsi, %rdx
    0x7fff61caf46f <+732>:  shrq   $0x28, %rdx
    0x7fff61caf473 <+736>:  addl   %ebx, %edx
    0x7fff61caf475 <+738>:  movq   %rsi, %rbx
    0x7fff61caf478 <+741>:  shrq   $0x30, %rbx
    0x7fff61caf47c <+745>:  addl   %edx, %ebx
    0x7fff61caf47e <+747>:  shrq   $0x38, %rsi
    0x7fff61caf482 <+751>:  addl   %ebx, %esi
    0x7fff61caf484 <+753>:  shlq   $0x3c, %rsi
    0x7fff61caf488 <+757>:  orq    %rcx, %rsi
    0x7fff61caf48b <+760>:  movq   %rsi, (%r8)
    0x7fff61caf48e <+763>:  movq   (%r13), %rdx
    0x7fff61caf492 <+767>:  movq   %rdx, (%rax)
    0x7fff61caf495 <+770>:  movq   0x8(%r13), %rdx
    0x7fff61caf499 <+774>:  movq   %rdx, 0x8(%rax)
    0x7fff61caf49d <+778>:  movq   %rax, %rdx
    0x7fff61caf4a0 <+781>:  andq   $-0x100000, %rdx          ; imm = 0xFFF00000 
    0x7fff61caf4a7 <+788>:  movl   %eax, %esi
    0x7fff61caf4a9 <+790>:  shrl   $0x8, %esi
    0x7fff61caf4ac <+793>:  andl   $0xffe, %esi              ; imm = 0xFFE 
    0x7fff61caf4b2 <+799>:  andl   $0x1f, %ecx
    0x7fff61caf4b5 <+802>:  movl   $0x1, %ebx
    0x7fff61caf4ba <+807>:  shll   %cl, %ebx
    0x7fff61caf4bc <+809>:  orl    %ebx, 0xfc0a0(%rdx,%rsi,4)
    0x7fff61caf4c3 <+816>:  movl   $0xfffffffe, %ebx         ; imm = 0xFFFFFFFE 
    0x7fff61caf4c8 <+821>:  roll   %cl, %ebx
    0x7fff61caf4ca <+823>:  orl    $0x1, %esi
    0x7fff61caf4cd <+826>:  andl   %ebx, 0xfc0a0(%rdx,%rsi,4)
    0x7fff61caf4d4 <+833>:  movzwl %di, %ecx
    0x7fff61caf4d7 <+836>:  cmpl   $0x2, %ecx
    0x7fff61caf4da <+839>:  jb     0x7fff61caf5ee            ; <+1115>
    0x7fff61caf4e0 <+845>:  movl   %edi, %ecx
    0x7fff61caf4e2 <+847>:  shll   $0x4, %ecx
    0x7fff61caf4e5 <+850>:  andl   $0xffff0, %ecx            ; imm = 0xFFFF0 
    0x7fff61caf4eb <+856>:  movw   %di, -0x2(%rax,%rcx)
    0x7fff61caf4f0 <+861>:  movw   %di, 0x10(%rax)
    0x7fff61caf4f4 <+865>:  jmp    0x7fff61caf2f7            ; <+356>
    0x7fff61caf4f9 <+870>:  movq   0x838(%r14), %rcx
    0x7fff61caf500 <+877>:  movl   %r15d, %eax
    0x7fff61caf503 <+880>:  shll   $0x4, %eax
    0x7fff61caf506 <+883>:  movq   %rcx, %rdx
    0x7fff61caf509 <+886>:  subq   %rax, %rdx
    0x7fff61caf50c <+889>:  jae    0x7fff61caf516            ; <+899>
    0x7fff61caf50e <+891>:  xorl   %r13d, %r13d
    0x7fff61caf511 <+894>:  jmp    0x7fff61caf38b            ; <+504>
    0x7fff61caf516 <+899>:  movl   $0xfc080, %r13d           ; imm = 0xFC080 
    0x7fff61caf51c <+905>:  subq   %rcx, %r13
    0x7fff61caf51f <+908>:  addq   0x848(%r14), %r13
    0x7fff61caf526 <+915>:  movq   %rdx, 0x838(%r14)
    0x7fff61caf52d <+922>:  testq  %rdx, %rdx
    0x7fff61caf530 <+925>:  je     0x7fff61caf2f7            ; <+356>
    0x7fff61caf536 <+931>:  addq   %r13, %rax
    0x7fff61caf539 <+934>:  movq   %rax, %rdx
    0x7fff61caf53c <+937>:  andq   $-0x100000, %rdx          ; imm = 0xFFF00000 
    0x7fff61caf543 <+944>:  movq   %rax, %rcx
    0x7fff61caf546 <+947>:  shrq   $0x4, %rcx
    0x7fff61caf54a <+951>:  shrl   $0x8, %eax
    0x7fff61caf54d <+954>:  andl   $0xffe, %eax              ; imm = 0xFFE 
    0x7fff61caf552 <+959>:  movl   $0x1, %esi
    0x7fff61caf557 <+964>:  movl   $0x1, %edi
    0x7fff61caf55c <+969>:  shll   %cl, %edi
    0x7fff61caf55e <+971>:  orl    %edi, 0xfc0a0(%rdx,%rax,4)
    0x7fff61caf565 <+978>:  orl    $0x1, %eax
    0x7fff61caf568 <+981>:  orl    %edi, 0xfc0a0(%rdx,%rax,4)
    0x7fff61caf56f <+988>:  leal   0x1(%rcx), %ecx
    0x7fff61caf572 <+991>:  movl   %ecx, %eax
    0x7fff61caf574 <+993>:  shrl   $0x4, %eax
    0x7fff61caf577 <+996>:  andl   $0xffe, %eax              ; imm = 0xFFE 
    0x7fff61caf57c <+1001>: shll   %cl, %esi
    0x7fff61caf57e <+1003>: orl    %esi, 0xfc0a0(%rdx,%rax,4)
    0x7fff61caf585 <+1010>: jmp    0x7fff61caf2f7            ; <+356>
    0x7fff61caf58a <+1015>: testq  %r8, %r8
    0x7fff61caf58d <+1018>: je     0x7fff61caf596            ; <+1027>
    0x7fff61caf58f <+1020>: movq   (%r13), %rcx
    0x7fff61caf593 <+1024>: movq   %rcx, (%r8)
    0x7fff61caf596 <+1027>: movq   %r8, 0x210(%r14)
    0x7fff61caf59d <+1034>: jmp    0x7fff61caf5ba            ; <+1063>
    0x7fff61caf59f <+1036>: movl   $0xfffffffe, %eax         ; imm = 0xFFFFFFFE 
    0x7fff61caf5a4 <+1041>: roll   %cl, %eax
    0x7fff61caf5a6 <+1043>: shrq   $0x5, %rcx
    0x7fff61caf5aa <+1047>: andl   %eax, 0x818(%r14,%rcx,4)
    0x7fff61caf5b2 <+1055>: movq   %r13, %rdi
    0x7fff61caf5b5 <+1058>: callq  0x7fff61cb05ab            ; get_tiny_free_size
    0x7fff61caf5ba <+1063>: leal   -0x1(%rax), %ecx
    0x7fff61caf5bd <+1066>: cmpw   %r15w, %cx
    0x7fff61caf5c1 <+1070>: jae    0x7fff61caf5cc            ; <+1081>
    0x7fff61caf5c3 <+1072>: movw   %ax, %r15w
    0x7fff61caf5c7 <+1076>: jmp    0x7fff61caf2f7            ; <+356>
    0x7fff61caf5cc <+1081>: subl   %r15d, %eax
    0x7fff61caf5cf <+1084>: movl   %r15d, %ecx
    0x7fff61caf5d2 <+1087>: shll   $0x4, %ecx
    0x7fff61caf5d5 <+1090>: movq   %r13, %rdx
    0x7fff61caf5d8 <+1093>: addq   %rcx, %rdx
    0x7fff61caf5db <+1096>: movzwl %ax, %ecx
    0x7fff61caf5de <+1099>: movq   %r12, %rdi
    0x7fff61caf5e1 <+1102>: movq   %r14, %rsi
    0x7fff61caf5e4 <+1105>: callq  0x7fff61cafe1b            ; tiny_free_list_add_ptr
    0x7fff61caf5e9 <+1110>: jmp    0x7fff61caf2f7            ; <+356>
    0x7fff61caf5ee <+1115>: testw  %di, %di
    0x7fff61caf5f1 <+1118>: jne    0x7fff61caf2f7            ; <+356>
    0x7fff61caf5f7 <+1124>: movw   $0x0, 0x10(%rax)
    0x7fff61caf5fd <+1130>: jmp    0x7fff61caf2f7            ; <+356>
    0x7fff61caf602 <+1135>: addq   $0x8, %r13
    0x7fff61caf606 <+1139>: movl   0x26c(%r12), %edi
    0x7fff61caf60e <+1147>: movq   %r13, %rsi
    0x7fff61caf611 <+1150>: callq  0x7fff61cc5505            ; free_list_checksum_botch.352
    0x7fff61caf616 <+1155>: ud2    
    0x7fff61caf618 <+1157>: nop    

The offender is 0x7fff61caf227 <+148>: movq %rdx, (%rax), where rax is the address of my weights pointer that gets nulled, and rdx is 0.
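
Reading the listing above, the store at <+148> looks like the usual unlink step of a doubly linked free list: the block at the head of the list (r13) is being handed out, so the block after it (rax, decoded from the word at offset 8 of the head) gets its back link overwritten with the head's back link, which is 0 for the first block. The toy model below is only a sketch of that step in Python. The offsets are inferred from the disassembly, the addresses and the `pop_free_block` name are made up, and the checksum and pointer encoding are left out.

```python
# Toy model of the unlink step, NOT Apple's implementation.
# Assumption: a free block keeps `previous` at offset 0 and `next` at
# offset 8, as the loads from (%r13) and 0x8(%r13) suggest.
NULL = 0
PREV, NEXT = 0, 8


def pop_free_block(memory, head):
    """Unlink the first free block; return (block, new_head)."""
    if head == NULL:
        return NULL, NULL
    nxt = memory[head + NEXT]
    if nxt != NULL:
        # Counterpart of `movq (%r13), %rdx` / `movq %rdx, (%rax)`:
        # next->previous = head->previous
        memory[nxt + PREV] = memory[head + PREV]
    return head, nxt


memory = {}                      # address -> 64-bit word

weights_field = 0x1000           # hypothetical address of the live `weights` field
memory[weights_field] = 0x7F00DEAD0000   # some non-null pointer value

free_block = 0x2000              # hypothetical free block at the head of the list
memory[free_block + PREV] = NULL           # head of the list has no predecessor
memory[free_block + NEXT] = weights_field  # faulty state: the list reaches live memory

block, new_head = pop_free_block(memory, free_block)
print(hex(block), hex(new_head), memory[weights_field])
```

If that reading is right, the allocator does not know about weights at all. It only follows its own list, and the list led to the address where weights lives. How that address ended up on the free list (for example, the object being freed while still in use, or a link being overwritten) is not something the backtrace alone shows.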

python
macos
memory-management
lldb
bllip-parser
asked on Stack Overflow Aug 2, 2018 by Amadan

1 Answer

0

This may be what is happening:

  1. The dataset contains a \0 escape sequence (an embedded NUL character). The length of the string is calculated in Python:

    >>> len("a\0aa")
    4
    
  2. The string is then passed to C++ (CharniakParser), which loops over it to parse it:

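    // Illustrative fragment: it shows the suspected bug, not a fix.
    std::string a = "a\0aa";             // built from a C literal, so it stops at the first \0
    const char* b = a.c_str();
    std::cout << a.size() << std::endl;  // size == 1, \0 is the end of a string
    for (size_t i = 0; i < 4; i++)       // 4 is the string size calculated with Python
    {
        const char* c = &b[i];           // i == 1 is the terminator, i >= 2 is past the end
        do_something_with(c);            // c is no longer valid for i >= 2
    }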
    
  3. Because you are close to running out of memory, your corrupted pointer c points to a non-null address (this assumes it is not always the same pointer that gets nulled). The object it points to is then removed by some operation, so another part of your code is left holding a pointer to an object that no longer exists. That is your bug.
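
The length mismatch described in steps 1 and 2 can be reproduced from Python alone. The sketch below uses ctypes to look at the same text the way a NUL-terminated C string would be seen. It only illustrates the hypothesis above and does not show how bllipparser actually passes strings to CharniakParser.

```python
import ctypes

text = "a\0aa"
python_length = len(text)        # Python counts the embedded NUL like any other character

raw = text.encode("utf-8")
buffer = ctypes.create_string_buffer(raw)  # copies every byte and appends a terminating NUL
c_length = len(buffer.value)     # .value stops at the first NUL, as strlen() would

print(python_length, c_length)

# A cheap guard before handing text to C code that expects C strings.
has_embedded_nul = "\0" in text
```

Checking for "\0" in the input before parsing is one way to test whether this hypothesis applies to the dataset at all.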

I realize that my explanation is far-fetched, but we don't have much data to go on.

answered on Stack Overflow Sep 13, 2018 by Charlie Lutaud

User contributions licensed under CC BY-SA 3.0