I have written a chess engine with a friend which plays in the Top Chess Engine Championship (TCEC). We just placed first in the Qualification League, even though our engine crashed in one game, which was counted as a loss. I know the basics of programming in C++, but I am stuck analysing the resulting core dump.
I cannot debug on Linux because I do not have a Linux machine. The engine was running on a 176-core Linux machine hosted by TCEC.
I would like to get the memory representation of the Board* board object which was passed to the getWDL(Board* board) function.
We have received the following information from the TCEC admins:
Core was generated by `./Koivisto_4.44-x64-linux-native'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000000000040c039 in probe_table((anonymous namespace)::Pos const*, int, int*, int) ()
[Current thread is 1 (LWP 3364456)]
(gdb) bt
#0 0x000000000040c039 in probe_table((anonymous namespace)::Pos const*, int, int*, int) ()
#1 0x000000000040cb4f in probe_ab((anonymous namespace)::Pos const*, int, int, int*) ()
#2 0x000000000040cc56 in probe_wdl((anonymous namespace)::Pos*, int*) [clone .lto_priv.167] ()
#3 0x0000000000411029 in getWDL(Board*) [clone .part.11] ()
#4 0x0000000000419a71 in pvSearch(Board*, short, short, unsigned char, unsigned char, ThreadData*, unsigned int, unsigned char*) ()
#5 0x000000000041a0f2 in pvSearch(Board*, short, short, unsigned char, unsigned char, ThreadData*, unsigned int, unsigned char*) ()
#6 0x000000000041a52d in pvSearch(Board*, short, short, unsigned char, unsigned char, ThreadData*, unsigned int, unsigned char*) ()
#7 0x000000000041a52d in pvSearch(Board*, short, short, unsigned char, unsigned char, ThreadData*, unsigned int, unsigned char*) ()
#8 0x000000000041a0f2 in pvSearch(Board*, short, short, unsigned char, unsigned char, ThreadData*, unsigned int, unsigned char*) ()
(gdb) info registers
rax 0xfffffffffffffffa -6
rbx 0x7f66a8ff4a10 140078898694672
rcx 0xb87 2951
rdx 0xe3b 3643
rsi 0xffffffff 4294967295
rdi 0x6 6
rbp 0x7f6730006480 0x7f6730006480
rsp 0x7f66a8ff4790 0x7f66a8ff4790
r8 0x0 0
r9 0x0 0
r10 0x7fa7a81ee580 140358056863104
r11 0x7f6730006530 140081163691312
r12 0x7 7
r13 0x17428d50 390237520
r14 0xfc00000000000000 -288230376151711744
r15 0x7f67a80d504a 140083177803850
rip 0x40c039 0x40c039 <probe_table((anonymous namespace)::Pos const*, int, int*, int)+921>
eflags 0x10297 [ CF PF AF SF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
│0x40c022 <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+898> lea 0x1(%rdi),%r12d
│0x40c026 <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+902> mov %rdi,0x28(%rsp)
│0x40c02b <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+907> add 0x10(%rbp),%r10
│0x40c02f <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+911> neg %rax
│0x40c032 <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+914> movslq %r12d,%r12
│0x40c035 <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+917> mov 0x38(%rbp),%r14
>│0x40c039 <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+921> movbe (%r10),%rsi
│0x40c03e <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+926> lea 0x0(%rbp,%rax,8),%rbx
│0x40c043 <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+931> add $0x8,%r10
│0x40c047 <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+935> nopw 0x0(%rax,%rax,1)
│0x40c050 <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+944> cmp %rsi,%r14
│0x40c053 <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+947> jbe 0x40c0ce <_Z11probe_tablePKN12_GLOBAL__N_13PosEiPii+1070>
It is important to note that probe_table, probe_ab and probe_wdl were not implemented by us; they belong to a library used by basically every chess program to read tablebase files with up to 7 pieces on the board.
No other program has been observed to crash inside this library. That is why I conclude that the input I passed to probe_wdl was wrong.
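To test that hypothesis once the bitboards have been recovered from the dump, a consistency check along the following lines could be run on the values. This is only a minimal sketch: ProbeInput, popcount64 and isConsistent are hypothetical names that exist neither in our engine nor in the tablebase library, and it assumes the usual 64-bit bitboard layout in which ranks 1 and 8 occupy the lowest and highest byte. The castling check reflects my understanding that the tablebases do not cover positions with castling rights.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical container for the values that getWDL hands to the library.
struct ProbeInput {
    std::uint64_t white, black;                     // occupancy per side
    std::uint64_t kings, queens, rooks, bishops, knights, pawns;
    unsigned castling;                              // combined castling rights
    unsigned maxPieces;                             // runtime value of TB_LARGEST
};

// Portable popcount; does not rely on compiler builtins.
inline int popcount64(std::uint64_t bb) {
    return static_cast<int>(std::bitset<64>(bb).count());
}

// Returns true if the bitboards describe a position that is at least
// structurally plausible for a tablebase probe.
inline bool isConsistent(const ProbeInput& in) {
    // The two sides must not share a square.
    if (in.white & in.black) return false;

    // Piece-type boards must be pairwise disjoint and cover exactly
    // the occupied squares.
    const std::uint64_t sets[6] = {in.kings, in.queens, in.rooks,
                                   in.bishops, in.knights, in.pawns};
    std::uint64_t seen = 0;
    for (std::uint64_t s : sets) {
        if (seen & s) return false;
        seen |= s;
    }
    if (seen != (in.white | in.black)) return false;

    // Exactly one king per side.
    if (popcount64(in.kings & in.white) != 1) return false;
    if (popcount64(in.kings & in.black) != 1) return false;

    // Assumption: ranks 1 and 8 are the lowest and highest byte.
    if (in.pawns & 0xFF000000000000FFULL) return false;

    // Not more pieces than the loaded tablebases support.
    if (popcount64(in.white | in.black) > static_cast<int>(in.maxPieces))
        return false;

    // Assumption: positions with castling rights are not in the tablebases.
    if (in.castling != 0) return false;

    return true;
}
```

If such a check fails on the recovered values, the board was most likely already corrupted before the probe; if it passes, the fault is more likely in how the arguments are passed or in the tablebase mapping itself.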
The getWDL(Board* board) function looks like this:
Score getWDL(Board* board) {
    UCI_ASSERT(board);
    // we cannot probe the tables if there are too many pieces on the board
    if (bitCount(*board->getOccupiedBB()) > (signed) TB_LARGEST)
        return MAX_MATE_SCORE;
    // probe the tablebase files using the information from the board
    unsigned res = tb_probe_wdl(
        board->getTeamOccupiedBB()[WHITE],
        board->getTeamOccupiedBB()[BLACK],
        board->getPieceBB()[WHITE_KING]   | board->getPieceBB()[BLACK_KING],
        board->getPieceBB()[WHITE_QUEEN]  | board->getPieceBB()[BLACK_QUEEN],
        board->getPieceBB()[WHITE_ROOK]   | board->getPieceBB()[BLACK_ROOK],
        board->getPieceBB()[WHITE_BISHOP] | board->getPieceBB()[BLACK_BISHOP],
        board->getPieceBB()[WHITE_KNIGHT] | board->getPieceBB()[BLACK_KNIGHT],
        board->getPieceBB()[WHITE_PAWN]   | board->getPieceBB()[BLACK_PAWN],
        board->getCurrent50MoveRuleCount(),
        board->getCastlingRights(0) |
        board->getCastlingRights(1) |
        board->getCastlingRights(2) |
        board->getCastlingRights(3),
        board->getEnPassantSquare() != 64 ? board->getEnPassantSquare() : 0,
        board->getActivePlayer() == WHITE);
Besides the information above, we have received an incomplete core dump with a size of around 2 GB. Note that the total memory usage of the program was roughly 100 GB, most of which was used by a hash table for the search tree and is not relevant for debugging.
Since I have never worked with anything like this, I would be very happy if someone could explain to me how to read and parse the core dump in order to extract the information stored in Board* board, so that I can check whether and how the Board object has been altered.
Greetings Finn
I do have gdb on my machine, together with the incomplete core dump and the original Linux executable, which was compiled with a command similar to this:
g++ -O3 -std=c++17 -Wall -Wextra -Wshadow -DNDEBUG -flto -march=native *.cpp syzygy/tbprobe.c -DMINOR_VERSION=50 -DMAJOR_VERSION=4 -pthread -Wl,--whole-archive -lpthread -Wl,--no-whole-archive -DUSE_POPCNT -msse3 -mpopcnt -o ../bin/Koivisto_4.44-x64-linux-native.exe