Disassemble instruction set for 8051 microcontroller

1

I have the following hex opcode sequence for an 8051 microcontroller:

785679107A247BFD7C347D407E51745568F869F96AFA6BFB6CFC6DFD6EFE

I found this repo that converts hex to instruction sequences https://github.com/anarcheuz/8051-disassembler.

Using that I was able to get the following assembly instructions

0x00000000:     37 38        MOV 38 (R0,#immed)
0x00000002:     35           ANL A,@R0
0x00000004:     37 39        MOV 39 (R1,#immed)
0x00000006:     31 30 37     JBC 3037 (bit,offset)
0x00000008:     37 41        MOV 41 (R2,#immed)
0x0000000a:     32 34        ADD 34 (A,#immed)
0x0000000c:     37 42        MOV 42 (R3,#immed)
0x0000000e:     46           MOV R5,A
0x00000010:     37 43        MOV 43 (R4,#immed)
0x00000012:     33 34        ADDC 34 (A,#immed)
0x00000014:     37 44        MOV 44 (R5,#immed)
0x00000016:     34 30        JC 30 (offset)
0x00000018:     37 45        MOV 45 (R6,#immed)
0x0000001a:     35 31        ACALL 31 (addr11)
0x0000001c:     37 34        MOV 34 (A,#immed)
0x0000001e:     35 35        ANL 35 (A,direct)
0x00000020:     36           XRL A,R0
0x00000022:     46           MOV R0,A
0x00000024:     36           XRL A,R1
0x00000026:     46           MOV R1,A
0x00000028:     36           XRL A,R2
0x0000002a:     46           MOV R2,A
0x0000002c:     36           XRL A,R3
0x0000002e:     46           MOV R3,A
0x00000030:     36           XRL A,R4
0x00000032:     46           MOV R4,A
0x00000034:     36           XRL A,R5
0x00000036:     46           MOV R5,A
0x00000038:     36           XRL A,R6
0x0000003a:     46           MOV R6,A

Wikipedia explains what the operations mean (https://en.wikipedia.org/wiki/Intel_MCS-51), but since I haven't worked with assembly or microcontrollers before, it was difficult to follow.

Does someone know what the workflow is and what the values in the different registers are at the end?

assembly
reverse-engineering
microcontroller
disassembly
8051
asked on Stack Overflow Jun 5, 2020 by wasp256 • edited Jun 5, 2020 by Peter Cordes

3 Answers

2

It is easier and faster to just do it by hand: look each opcode up in an 8051 instruction set reference.

78 56  mov r0,#0x56
79 10  mov r1,#0x10
7A 24  mov r2,#0x24
7B FD  mov r3,#0xFD
7C 34  mov r4,#0x34
7D 40  mov r5,#0x40
7E 51  mov r6,#0x51
74 55  mov A,#0x55
68     XRL A,R0

You can spend another five minutes on it and finish the rest.
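The lookup by hand can also be written down as a small table-driven decoder. The following is a minimal sketch, not a general 8051 disassembler: it only knows the four opcode families that occur in this particular sequence (0x78-0x7F `MOV Rn,#imm`, 0x74 `MOV A,#imm`, 0x68-0x6F `XRL A,Rn`, 0xF8-0xFF `MOV Rn,A`), and the names `HEX` and `disassemble` are made up for illustration.

```python
HEX = "785679107A247BFD7C347D407E51745568F869F96AFA6BFB6CFC6DFD6EFE"


def disassemble(code: bytes) -> list[str]:
    """Decode only the opcode families used in the question's sequence."""
    lines = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        n = op & 0x07  # the low three bits select R0..R7
        if 0x78 <= op <= 0x7F:    # MOV Rn,#imm (two bytes)
            lines.append(f"{pc:04X}: mov r{n},#0x{code[pc + 1]:02X}")
            pc += 2
        elif op == 0x74:          # MOV A,#imm (two bytes)
            lines.append(f"{pc:04X}: mov a,#0x{code[pc + 1]:02X}")
            pc += 2
        elif 0x68 <= op <= 0x6F:  # XRL A,Rn (one byte)
            lines.append(f"{pc:04X}: xrl a,r{n}")
            pc += 1
        elif 0xF8 <= op <= 0xFF:  # MOV Rn,A (one byte)
            lines.append(f"{pc:04X}: mov r{n},a")
            pc += 1
        else:
            raise ValueError(f"opcode 0x{op:02X} at 0x{pc:04X} is not handled by this sketch")
    return lines


for line in disassemble(bytes.fromhex(HEX)):
    print(line)
```

Any opcode outside those four families raises an error instead of being guessed at, so the sketch fails loudly on other input.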

answered on Stack Overflow Jun 5, 2020 by old_timer
1

That looks like disassembly of the string of ASCII characters, not the binary values they represent! Notice that the middle column (the machine code) is all 0x30..46, i.e. ASCII codes for '0' to 'F'.

e.g. the first 2 bytes you disassembled are 37 38, which are the ASCII codes for '7' and '8', but what you want is a single 78 byte.

You need to hex un-dump into binary before feeding it to that disassembler.
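A minimal Python sketch of that un-dump step, assuming the disassembler reads a raw binary file. The function names and the output path are hypothetical; `bytes.fromhex` is the standard-library call doing the actual conversion.

```python
from pathlib import Path

HEX_DUMP = "785679107A247BFD7C347D407E51745568F869F96AFA6BFB6CFC6DFD6EFE"


def hex_to_bytes(hex_text: str) -> bytes:
    """Turn a hex dump into the raw bytes it represents (two hex digits per byte)."""
    return bytes.fromhex(hex_text)


def write_binary(hex_text: str, path: str) -> int:
    """Write the decoded bytes to `path` and return how many bytes were written."""
    return Path(path).write_bytes(hex_to_bytes(hex_text))


# What the disassembler was fed: the ASCII codes of the characters '7' and '8'.
print("78".encode("ascii").hex())  # 3738
# What it should have been fed: the single byte 0x78.
print(hex_to_bytes("78").hex())    # 78
```

Calling `write_binary(HEX_DUMP, "code.bin")` would produce a 30-byte file that can then be passed to the disassembler.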

answered on Stack Overflow Jun 5, 2020 by Peter Cordes • edited Jun 5, 2020 by Peter Cordes
1

Enter radare2 (or its fork rizin, though you will have to adjust the binary names).

$ rax2 -s 785679107A247BFD7C347D407E51745568F869F96AFA6BFB6CFC6DFD6EFE
xVyz${�|4}@~QtUh�i�j�k�l�m�n�

$ rasm2 -a 8051 -d 785679107A247BFD7C347D407E51745568F869F96AFA6BFB6CFC6DFD6EFE
mov r0, #0x56
mov r1, #0x10
mov r2, #0x24
mov r3, #0xfd
mov r4, #0x34
mov r5, #0x40
mov r6, #0x51
mov a, #0x55
xrl a, r0
mov r0, a
xrl a, r1
mov r1, a
xrl a, r2
mov r2, a
xrl a, r3
mov r3, a
xrl a, r4
mov r4, a
xrl a, r5
mov r5, a
xrl a, r6
mov r6, a

As for the workflow and the final register values: given the instructions, working through them with pen and paper is a good first learning step. Emulation is overkill here, but it pays off for larger code (Unicorn does not appear to support the 8051; a search engine will turn up alternatives).
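For a sequence this small, the pen-and-paper trace can also be checked with a few lines of Python. This is only a sketch of an emulator for the four instruction forms listed above, not a real 8051 model: flags, memory and timing are ignored, the registers are assumed to start at zero, and the names `HEX` and `run` are invented for the example.

```python
HEX = "785679107A247BFD7C347D407E51745568F869F96AFA6BFB6CFC6DFD6EFE"


def run(code: bytes) -> dict[str, int]:
    """Execute MOV Rn,#imm / MOV A,#imm / XRL A,Rn / MOV Rn,A and return the registers."""
    regs = [0] * 8  # R0..R7; starting at zero is an assumption of this sketch
    acc = 0         # accumulator A
    pc = 0
    while pc < len(code):
        op = code[pc]
        n = op & 0x07  # the low three bits select the register
        if 0x78 <= op <= 0x7F:    # MOV Rn,#imm
            regs[n] = code[pc + 1]
            pc += 2
        elif op == 0x74:          # MOV A,#imm
            acc = code[pc + 1]
            pc += 2
        elif 0x68 <= op <= 0x6F:  # XRL A,Rn: A = A xor Rn
            acc ^= regs[n]
            pc += 1
        elif 0xF8 <= op <= 0xFF:  # MOV Rn,A
            regs[n] = acc
            pc += 1
        else:
            raise ValueError(f"opcode 0x{op:02X} is not handled by this sketch")
    state = {f"R{i}": value for i, value in enumerate(regs)}
    state["A"] = acc
    return state


for name, value in run(bytes.fromhex(HEX)).items():
    print(f"{name} = 0x{value:02X}")
```

Each `xrl`/`mov` pair XORs the running accumulator into the next register, so the trace ends with R0=0x03, R1=0x13, R2=0x37, R3=0xCA, R4=0xFE, R5=0xBE, R6=0xEF and A=0xEF. R7 is never written.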

answered on Stack Overflow Jan 27, 2021 by slv

User contributions licensed under CC BY-SA 3.0