How to bruteforce a lossy AND routine?

Question

I'm wondering whether there are any standard approaches to reversing AND routines by brute force. For example, I have the following transformation:

MOV(eax, 0x5b3e0be0)  # Move 0x5b3e0be0 into EAX.
MOV(edx, eax)  # Copy 0x5b3e0be0 into EDX as well.
SHL(edx, 0x7)  # Shift EDX left by 7 bits, which results in 0x9f05f000.
AND(edx, 0x9d2c5680)  # AND 0x9f05f000 with 0x9d2c5680, which results in 0x9d045000.
XOR(edx, eax)  # XOR 0x9d045000 with the original value 0x5b3e0be0, which results in 0xc63a5be0.
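
The arithmetic in the comments above can be checked with a few lines of Python. This is only a sketch: the names MASK32, shifted, masked and result are made up for illustration, and the explicit mask emulates the truncation that SHL performs on a 32-bit register.

```python
K = 0x9D2C5680
MASK32 = 0xFFFFFFFF  # Python ints are unbounded, so truncate to 32 bits by hand

eax = 0x5B3E0BE0
shifted = (eax << 7) & MASK32  # SHL(edx, 0x7)
masked = shifted & K           # AND(edx, 0x9d2c5680)
result = masked ^ eax          # XOR(edx, eax)

print(hex(shifted), hex(masked), hex(result))
# prints: 0x9f05f000 0x9d045000 0xc63a5be0
```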

My question is how to brute-force and reverse this routine (i.e. transform 0xc63a5be0 back into 0x5b3e0be0).

One idea I had (which didn't work) was this, using a PeachPy implementation:

# Input values
MOV(esi, 0xffffffff)  # Initial value to AND with, which will be decreased by 1 in a loop.
MOV(cl, 0x1)  # Initial shift count for SHR, which will be increased by 1 until 0x1f.
MOV(eax, 0xc63a5be0)  # Target result which I'm looking to get using the loop below.
MOV(edx, 0x5b3e0be0)  # Input value which will be transformed.

sub_esi = peachpy.x86_64.Label()
with Loop() as loop:
    #End the loop if ESI = 0x0
    TEST(esi, esi)
    JZ(loop.end)
    #Test the routine and check if it matches end result.
    MOV(ebx, eax)
    SHR(ebx, cl)
    TEST(ebx, ebx)
    JZ(sub_esi)
    AND(ebx, esi)
    XOR(ebx, eax)
    CMP(ebx, edx)
    JZ(loop.end)
    #Add to the CL register which is used for SHR.
    #Also check if we've reached the last potential value of CL which is 0x1f
    ADD(cl, 0x1)
    CMP(cl, 0x1f)
    JNZ(loop.begin)

    #Decrement ESI by 1, reset CL and restart routine.
    peachpy.x86_64.LABEL(sub_esi)
    SUB(esi, 0x1)
    MOV(cl, 0x1)
    JMP(loop.begin)

#The ESI result here will either be 0x0 or a valid value to AND with and get the necessary result.
RETURN(esi)

Maybe an article or a book you can recommend specific to this?

math

assembly

x86

reverse-engineering

asked on Stack Overflow Aug 19, 2020 by Involar • edited Aug 21, 2020 by Peter Cordes

2 Answers

It's not lossy: the final operation is an XOR.
The whole routine can be modeled in C as

#define K 0x9d2c5680
uint32_t hash(uint32_t num)
{
  return num ^ ( (num << 7) & K);
}

Now, if we have two bits x and y and the operation x XOR y, then when y is zero the result is x.
So given two numbers n1 and n2 and considering their XOR, the bits of n1 that pair with a zero in n2 make it to the result unchanged (the others are flipped where n2 has a one).

So in num ^ ( (num << 7) & K) we can identify num with n1 and (num << 7) & K with n2.
Since n2 is the result of an AND with K, it must have at least the same zero bits that K has.
This means that each bit of num that corresponds to a zero bit in the constant K makes it into the result unchanged.
Thus, by extracting those bits from the result we already have a partial inverse function:

/*hash & ~K extracts the bits of hash that pair with a zero bit in K*/
partial_num = hash & ~K
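
As a quick sanity check of this partial inverse, the following Python sketch compares the extracted bits against the known input from the question. The names MASK32, h and unknown_bits are illustrative only, and the extra mask is needed because ~K is negative for Python's unbounded integers.

```python
K = 0x9D2C5680
MASK32 = 0xFFFFFFFF

num = 0x5B3E0BE0
h = num ^ ((num << 7) & K)    # forward transformation

partial_num = h & ~K & MASK32  # bits of h that pair with a zero bit in K
unknown_bits = bin(K).count("1")

# The extracted bits agree with the original input at every zero bit of K.
print(hex(partial_num), partial_num == num & ~K & MASK32, unknown_bits)
# prints: 0x42120960 True 13
```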

Technically, the operand num << 7 also introduces other zeros in the result of the AND: we know for sure that its lowest 7 bits are zero.
However, K already has its lowest 7 bits zero, so we cannot exploit this information.
So we will just use K here, but if its value were different you'd need to take the shift into account (which, in practice, means zeroing the lower 7 bits of K).
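
A small sketch of that adjustment, assuming a helper named effective_mask (hypothetical, not part of the original answer) and a made-up constant 0x9D2C56FF whose low bits are set:

```python
MASK32 = 0xFFFFFFFF
K = 0x9D2C5680

def effective_mask(k, shift=7):
    """Clear the low `shift` bits of k, since (num << shift) is zero there."""
    return k & (MASK32 << shift) & MASK32

# For the K in the question nothing changes, because its low 7 bits are already zero.
print(hex(effective_mask(K)))           # prints: 0x9d2c5680
# A hypothetical constant with low bits set is reduced to the same mask.
print(hex(effective_mask(0x9D2C56FF)))  # prints: 0x9d2c5680
```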

This leaves us with 13 unknown bits (the ones corresponding to the bits that are set in K). If we forget about the AND for a moment, we have h = num ^ (num << 7), meaning that

h_i = num_i for i from 0 to 6 inclusive
h_i = num_i ^ num_(i-7) for i from 7 to 31 inclusive
(The first line is due to the fact that the lower 7 bits of the right-hand operand, num << 7, are zero.)

From this, starting from h₇ and going up, we can retrieve num₇ as h₇ ^ num₀ = h₇ ^ h₀.
From bit 14 onward, the equality num_(i-7) = h_(i-7) no longer holds and we need to use num_(i-7) itself, but luckily we have already computed its value in a previous step (that's why we go from lower to higher bits).

All the AND does is restrict the values the index i runs over, specifically to the bits that are set in K.

So to fill in the thirteen remaining bits one has to do:

part_num₇ = h₇ ^ part_num₀
part_num₉ = h₉ ^ part_num₂
part_num₁₀ = h₁₀ ^ part_num₃
part_num₁₂ = h₁₂ ^ part_num₅
...
part_num₃₁ = h₃₁ ^ part_num₂₄

Note that we exploited the fact that part_num_0..6 = h_0..6.
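
The same steps can be written as a loop over the set bits of K instead of thirteen explicit lines. This is a minimal Python sketch of the idea, not the answer's own code; the names transform, invert and MASK32 are illustrative.

```python
K = 0x9D2C5680
MASK32 = 0xFFFFFFFF

def transform(num, k=K, shift=7):
    """Forward routine: num ^ ((num << shift) & k), kept to 32 bits."""
    return (num ^ ((num << shift) & k)) & MASK32

def invert(h, k=K, shift=7):
    """Rebuild num from h bit by bit, from low bits to high bits."""
    k &= (MASK32 << shift) & MASK32  # low bits of k can never take effect
    num = h & ~k & MASK32            # bits at zero positions of k pass through
    for i in range(shift, 32):
        if (k >> i) & 1:
            # num_i = h_i ^ num_(i-shift); bit i-shift is already final here
            bit = ((h >> i) ^ (num >> (i - shift))) & 1
            num |= bit << i
    return num

print(hex(invert(0xC63A5BE0)))  # prints: 0x5b3e0be0
```

Going from low to high matters: each recovered bit may be needed seven positions further up.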

Here's a C program that inverts the function:

#include <stdio.h>
#include <stdint.h>

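/* BIT(i, ...) yields bit i+7 of the input: bit i+7 of the hash XOR bit i of the partial result. */
#define BIT(i, hash, result) ( ((((result) >> (i)) ^ ((hash) >> ((i)+7))) & 0x1) << ((i)+7) )
#define K 0x9d2c5680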

uint32_t base_candidate(uint32_t hash)
{
  uint32_t result = hash & ~K;

  result |= BIT(0, hash, result);
  result |= BIT(2, hash, result);
  result |= BIT(3, hash, result);
  result |= BIT(5, hash, result);
  result |= BIT(7, hash, result);
  result |= BIT(11, hash, result);
  result |= BIT(12, hash, result);
  result |= BIT(14, hash, result);
  result |= BIT(17, hash, result);
  result |= BIT(19, hash, result);
  result |= BIT(20, hash, result);
  result |= BIT(21, hash, result);
  result |= BIT(24, hash, result);

  return result;
}

uint32_t hash(uint32_t num)
{
  return num ^ ( (num << 7) & K);
}

int main(void)
{
  uint32_t tester = 0x5b3e0be0;
  uint32_t candidate = base_candidate(hash(tester));

  /* Both values should be equal if the inversion succeeded. */
  printf("candidate: %x, tester %x\n", candidate, tester);

  return 0;
}
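
For comparison, here is a different technique from the bit-by-bit one above: repeatedly substituting the current guess back into the formula. Each pass fixes 7 more low bits, so starting from the hash itself (whose low 7 bits are already correct) four passes cover all 32 bits. This is a Python sketch under that reasoning; the names transform and invert_iterative are made up for illustration.

```python
K = 0x9D2C5680
MASK32 = 0xFFFFFFFF

def transform(num):
    return (num ^ ((num << 7) & K)) & MASK32

def invert_iterative(h, passes=4):
    """Solve x = h ^ ((x << 7) & K) by repeated substitution."""
    x = h  # bits 0..6 of h already equal bits 0..6 of the input
    for _ in range(passes):
        # after each pass, 7 more low bits of x are correct
        x = (h ^ ((x << 7) & K)) & MASK32
    return x

print(hex(invert_iterative(0xC63A5BE0)))  # prints: 0x5b3e0be0
```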

answered on Stack Overflow Aug 19, 2020 by Margaret Bloom

Since the original question was how to "bruteforce" rather than solve, here's something that I eventually came up with, which works just as well. Obviously it's prone to errors depending on the input (there might be more than one result).

import peachpy.x86_64  # makes the peachpy.x86_64.abi reference below explicit
from peachpy import *
from peachpy.x86_64 import *

input = 0xc63a5be0

x = Argument(uint32_t)
with Function("DotProduct", (x,), uint32_t) as asm_function:
    LOAD.ARGUMENT(edx, x)  # EDX = target value (0xc63a5be0 for the input above)
    MOV(esi, 0xffffffff)

    with Loop() as loop:
        TEST(esi,esi)
        JZ(loop.end)
        MOV(eax, esi)
        SHL(eax, 0x7)
        AND(eax, 0x9d2c5680)
        XOR(eax, esi)
        CMP(eax, edx)
        JZ(loop.end)
        SUB(esi, 0x1)
        JMP(loop.begin)
    RETURN(esi)

#Read Assembler Return
abi = peachpy.x86_64.abi.detect()
encoded_function = asm_function.finalize(abi).encode()
python_function = encoded_function.load()
print(hex(python_function(input)))
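
The search space can also be cut down considerably while still brute-forcing. Since only the bits of the input that line up with set bits of 0x9d2c5680 are changed by the routine, it is enough to try the 2^13 combinations of those bits. Below is a plain-Python sketch of that idea (not PeachPy); the names transform and brute_force are illustrative. It also collects every match, which shows how many results exist for a given target.

```python
K = 0x9D2C5680
MASK32 = 0xFFFFFFFF

def transform(num):
    return (num ^ ((num << 7) & K)) & MASK32

def brute_force(target):
    """Try every combination of the bits selected by K; return all matches."""
    known = target & ~K & MASK32  # bits that pass through unchanged
    matches = []
    sub = K
    while True:
        candidate = known | sub
        if transform(candidate) == target:
            matches.append(candidate)
        if sub == 0:
            break
        sub = (sub - 1) & K       # next smaller subset of K's set bits
    return matches

print([hex(m) for m in brute_force(0xC63A5BE0)])  # prints: ['0x5b3e0be0']
```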

answered on Stack Overflow Aug 21, 2020 by Involar • edited Aug 21, 2020 by Involar

User contributions licensed under CC BY-SA 3.0