How to jump to / call arbitrary memory in Rust


I am experimenting with building a JIT that executes x86 instructions that the program produces. I think I have created a valid slice of x86 byte code that should print "Hello World", but I am not sure how to call it.

I am casting a pointer to the start of a vector to a void function and calling it:

fn main() {
    let msg: &[u8] = b"Hello World\0";

    let mut byte_codes: Vec<u8> = Vec::with_capacity(1000);

    // Move width into edx
    byte_codes.extend_from_slice(&[0xba, msg.len() as u8, 0, 0, 0]);
    
    // Msg to write
    byte_codes.push(0xb9);
    byte_codes.extend_from_slice(&(msg.as_ptr() as u64).to_be_bytes());
    
    // File descriptor and sys call
    byte_codes.extend_from_slice(&[0xbb, 0x01, 0, 0, 0]);
    byte_codes.extend_from_slice(&[0xb8, 0x04, 0, 0, 0]);
    
    // Sys call
    byte_codes.extend_from_slice(&[0xcd, 0x80]);

    // Return
    byte_codes.push(0xc3); 

    let func_ptr = byte_codes.as_ptr();
    unsafe {
        let func: fn() -> () = func_ptr.cast::<fn() -> ()>().read();
        func();
    }
}

Executing this returns:

error: process didn't exit successfully: `target\debug\run-bytecode.exe` (exit code: 0xc0000005, STATUS_ACCESS_VIOLATION)

Removing every instruction except the final return (0xc3) leads to the same error.

I'm not sure what that error means. Is there an issue with the bytecode, or is my function cast incorrect? How can I get it to print "Hello World"?

assembly
rust
x86
jit
asked on Stack Overflow Feb 17, 2021 by kaleidawave • edited Feb 17, 2021 by Shepmaster

1 Answer


Here's a version that works:

use memmap::MmapMut;

fn main() {
    let msg: &[u8] = b"Hello World\0";

    let mut byte_codes: Vec<u8> = Vec::with_capacity(1000);

    // edx = number of bytes to write (msg.len(), including the trailing NUL)
    byte_codes.extend_from_slice(&[0xba, msg.len() as u8, 0, 0, 0]);

    // ecx = address of the message, as a 32-bit little-endian immediate
    byte_codes.push(0xb9);
    byte_codes.extend_from_slice(&(msg.as_ptr() as u32).to_le_bytes());

    // ebx = file descriptor 1 (stdout), eax = syscall number 4 (write)
    byte_codes.extend_from_slice(&[0xbb, 0x01, 0, 0, 0]);
    byte_codes.extend_from_slice(&[0xb8, 0x04, 0, 0, 0]);

    // int 0x80: invoke the 32-bit Linux system call
    byte_codes.extend_from_slice(&[0xcd, 0x80]);

    // ret: return to the caller
    byte_codes.push(0xc3);

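    // Copy the code into an anonymous mapping, then make that mapping executable
    let mut m = MmapMut::map_anon(byte_codes.len()).unwrap();
    m.clone_from_slice(&byte_codes);
    let m = m.make_exec().unwrap();
    let func_ptr = m.as_ptr();
    unsafe {
        // Reinterpret the address of the mapping as a C-ABI function and call it
        let func: extern "C" fn() = std::mem::transmute(func_ptr);
        func();
    }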
}
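
As a rough sketch that is not part of the original answer: the mov instructions in the listing above all follow one encoding pattern, opcode 0xB8 plus the register number, followed by the 32-bit immediate in little-endian order. The Reg32 enum and the mov_imm32 helper are hypothetical names introduced here for illustration, and 0x12345678 is a made-up stand-in for a real message address.

```rust
/// 32-bit registers, numbered the way the x86 instruction encoding numbers them.
#[derive(Clone, Copy)]
enum Reg32 {
    Eax = 0,
    Ecx = 1,
    Edx = 2,
    Ebx = 3,
}

/// Encode `mov r32, imm32`: opcode 0xB8 + register number, then the
/// immediate as four little-endian bytes. (Hypothetical helper.)
fn mov_imm32(reg: Reg32, imm: u32) -> [u8; 5] {
    let [b0, b1, b2, b3] = imm.to_le_bytes();
    [0xb8 + reg as u8, b0, b1, b2, b3]
}

fn main() {
    let mut code: Vec<u8> = Vec::new();
    code.extend_from_slice(&mov_imm32(Reg32::Edx, 12)); // byte count
    code.extend_from_slice(&mov_imm32(Reg32::Ecx, 0x1234_5678)); // made-up address
    code.extend_from_slice(&mov_imm32(Reg32::Ebx, 1)); // fd 1 = stdout
    code.extend_from_slice(&mov_imm32(Reg32::Eax, 4)); // syscall 4 = write
    println!("{:02x?}", code);
}
```

The helper only builds bytes and never executes them, so it can be checked on any platform.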

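Several things needed to be fixed:

  1. byte_codes is 32-bit x86 Linux code (it makes its system call with int 0x80), so it has to be built and run as a 32-bit Linux program, for example with cargo run --target i686-unknown-linux-gnu. It will not work in the Windows build shown in the question.
  2. Because it is 32-bit code, the immediate loaded into ecx is 4 bytes wide, so msg.as_ptr() is cast to u32 rather than u64.
  3. x86 is little-endian, so the address is written with .to_le_bytes() rather than .to_be_bytes().
  4. func_ptr.cast::<fn() -> ()>().read() does not turn func_ptr into a function pointer. It treats func_ptr as a pointer to a function pointer and reads the first 4 or 8 bytes of byte_codes as the address to call. std::mem::transmute(func_ptr) uses the address of the buffer itself.
  5. Declaring the type as extern "C" fn() tells Rust to call the code with the C ABI rather than the unspecified Rust ABI.
  6. The memory backing a Vec is normally not executable, so the code is copied into an anonymous mapping from the memmap crate, which make_exec() then marks as executable.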
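
The difference described in point 4 can be observed without executing anything. The sketch below is an illustration rather than the original code: it uses usize in place of a function pointer so that nothing is ever called, and read_unaligned so the read is valid at any address. The function names and the eight sample bytes are made up for the example.

```rust
use std::mem::size_of;

/// Roughly what `ptr.cast::<fn()>().read()` does: load a pointer-sized
/// value *from* the buffer and treat that value as the address.
fn read_as_address(buf: &[u8]) -> usize {
    assert!(buf.len() >= size_of::<usize>());
    // SAFETY: the length was checked above, and read_unaligned has no
    // alignment requirement.
    unsafe { buf.as_ptr().cast::<usize>().read_unaligned() }
}

/// What the transmute in the answer uses: the address of the buffer itself.
fn buffer_address(buf: &[u8]) -> usize {
    buf.as_ptr() as usize
}

fn main() {
    // Stand-in for generated code; these bytes are never executed.
    let code: [u8; 8] = [0xba, 0x0c, 0x00, 0x00, 0x00, 0xb9, 0x11, 0x22];
    println!("value read from the buffer: {:#x}", read_as_address(&code));
    println!("address of the buffer:      {:#x}", buffer_address(&code));
}
```

The first number is just the leading instruction bytes reinterpreted as an integer, which is why calling it as a function faults regardless of what the rest of the buffer contains.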
answered on Stack Overflow Feb 17, 2021 by Jeff Muizelaar • edited Feb 18, 2021 by vallentin

User contributions licensed under CC BY-SA 3.0