Numpy Memmap Ctypes Access

I'm trying to work with a very large NumPy array through numpy.memmap, accessing each element as a ctypes Structure.

import ctypes
from ctypes import Structure, c_uint32

import numpy as np

class My_Structure(Structure):
    _fields_ = [('field1', c_uint32, 3),
                ('field2', c_uint32, 2),
                ('field3', c_uint32, 2),
                ('field4', c_uint32, 9),
                ('field5', c_uint32, 12),
                ('field6', c_uint32, 2),
                ('field7', c_uint32, 2)]

    def __str__(self):
        return f'MyStruct -- f1{self.field1} f2{self.field2} f3{self.field3} f4{self.field4} f5{self.field5} f6{self.field6} f7{self.field7}'

    def __eq__(self, other):
        for field in self._fields_:
            if getattr(self, field[0]) != getattr(other, field[0]):
                return False
        # Only report equality after every field has been compared.
        return True

_big_array = np.memmap(filename='big_file.data',
                       dtype='uint32',
                       mode='w+',
                       shape=1000000)

big_array = _big_array.ctypes.data_as(ctypes.POINTER(My_Structure))

big_array[0].field1 = 5
...

It seems to work correctly, but I'm getting a fault on a 64-bit Windows machine where python.exe simply stops. In Event Viewer, I see that the faulting module name is _ctypes.pyd and the exception code is 0xc0000005, which I believe is an access violation.

I don't seem to be getting the same error on Linux, though my testing has not been thorough.

My questions are:

  1. Does my access look correct, i.e., am I using numpy.memmap.ctypes.data_as correctly?

  2. Does the fact that I have methods (__str__ and __eq__) defined on My_Structure change its size, i.e., can it still be used in the array as a uint32?

  3. Is there anything that you think might cause this behavior, particularly considering the differences between Windows and Linux?

EDIT:

  1. Using ctypes.addressof and ctypes.sizeof on big_array elements, it looks like defining __str__ and __eq__ does not change the size of My_Structure.

  2. I added some asserts before my access to big_array and found that I was attempting to access big_array[-1], which explains the access error and crash.
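
The size check from item 1 can be reproduced with a small sketch like the one below. The class names Bare and WithMethods are made up for illustration; the field list is the one from the question. The seven bit widths add up to 32, so the structure should occupy exactly one uint32 whether or not it has methods.

```python
import ctypes
from ctypes import Structure, c_uint32

import numpy as np

# Field list copied from the question; the widths sum to 32 bits.
FIELDS = [('field1', c_uint32, 3),
          ('field2', c_uint32, 2),
          ('field3', c_uint32, 2),
          ('field4', c_uint32, 9),
          ('field5', c_uint32, 12),
          ('field6', c_uint32, 2),
          ('field7', c_uint32, 2)]


class Bare(Structure):
    # Hypothetical name: the structure with no methods at all.
    _fields_ = FIELDS


class WithMethods(Structure):
    # Hypothetical name: same layout, plus a Python-level method.
    _fields_ = FIELDS

    def __str__(self):
        return f'WithMethods -- f1{self.field1}'


total_bits = sum(width for _, _, width in FIELDS)
print(total_bits, ctypes.sizeof(Bare), ctypes.sizeof(WithMethods))
print(np.dtype('uint32').itemsize)
```

Methods live on the Python class object, not in the C memory layout, which is why they do not affect ctypes.sizeof.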

That leaves question 1 open: it looks like my code is technically correct, but I'm wondering if there is a better way to access the numpy array than through a ctypes pointer, so that I still get the benefits of using a numpy array (out-of-bounds errors, negative index wrapping, etc.). Daniel below suggested using a structured numpy array, but is it possible to do bitfield access with this?

python
numpy
ctypes
numpy-memmap
asked on Stack Overflow Dec 14, 2017 by sheridp • edited Dec 14, 2017 by sheridp

1 Answer

You can cast to ctypes at the last step, not the first step:

_big_array[0, ...].ctypes.data_as(ctypes.POINTER(My_Structure)).contents.field1 = 5

Note that the ... is needed to keep the result as a 0-d array rather than a NumPy scalar, so that the .ctypes attribute exists.
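
To illustrate the difference, here is a minimal sketch using a small in-memory array as a stand-in for the memmap (the variable names are made up for the example). Indexing with a plain integer gives a detached NumPy scalar, while adding ... gives a 0-d array that is still a view onto the original memory.

```python
import numpy as np

a = np.zeros(4, dtype=np.uint32)  # stand-in for the memmap

scalar = a[0]      # NumPy scalar: a copy of the value, detached from `a`
view = a[0, ...]   # 0-d ndarray: still a view onto a's memory

view[...] = 7      # writes through to a[0]

print(type(scalar).__name__, type(view).__name__, view.ndim)
print(a)
# The 0-d view points at the same address as the start of `a`.
print(view.ctypes.data == a.ctypes.data)
```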

Now of course, negative indexing will work just fine:

_big_array[-1, ...].ctypes.data_as(ctypes.POINTER(My_Structure)).contents.field1 = 5
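
If the cast is needed in many places, it could be wrapped in a small helper. The sketch below is one possible variation, not the approach above verbatim: struct_at is a hypothetical name, and it uses My_Structure.from_buffer instead of data_as. It is shown on a small in-memory array; a writable memmap should behave the same way, though that is not verified here.

```python
import ctypes
from ctypes import Structure, c_uint32

import numpy as np


class My_Structure(Structure):
    # Field list copied from the question.
    _fields_ = [('field1', c_uint32, 3),
                ('field2', c_uint32, 2),
                ('field3', c_uint32, 2),
                ('field4', c_uint32, 9),
                ('field5', c_uint32, 12),
                ('field6', c_uint32, 2),
                ('field7', c_uint32, 2)]


def struct_at(arr, index):
    """Hypothetical helper: view element `index` of a 1-D uint32 array
    as a My_Structure that shares the array's memory."""
    # range() applies the usual sequence rules: negative indices wrap
    # and out-of-range indices raise IndexError.
    i = range(len(arr))[index]
    # from_buffer needs a writable, contiguous buffer; the offset is in bytes.
    return My_Structure.from_buffer(arr, i * arr.itemsize)


arr = np.zeros(8, dtype=np.uint32)  # stand-in for the memmap
struct_at(arr, -1).field1 = 5       # wraps to the last element
print(struct_at(arr, 7).field1)
```

An index such as 8 or -9 raises IndexError here instead of touching memory outside the array.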

Daniel below suggested using a structured numpy array, but is it possible to do bitfield access with this?

No. NumPy structured dtypes have no bit-field members; the smallest field is a whole byte.
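
If staying entirely in NumPy matters, the fields can be read and written with shifts and masks on the plain uint32 array instead. The following is a rough sketch, not a drop-in replacement: get_field, set_field and LAYOUT are hypothetical names, and it assumes fields are packed starting from the least significant bit, which is typical on little-endian x86 but should be checked against ctypes on the target platform.

```python
import numpy as np

# (name, width in bits), in the order declared in the question.
WIDTHS = [('field1', 3), ('field2', 2), ('field3', 2), ('field4', 9),
          ('field5', 12), ('field6', 2), ('field7', 2)]

# ASSUMPTION: fields are allocated from the least significant bit upward.
LAYOUT = {}
_shift = 0
for _name, _width in WIDTHS:
    LAYOUT[_name] = (_shift, (1 << _width) - 1)
    _shift += _width


def get_field(arr, name):
    """Return the named field for every element (vectorized)."""
    shift, mask = LAYOUT[name]
    return (arr >> np.uint32(shift)) & np.uint32(mask)


def set_field(arr, index, name, value):
    """Set the named field of one element; NumPy does the bounds checking."""
    shift, mask = LAYOUT[name]
    if not 0 <= value <= mask:
        raise ValueError(f'{value} does not fit in {name}')
    # Work in Python ints to avoid surprises from NumPy type promotion.
    cleared = int(arr[index]) & ~(mask << shift) & 0xFFFFFFFF
    arr[index] = cleared | (value << shift)


arr = np.zeros(4, dtype=np.uint32)  # stand-in for the memmap
set_field(arr, -1, 'field1', 5)     # negative index wraps as usual
set_field(arr, -1, 'field4', 300)
print(get_field(arr, 'field1'), get_field(arr, 'field4'))
```

This keeps negative indexing and IndexError from NumPy, and reads of a field across the whole array are vectorized.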

answered on Stack Overflow Nov 19, 2018 by Eric

User contributions licensed under CC BY-SA 3.0