Building Self-Referencing Tuples

18

After coming across an old forum conversation that was never resolved, I began to wonder how one would correctly create a tuple that references itself. Technically, this is a very bad idea, since tuples are supposed to be immutable. How could an immutable object possibly contain itself? However, this question is not about best practices; it asks what is possible in Python.

import ctypes

def self_reference(array, index):
    if not isinstance(array, tuple):
        raise TypeError('array must be a tuple')
    if not isinstance(index, int):
        raise TypeError('index must be an int')
    if not 0 <= index < len(array):
        raise ValueError('index is out of range')
    address = id(array)
    obj_refcnt = ctypes.cast(address, ctypes.POINTER(ctypes.c_ssize_t))
    obj_refcnt.contents.value += 1
    if ctypes.cdll.python32.PyTuple_SetItem(ctypes.py_object(array),
                                            ctypes.c_ssize_t(index),
                                            ctypes.py_object(array)):
        raise RuntimeError('PyTuple_SetItem signaled an error')

The function above was designed to access Python's C API while keeping internal structures and datatypes in mind. However, the following error is usually generated when the function runs. Through unknown processes, it has been possible to create a self-referencing tuple with similar techniques before.

Question: How should the function self_reference be modified so that it works consistently?

>>> import string
>>> a = tuple(string.ascii_lowercase)
>>> self_reference(a, 2)
Traceback (most recent call last):
  File "<pyshell#56>", line 1, in <module>
    self_reference(a, 2)
  File "C:/Users/schappell/Downloads/srt.py", line 15, in self_reference
    ctypes.py_object(array)):
WindowsError: exception: access violation reading 0x0000003C
>>> 

Edit: Here are two conversations with the interpreter that are somewhat confusing. The code above appears to be correct if I understand the documentation correctly. However, the conversations below appear to conflict both with each other and with the self_reference function above.

Conversation 1:

Python 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]
on win32
Type "copyright", "credits" or "license()" for more information.
>>> from ctypes import *
>>> array = tuple(range(10))
>>> cast(id(array), POINTER(c_ssize_t)).contents.value
1
>>> cast(id(array), POINTER(c_ssize_t)).contents.value += 1
>>> cast(id(array), POINTER(c_ssize_t)).contents.value
2
>>> array
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> cdll.python32.PyTuple_SetItem(c_void_p(id(array)), 0,
                                  c_void_p(id(array)))
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    cdll.python32.PyTuple_SetItem(c_void_p(id(array)), 0,
                                  c_void_p(id(array)))
WindowsError: exception: access violation reading 0x0000003C
>>> cdll.python32.PyTuple_SetItem(c_void_p(id(array)), 0,
                                  c_void_p(id(array)))
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    cdll.python32.PyTuple_SetItem(c_void_p(id(array)), 0,
                                  c_void_p(id(array)))
WindowsError: exception: access violation reading 0x0000003C
>>> array
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>> cdll.python32.PyTuple_SetItem(c_void_p(id(array)), 0,
                                  c_void_p(id(array)))
0
>>> array
((<NULL>, <code object __init__ at 0x02E68C50, file
"C:\Python32\lib\tkinter\simpledialog.py", line 121>, <code object destroy at
0x02E68CF0, file "C:\Python32\lib\tkinter\simpledialog.py", line 171>, <code
object body at 0x02E68D90, file "C:\Python32\lib\tkinter\simpledialog.py",
line 179>, <code object buttonbox at 0x02E68E30, file
"C:\Python32\lib\tkinter\simpledialog.py", line 188>, <code object ok at
0x02E68ED0, file "C:\Python32\lib\tkinter\simpledialog.py", line 209>, <code
object cancel at 0x02E68F70, file "C:\Python32\lib\tkinter\simpledialog.py",
line 223>, <code object validate at 0x02E6F070, file
"C:\Python32\lib\tkinter\simpledialog.py", line 233>, <code object apply at
0x02E6F110, file "C:\Python32\lib\tkinter\simpledialog.py", line 242>, None),
1, 2, 3, 4, 5, 6, 7, 8, 9)
>>>

Conversation 2:

Python 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]
on win32
Type "copyright", "credits" or "license()" for more information.
>>> from ctypes import *
>>> array = tuple(range(10))
>>> cdll.python32.PyTuple_SetItem(c_void_p(id(array)), c_ssize_t(1),
                                  c_void_p(id(array)))
0
>>> array
(0, (...), 2, 3, 4, 5, 6, 7, 8, 9)
>>> array[1] is array
True
>>>

Tags: python, tuples, ctypes, creation, self-reference
asked on Stack Overflow Aug 8, 2012 by Noctis Skytower • edited Jan 13, 2015 by Noctis Skytower

5 Answers

9

Thanks to nneonneo's help, I settled on the following implementation of the self_reference method.

import ctypes

ob_refcnt_p = ctypes.POINTER(ctypes.c_ssize_t)

class GIL:
    acquire = staticmethod(ctypes.pythonapi.PyGILState_Ensure)
    release = staticmethod(ctypes.pythonapi.PyGILState_Release)

class Ref:
    dec = staticmethod(ctypes.pythonapi.Py_DecRef)
    inc = staticmethod(ctypes.pythonapi.Py_IncRef)

class Tuple:
    setitem = staticmethod(ctypes.pythonapi.PyTuple_SetItem)
    @classmethod
    def self_reference(cls, array, index):
        if not isinstance(array, tuple):
            raise TypeError('array must be a tuple')
        if not isinstance(index, int):
            raise TypeError('index must be an int')
        if not 0 <= index < len(array):
            raise ValueError('index is out of range')
        # PyGILState_Release must receive the state returned by PyGILState_Ensure.
        state = GIL.acquire()
        try:
            obj = ctypes.py_object(array)
            ob_refcnt = ctypes.cast(id(array), ob_refcnt_p).contents.value
            for _ in range(ob_refcnt - 1):
                Ref.dec(obj)
            if cls.setitem(obj, ctypes.c_ssize_t(index), obj):
                raise SystemError('PyTuple_SetItem was not successful')
            for _ in range(ob_refcnt):
                Ref.inc(obj)
        finally:
            GIL.release(state)

To create your own self-referencing tuples, use the method as shown in the example below.

>>> array = tuple(range(5))
>>> Tuple.self_reference(array, 1)
>>> array
(0, (...), 2, 3, 4)
>>> Tuple.self_reference(array, 3)
>>> array
(0, (...), 2, (...), 4)
>>> 
answered on Stack Overflow Aug 23, 2012 by Noctis Skytower • edited Aug 23, 2012 by Noctis Skytower
7

AFAICT, the reason you are seeing problems is that PyTuple_SetItem fails if the refcount of the tuple isn't exactly one. This prevents the function from being used if the tuple has already been used elsewhere. I'm not sure why you get an access violation from that, but it may be because the exception raised by PyTuple_SetItem isn't properly dealt with. Furthermore, the array seems to mutate into some other object because PyTuple_SetItem DECREFs the new item on each failure, and here the new item is the tuple itself; after two failures, the refcount is zero, so the object is freed (and some other object apparently ends up in the same memory location).

Using the pythonapi object in ctypes is the preferred way to get access to the Python DLL, as it handles Python exceptions properly and is guaranteed to use the correct calling convention.
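
The failure mode described above can be observed safely through ctypes.pythonapi. The following is a minimal sketch, not part of the original answer: the helper name try_set_item is hypothetical, and it assumes a CPython build where PyTuple_SetItem reports a tuple with more than one reference by raising SystemError. Because the function consumes a reference to the new item even when it fails, the sketch hands it an extra reference first.

```python
import ctypes

def try_set_item(tup, index, new_item):
    """Hypothetical helper: call PyTuple_SetItem via ctypes.pythonapi.

    Returns None on success, or the error text if the call was rejected.
    """
    # PyTuple_SetItem consumes one reference to new_item, even on failure,
    # so give it a reference of its own before the call.
    ctypes.pythonapi.Py_IncRef(ctypes.py_object(new_item))
    try:
        ctypes.pythonapi.PyTuple_SetItem(ctypes.py_object(tup),
                                         ctypes.c_ssize_t(index),
                                         ctypes.py_object(new_item))
    except SystemError as exc:
        # pythonapi checks the interpreter's error indicator after the call
        # and re-raises it as an ordinary Python exception.
        return str(exc)
    return None

array = tuple(range(10))
marker = object()
# The tuple is referenced by the name `array` and by the py_object wrapper,
# so its refcount is not one and the call is expected to be rejected.
error = try_set_item(array, 0, marker)
print(error)
print(array)
```

With cdll.python32, as in the question, the same rejection is not turned into a Python exception, which is consistent with the crash and the corrupted tuple seen there.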

I don't have a Windows machine handy to test this out, but the following works fine on Mac OS X (both Python 2.7.3 and 3.2.2):

import ctypes

def self_reference(array, index):
    # Sanity check. We can't let PyTuple_SetItem fail, or it will Py_DECREF
    # the object and destroy it.
    if not isinstance(array, tuple):
        raise TypeError("array must be a tuple")

    if not 0 <= index < len(array):
        raise IndexError("tuple assignment index out of range")

    arrayobj = ctypes.py_object(array)

    # Need to drop the refcount to 1 in order to use PyTuple_SetItem.
    # Needless to say, this is incredibly dangerous.
    # Py_DecRef returns nothing, so read the current count from ob_refcnt,
    # assumed here to be the first field of the object header.
    refcnt = ctypes.c_ssize_t.from_address(id(array)).value
    for i in range(refcnt-1):
        ctypes.pythonapi.Py_DecRef(arrayobj)

    try:
        ret = ctypes.pythonapi.PyTuple_SetItem(arrayobj, ctypes.c_ssize_t(index), arrayobj)
        if ret != 0:
            raise RuntimeError("PyTuple_SetItem failed")
    except Exception:
        raise SystemError("FATAL: PyTuple_SetItem failed: tuple probably unusable")

    # PyTuple_SetItem stole the one remaining reference for the new
    # self-reference, so restore the refcnt references that existed before.
    for i in range(refcnt):
        ctypes.pythonapi.Py_IncRef(arrayobj)

Result:

>>> x = (1,2,3,4,5)
>>> self_reference(x, 1)
>>> import pprint
>>> pprint.pprint(x)
(1, <Recursion on tuple with id=4299516720>, 3, 4, 5)
answered on Stack Overflow Aug 22, 2012 by nneonneo
3

A simpler solution:

import ctypes
tup = (0,)
ctypes.c_longlong.from_address(id(tup)+24).value = id(tup)

Result:

>>> tup
((...),)
>>> type(tup)
tuple
>>> tup[0] is tup
True
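
The hard-coded offset of 24 in the snippet above matches a 64-bit CPython build where the tuple header is three 8-byte fields. As a rough sketch of the same direct-write technique without the magic number, the offset can be derived from tuple.__basicsize__ and tuple.__itemsize__. The helper name overwrite_tuple_slot is hypothetical, and the sketch assumes CPython, where id() returns the object's address and the items are stored immediately after the fixed-size header. Unlike the one-liner, it also adds a reference for the new slot; the previous item is simply leaked.

```python
import ctypes

def overwrite_tuple_slot(tup, index, value):
    """Hypothetical helper: store value in tup[index] by writing the pointer directly."""
    if type(tup) is not tuple:
        raise TypeError("tup must be a plain tuple")
    if not 0 <= index < len(tup):
        raise IndexError("tuple index out of range")
    # Assumption: items start right after the fixed-size header, whose size
    # is tuple.__basicsize__, and each item pointer is tuple.__itemsize__ bytes.
    offset = tuple.__basicsize__ + index * tuple.__itemsize__
    slot = ctypes.c_void_p.from_address(id(tup) + offset)
    # The slot will own a reference to value, so add one before storing it.
    ctypes.pythonapi.Py_IncRef(ctypes.py_object(value))
    # The previous item is deliberately leaked rather than released.
    slot.value = id(value)

tup = tuple(range(3))
overwrite_tuple_slot(tup, 1, tup)
print(tup[1] is tup)
```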
answered on Stack Overflow Apr 17, 2020 by wim
2

Technically, you could wrap the reference to the tuple inside a mutable object.

>>> c = ([],)
>>> c[0].append(c)
>>> c
([(...)],)
>>> c[0]
[([...],)]
>>> 
answered on Stack Overflow Aug 22, 2012 by Nick ODell
2

Immutability should not prevent an object from referencing itself. This is easy to do in Haskell because it has lazy evaluation. Here is an imitation that does that by using a thunk:

>>> def self_ref_tuple():
    a = (1, 2, lambda: a)
    return a

>>> ft = self_ref_tuple()
>>> ft
(1, 2, <function <lambda> at 0x02A7C330>)
>>> ft[2]()
(1, 2, <function <lambda> at 0x02A7C330>)
>>> ft[2]() is ft
True

This is not a complete answer, just a preliminary one. I am still working out whether there's another way to make this possible.

answered on Stack Overflow Aug 22, 2012 by Claudiu

User contributions licensed under CC BY-SA 3.0