Debugging tips
Debugging on Linux
- Cython provides a Cython-aware gdb frontend, cygdb.
- However, gdb and cygdb are only practically useful on Linux, because gdb does not work well on newer versions of macOS.
Debugging on macOS
- It is reasonably practical to single-step debug small sections of the Cython-generated C++ code. Some familiarity with the CPython object model is very helpful here.
- The Cython option `Cython.Compiler.Options.emit_code_comments` controls whether Cython emits a copy of the source code into the output C++ file; this is on by default and should be enabled for debugging. Each line of C++ code will be preceded by a commented-out version of the source Cython code.
- Each context block in the generated C++ will have the corresponding line number in the original Cython code. So, start from a Cython line number, find that block, and set a breakpoint at the line below the context comment in the generated `libtiledb.cpp`.
- In order to see all of the Python code corresponding to the C++ code while single-stepping, it is recommended to increase the lldb code-listing verbosity:

      (lldb) settings set stop-line-count-before 8
- Start the Python interpreter under lldb and run a command which will invoke the targeted section of Cython/C++ code, or run a script (potentially with arguments). Assuming LINENO in `libtiledb.cpp` as per above:

      $ lldb -- python -i MYSCRIPT.py
      (lldb) b libtiledb.cpp:LINENO
      (lldb) run
      >>> import tiledb
      >>> [run command to trigger breakpoint, then step, view values, etc.]

- To print Cython `PyObject*` variables in the debugger, install the following LLDB script: https://github.com/malor/cpython-lldb

  Then, within a `libtiledb.cpp` frame:
  - individual `PyObject*` variables should pretty-print with `p`, for example: `p __pyx_v_uri`
  - the LLDB command `frame variable` will show known variables in the frame

- Ideally, the Cython code will have primitive types which can be printed with the usual lldb `p(rint)` command. However, to print the contents of a `PyObject*` inside the debugger, see the following discussion; these commands may be called in the debugger:
  - https://stackoverflow.com/questions/5356773/python-get-string-representation-of-pyobject
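The manual search for a breakpoint location described above (Cython line number, then context comment, then the first C++ line below it) can be scripted. The following is a minimal sketch and not part of TileDB-Py: `find_breakpoint_lines` is a hypothetical helper, and it assumes that the context comments emitted by Cython open with `/* "path/to/file.pyx":LINENO` and close with `*/` on a later line. Check a few comments in your own generated file before relying on it.

```python
import re

# Assumed shape of a Cython context comment opener: /* "pkg/module.pyx":123
CONTEXT_RE = re.compile(r'/\*\s*"(?P<src>[^"]+)":(?P<line>\d+)')


def find_breakpoint_lines(cpp_lines, pyx_name, pyx_lineno):
    """Return 1-based C++ line numbers that directly follow a matching context comment."""
    hits = []
    i = 0
    while i < len(cpp_lines):
        m = CONTEXT_RE.search(cpp_lines[i])
        if m and m.group("src").endswith(pyx_name) and int(m.group("line")) == pyx_lineno:
            # Walk forward to the line that closes the comment.
            j = i
            while j < len(cpp_lines) and "*/" not in cpp_lines[j]:
                j += 1
            if j + 1 < len(cpp_lines):
                # Index j + 1 is the line after the comment; add 1 for 1-based numbering.
                hits.append(j + 2)
            i = j + 1
        else:
            i += 1
    return hits


# Tiny made-up excerpt in the assumed format (not real generated code).
sample = [
    "int unrelated;",
    '  /* "tiledb/libtiledb.pyx":10',
    " *     x = 1             # <<<<<<<<<<<<<<",
    " */",
    "  __pyx_v_x = 1;",
    '  /* "tiledb/libtiledb.pyx":11',
    " *     y = 2             # <<<<<<<<<<<<<<",
    " */",
    "  __pyx_v_y = 2;",
]

print(find_breakpoint_lines(sample, "libtiledb.pyx", 10))  # [5]
```

For a real file, read it with `open(...).read().splitlines()` and pass each returned number to lldb as `b libtiledb.cpp:N`. The helper returns a list because one Cython line may appear in more than one context comment.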
Misc debugging
- Given a memory address, ADDR, `ctypes` may be used to read value(s) from that address:

      >>> import ctypes
      >>> p = ctypes.cast(ADDR, ctypes.POINTER(ctypes.c_uint64))
      >>> p[0], p[1]   # equivalent to *p, *(p+1), etc.
- Defining the following function will allow most tests to be copy-pasted into the REPL from `test_libtiledb.py` and run directly:

      >>> import os
      >>> import tiledb, numpy as np
      >>> self = lambda: None; self.path = lambda x: os.path.join("/tmp", x)
      >>> [paste non-indented test block, and run]
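The `ctypes` recipe for reading from an address can be tried safely on memory that the current process owns. This sketch is only an illustration: the buffer `buf` and its contents are made up for the example, and `ctypes.addressof` stands in for an address that would normally come from a debugger session. Reading from an arbitrary or stale address can crash the interpreter.

```python
import ctypes

# A small buffer we own, standing in for memory located via the debugger.
buf = (ctypes.c_uint64 * 3)(10, 20, 30)
ADDR = ctypes.addressof(buf)  # plain integer address

# Reinterpret the integer address as a pointer to uint64 values.
p = ctypes.cast(ADDR, ctypes.POINTER(ctypes.c_uint64))

# Indexing performs pointer arithmetic: p[i] reads *(p + i).
values = [p[i] for i in range(3)]
print(values)  # [10, 20, 30]
```

Changing `ctypes.c_uint64` to another ctypes type (for example `ctypes.c_double`) reinterprets the same bytes with a different element size and layout.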
Debugging on macOS with gdb (note: does not currently work):
TileDB-Py's `setup.py` supports a command line argument, `--modular`, which enables a modular build. By default, code in separate `.pyx` files is sourced into the main `libtiledb.pyx` file using the Cython `include` statement. When `setup.py` is run with `--modular`, the Cython compile-time constant `TILEDBPY_MODULAR` is set to `True`, and all files listed in `MODULAR_SOURCES` within `setup.py` are built as separate Cython modules (initially the only modular file is `np2buf.pyx`). When `TILEDBPY_MODULAR` is set, `import` is used to make the necessary function definitions available in `libtiledb.pyx`. The goal of this mechanism is to reduce compilation time by limiting the size of the main `.pyx` file. For more details and a usage example, see the following commits:
- Modularization: https://github.com/TileDB-Inc/TileDB-Py/commit/11dcba6d1dc49f72c604fc49ab225f85983f9c78
- Usage: https://github.com/TileDB-Inc/TileDB-Py/commit/a898f7e7f58760a923cfc694e409f0fda46a9a61
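As a rough illustration of the mechanism described above, a custom flag can be removed from the argument list before setuptools sees it and then turned into a Cython compile-time constant. This is a hedged sketch and not the actual TileDB-Py `setup.py`: `split_modular_flag` is a hypothetical helper, and the commits linked above show the real implementation. The only names taken from the text are `--modular` and `TILEDBPY_MODULAR`.

```python
def split_modular_flag(argv):
    """Return (modular, remaining_args) without mutating argv."""
    remaining = [arg for arg in argv if arg != "--modular"]
    modular = len(remaining) != len(argv)
    return modular, remaining


# Example command line: python setup.py build_ext --inplace --modular
modular, remaining = split_modular_flag(
    ["setup.py", "build_ext", "--inplace", "--modular"]
)

# Compile-time constants are passed to Cython as a plain dictionary.
compile_time_env = {"TILEDBPY_MODULAR": modular}

print(modular, remaining, compile_time_env)
```

In a real build, a dictionary like this would typically be handed to Cython, for example through the `compile_time_env` argument of `cythonize`, while the remaining arguments go to setuptools unchanged.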
Given a function (in pure Python) which creates a `DenseArray`:

    def foo():
        arr = tiledb.DenseArray(...)
        import pdb; pdb.set_trace()

Entering pdb at this point, we can print out the array:

    (Pdb) p arr
    <tiledb.libtiledb.DenseArray object at 0x000000123456789>

Copy the address!
Now, set a breakpoint (or repeat `pdb.set_trace()`) in a location where we expect the refcount of `arr` to be zero, for example some location after the function returns. At that point we can check the refcount and referrers as follows:

    (Pdb) import ctypes, gc, sys
    (Pdb) o = ctypes.cast(0x000000123456789, ctypes.py_object)
    (Pdb) o
    py_object(<tiledb.libtiledb.DenseArray object at 0x000000123456789>)
    (Pdb) sys.getrefcount(o.value)
    ?
    (Pdb) gc.get_referrers(o.value)
    [...]

Note that `ctypes.cast(<addr>, ctypes.py_object)` does not increase the refcount of the target object, which can be verified by assigning a second variable to the identical `ctypes.cast` call. `sys.getrefcount` itself reports one more than the number of other references, because its argument holds a temporary reference.
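The same inspection can be rehearsed on a plain Python object before applying it to a `DenseArray`. This is a sketch under stated assumptions: `Payload` and `holder` are made-up names, and `id()` is used to obtain the address, which is a CPython implementation detail. The object is deliberately kept alive here, because recovering an object from the address of one that has already been freed is undefined behavior and may crash the interpreter.

```python
import ctypes
import gc
import sys


class Payload:
    """Stand-in for the object under investigation (hypothetical)."""


obj = Payload()
addr = id(obj)   # in CPython, id() is the object's memory address
holder = [obj]   # an extra reference that we will try to find again

# Wrap the raw address; creating the wrapper should not add a reference.
before = sys.getrefcount(obj)
handle = ctypes.cast(addr, ctypes.py_object)
after = sys.getrefcount(obj)

# .value turns the address back into the original object.
recovered = handle.value

# The list that still holds the object shows up among its referrers.
found_holder = any(r is holder for r in gc.get_referrers(recovered))

print(before == after, recovered is obj, found_holder)
```

When hunting a leak, the interesting part of the `gc.get_referrers` output is whichever container or frame you did not expect to see, as `holder` plays that role in this example.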