
python libclang bindings on Windows fail to initialize a translation unit from sublime text

Short description: using libclang to autocomplete code does not work with python that comes bundled with Sublime Text 3.

Details: a small verifiable example is in the repo on GitHub.

In essence, there is a script that uses a slightly changed cindex.py (compatible with python 3 and clang 3.8) and builds a Translation Unit from a test source file. It then reparses it and tries to complete.

The script works as expected when run with Python 3.3.5 from PowerShell.

When placed in the Packages folder of Sublime Text 3, it produces an error. The Python version reported by Sublime Text 3 is 3.3.6. The error:

Traceback (most recent call last):
  File "C:\Program Files\Sublime Text 3\sublime_plugin.py", line 78, in reload_plugin
    m = importlib.import_module(modulename)
  File "./python3.3/importlib/__init__.py", line 90, in import_module
  File "<frozen importlib._bootstrap>", line 1584, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1565, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1532, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 584, in _check_name_wrapper
  File "<frozen importlib._bootstrap>", line 1022, in load_module
  File "<frozen importlib._bootstrap>", line 1003, in load_module
  File "<frozen importlib._bootstrap>", line 560, in module_for_loader_wrapper
  File "<frozen importlib._bootstrap>", line 868, in _load_module
  File "<frozen importlib._bootstrap>", line 313, in _call_with_frames_removed
  File "C:\Users\igor\AppData\Roaming\Sublime Text 3\Packages\test_clang\script.py", line 21, in <module>
    tu = TU.from_source(filename=filename)
  File "C:\Users\igor\AppData\Roaming\Sublime Text 3\Packages\test_clang\clang\cindex38.py", line 2372, in from_source
    raise TranslationUnitLoadError("Error parsing translation unit.")
clang.cindex38.TranslationUnitLoadError: Error parsing translation unit.

This happens because the pointer to the translation unit returned by libclang inside cindex.py is None. The only thing I find strange is that it happens only with the Python bundled with Sublime Text 3.

Does it happen also to other people? Does anyone have any idea what could be the cause or how to debug it?

Also feel free to ping me if you cannot run the example provided here.

UPD: in the issues of the test project we found that the ctypes bundled with Sublime Text is not the culprit. Replacing the bundled ctypes with the one installed on the system produces the same error.

UPD2: I have stripped down the cindex.py file in the test repository to the bare minimum of code needed to reproduce the issue described in the question. Maybe this will help generate new ideas about what could be wrong. I also want to point out explicitly that the same code works exactly as expected on both Linux and OSX.

niosus asked Jun 12 '16 13:06


1 Answer

It turns out this failure is very hard to debug. I traced through a good part of the libclang binding source hoping to find a workaround for the lack of diagnostics around TranslationUnitLoadErrors raised from Python.

There seem to be some fundamental limitations here, even if you use a ctypes errcheck callback like the following...

# MODIFIED cindex.py

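def errcheck_callable(result, func, arguments):
    # ctypes invokes errcheck after every call to the function, not only on failure
    print(f"ERROR---result={result}, func={func}, arguments={arguments}")
    import pdb
    pdb.set_trace()
    # whatever errcheck returns becomes the call's result, so pass the pointer through
    return result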

functionList = [
    ...
    ("clang_parseTranslationUnit",
    [Index, c_interop_string, c_void_p, c_int, c_void_p, c_int, c_int],
    c_object_p,
    errcheck_callable, # <--- attached by `register_function`; makes this dll function a bit more debuggable
    ),
    ...
]

However, the callback receives little useful information when a translation unit fails to load:

> /Users/USERX/python3.7/site-packages/clang/cindex.py(155)errcheck_callable()
-> print(f"ERROR---result={result}, func={func}, arguments={arguments}")
ERROR---result=<clang.cindex.LP_c_void_p object at 0x10b1aa9d8>, func=<_FuncPtr object at 0x10b1bf5c0>, arguments=(<clang.cindex.Index object at 0x10aea1e48>, None, <clang.cindex.c_char_p_Array_62 object at 0x10b1aa620>, 62, None, 0, 0)
(Pdb) result.contents
*** ValueError: NULL pointer access
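
For reference, ctypes uses whatever errcheck returns as the result of the call, so a hook meant only for inspection should return result unchanged. Below is a minimal, self-contained sketch of the same mechanism. It does not touch libclang: a Python callback wrapped with CFUNCTYPE stands in for the C function, and fake_parse and raise_on_null are hypothetical names that do not exist in cindex.py.

```python
import ctypes

# Stand-in for a C function that returns a pointer; this is NOT libclang.
PROTO = ctypes.CFUNCTYPE(ctypes.c_void_p, ctypes.c_char_p)

def fake_parse(path):
    return 0  # simulate a NULL translation unit pointer

c_fake_parse = PROTO(fake_parse)

def raise_on_null(result, func, arguments):
    # ctypes calls this after every invocation, successful or not.
    if not result:
        raise RuntimeError("NULL result for arguments %r" % (arguments,))
    return result  # the returned value replaces the call's result

c_fake_parse.errcheck = raise_on_null

try:
    c_fake_parse(b"missing.cpp")
except RuntimeError as exc:
    message = str(exc)
    print(message)
```

Raising from errcheck at least records which arguments produced the NULL pointer, but it cannot recover a reason that the C function never reported.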

A FIXME comment added to cindex.py several years ago is still pending; it acknowledges the diagnostic gap inherent in libclang's clang_parseTranslationUnit:

# FIXME: Make libclang expose additional error information in this scenario.

There is some discussion of this in the post "Dealing with parse errors with Python bindings of libclang". The best suggestion there is to attach a debugger to libclang:

...you might be able to get a debugger to break on clang_parseTranslationUnit and inspect the error state there.



To dive into the internals a bit: libclang is loaded into Python via the ctypes call cdll.LoadLibrary, which creates a CDLL instance. A hardcoded set of functions, defined in functionList as tuples, is then registered via register_functions, which attaches argument types, return types, and optional errcheck callbacks to each one. The TranslationUnitLoadError from the question is raised in the classmethod TranslationUnit.from_source when a direct call into libclang returns a null pointer (clang_createTranslationUnit, by contrast, is what TranslationUnit.from_ast_file calls):

ptr = conf.lib.clang_parseTranslationUnit(index, filename, args_array, len(args), unsaved_array, len(unsaved_files), options)
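
The registration step mentioned above amounts to copying each tuple's fields onto the matching ctypes function object. Here is a simplified sketch of that idea; it is not the verbatim cindex.py source, and it omits the missing-symbol handling the real bindings perform.

```python
def register_function(lib, item):
    """Attach ctypes metadata from one functionList-style tuple.

    item is (name, argtypes, restype, errcheck); trailing fields are optional.
    """
    func = getattr(lib, item[0])      # look up the symbol on the loaded library
    if len(item) >= 2:
        func.argtypes = item[1]       # how Python arguments are converted
    if len(item) >= 3:
        func.restype = item[2]        # how the C return value is converted
    if len(item) == 4:
        func.errcheck = item[3]       # post-call hook, as in the errcheck example
    return func
```

Nothing in this step adds error reporting of its own. It only controls how values cross the Python/C boundary.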

I believe this is where the diagnostic trail ends: the Python bindings call libclang through its C API rather than a C++ one, so there is no exception to bubble up the way an SEHException would in .NET. With that runtime you could debug the unmanaged code. There is no equivalent here.

You can trace the translation unit variable TU down the call stack from its source...

CXTranslationUnit
clang_parseTranslationUnit(CXIndex CIdx,
                        const char *source_filename,
                        const char *const *command_line_args,
                        int num_command_line_args,
                        struct CXUnsavedFile *unsaved_files,
                        unsigned num_unsaved_files,
                        unsigned options) {
CXTranslationUnit TU;
enum CXErrorCode Result = clang_parseTranslationUnit2(
    CIdx, source_filename, command_line_args, num_command_line_args,
    unsaved_files, num_unsaved_files, options, &TU);
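
Note that in the snippet above clang_parseTranslationUnit2 returns a CXErrorCode, while the older wrapper hands back only TU. Binding to clang_parseTranslationUnit2 directly through ctypes may therefore recover at least a coarse failure category. The sketch below is a starting point under stated assumptions: parse_with_error_code and CX_ERROR_NAMES are hypothetical helpers, the enum values are taken to match clang-c/CXErrorCode.h, lib is assumed to be the ctypes CDLL for libclang, and index is assumed to be something ctypes can pass as a pointer (for example a cindex Index object).

```python
import ctypes

# Assumed to mirror enum CXErrorCode in clang-c/CXErrorCode.h.
CX_ERROR_NAMES = {
    0: "CXError_Success",
    1: "CXError_Failure",
    2: "CXError_Crashed",
    3: "CXError_InvalidArguments",
    4: "CXError_ASTReadError",
}

def parse_with_error_code(lib, index, filename, args=()):
    """Call clang_parseTranslationUnit2; return (tu_pointer, error_name)."""
    argv = (ctypes.c_char_p * len(args))(*[a.encode() for a in args])
    tu = ctypes.c_void_p()  # out-parameter; stays NULL if parsing fails
    code = lib.clang_parseTranslationUnit2(
        index,
        ctypes.c_char_p(filename.encode()),
        argv,
        ctypes.c_int(len(args)),
        None,               # no unsaved files
        ctypes.c_uint(0),   # number of unsaved files
        ctypes.c_uint(0),   # options: none
        ctypes.byref(tu),
    )
    return tu, CX_ERROR_NAMES.get(code, "unknown CXErrorCode %r" % (code,))
```

A call might look like tu, err = parse_with_error_code(conf.lib, index, "test.cpp", ["-std=c++11"]). The code is still coarse (CXError_Failure is a catch-all), but it can separate invalid arguments or a crash inside libclang from a generic parse failure.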

I'll update this answer if I discover anything more substantive. Given this diagnostic gap, it may be more fruitful to do the libclang analysis straight from C++, or to use the command-line tool clang-query.

jxramos answered Oct 01 '22 15:10