Mixed-mode debugging troubles - Feedback on PTVS2.0

Jan 27 at 5:23 PM
Edited Jan 27 at 5:25 PM
Hello everyone,

After upgrading my dev environment from VS 2010 to VS 2012, I've been able to test the newest features of PTVS 2.0. I wanted to give you some feedback on it, and also to ask for your help in finally getting mixed-mode debugging to work...

Let me first describe my (somewhat specific) configurations. We use a custom Python interpreter (embedded, with the executable written in C++), which loads a lot of extension modules at initialization (core features, written in C++ and wrapped using the CPython API). Like the regular Python interpreter, this custom interpreter can (among other things) execute regular Python scripts, which may contain calls to the loaded extension modules. The script is executed with a PyRun_SimpleFile() call, passing the absolute path of the script (I did a fair amount of research on the known issues with PTVS mixed-mode debugging). We also have a second possible configuration, where all extension modules are contained in a "meta" module (a *.pyd DLL file) that can be loaded by a (patched) stock Python interpreter via an "import" statement.

I'm working on GUI stuff which consists of python scripts (quite a lot) passed to the custom interpreter (or patched python interpreter), doing regular python stuff and sometimes making calls to wrapped C++ features for heavy operations (large file reading, calls to 3D library, ...).

With PTVS 2.0 (I also tested the November build) and VS 2012, I've been able to do the following successfully in both "configurations" described above:
  • Pure Python debugging. To do so, I had to patch:
    • visualstudio_py_launcher.py : our custom interpreter sets some things in sys.path and sys.argv
    • visualstudio_py_util.py : some Python objects of our 3D graphical library (VTK, written in C++ and wrapped for Python) do not have a repr
      The feature is very handy: I can set breakpoints and inspect everything. However, Step Into, Step Over, etc. all run to the next breakpoint instead of the next statement, probably because I'm always inside a main loop waiting for user events (GUI basics). Anyway, the pure Python debugger is a real time-saver, because the regular pdb debugger won't work inside this main loop.
  • Remote debugging in pure Python
  • Performance analysis
However, I haven't been able to get mixed-mode debugging to work. The problem is that breakpoints in the Python layers are not hit, while those in the C++ layer (extensions) are. At first, when I set a breakpoint in Python, the message "The breakpoint will not be hit. Symbols not loaded for this document" was displayed and the breakpoint icon was not "lit". I found that PTVS was loading the .pyc files instead of the .py ones (our distribution contains both types of files), so I generated a distribution with only .py files, and the modules were then loaded correctly by PTVS (the .py files appear in the Modules window). But now I'm stuck with the "Failed to bind" message when setting breakpoints... If I pause the debugger, the stack is correct: I can see both Python frames and C++ frames.

I tried both the custom interpreter configuration and the "meta" module + stock Python configuration, and I hit the problem in both. Do you have any idea what causes it? I read somewhere that it could be caused by calling PyRun_SimpleFile with an improper filename parameter, but in my case the filename is correct...

The only special thing here is that my Python source files are kept separate from the distribution ones, and unfortunately the directory layout of the source location is completely different from that of the distribution location (so I need two pyproj files, one for the source layout and the other for the distribution layout).

It would be terrific if I could manage to make the PTVS mixed-mode debugger work...

Anyway, thanks for the good work and sorry for the long thread :-) !
Coordinator
Jan 27 at 6:06 PM
Regarding the lack of __repr__ for some objects, can you clarify? The standard repr() function should still work in this case, providing the standard repr that is basically typename+id. And the debugger should be proofed against things like repr() raising exceptions - if that's not the case, then it would be a bug.
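
As an illustration of what "proofed against repr() raising exceptions" could look like, here is a minimal sketch. It is not the PTVS implementation; safe_repr and BrokenRepr are hypothetical names made up for this example, and the fallback string simply imitates the usual typename+id shape.

```python
def safe_repr(obj):
    """Return repr(obj), falling back to a 'typename + id' string if it raises."""
    try:
        return repr(obj)
    except Exception:
        # Fallback imitates the shape of the default object repr.
        return "<%s object at 0x%x>" % (type(obj).__name__, id(obj))


class BrokenRepr(object):
    """Hypothetical stand-in for a wrapped object whose repr fails."""

    def __repr__(self):
        raise AttributeError("no usable __repr__")


print(safe_repr(42))
print(safe_repr(BrokenRepr()))
```

The point of the fallback is that a misbehaving object degrades to a generic description instead of breaking the inspection of every other variable.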

On mixed-mode debugging, it shouldn't matter whether .py or .pyc files show up in the Modules window - the breakpoints should bind the same.

You mention that you set up your own sys.path. Does that include some relative paths as entries, by chance? If so, you may be hitting https://pytools.codeplex.com/workitem/1981. This is fixed in the most recent dev build, so you can try that and see if it helps.

The difference in source code directory layout, on the other hand, would not cause problems in 2.0 RTM (unless you hit the relative path issue first), but it might in the most recent dev build, due to the nature of the fix for #1981. Basically, when PTVS matches the .py file path in a breakpoint against a running module, it now walks the file system starting from the source file path, and going up for as long as it sees __init__.py - and compares that against directories in the module's __file__ (or rather, co_filename). So you can have your sources in an altogether different place, but your directory hierarchy for them needs to accurately represent the module hierarchy for breakpoints to work.
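
The matching rule described above can be sketched roughly as follows. This is only an approximation written from the description, not the actual PTVS code: the function names are hypothetical, and the case-insensitive comparison is an assumption made because the thread is about Windows paths.

```python
import os


def package_qualified_parts(source_path):
    """Return the file name preceded by every enclosing directory that
    contains an __init__.py, walking upwards from the source file."""
    source_path = os.path.abspath(source_path)
    directory, filename = os.path.split(source_path)
    parts = [filename]
    while os.path.isfile(os.path.join(directory, "__init__.py")):
        parent, name = os.path.split(directory)
        if not name:  # reached the filesystem root
            break
        parts.insert(0, name)
        directory = parent
    return parts


def breakpoint_matches(source_path, co_filename):
    """True if co_filename ends with the package-qualified tail of source_path."""
    wanted = [p.lower() for p in package_qualified_parts(source_path)]
    normalized = os.path.normpath(co_filename).replace("\\", "/")
    actual = [p.lower() for p in normalized.split("/")]
    return actual[-len(wanted):] == wanted
```

With a source file at <sources>/pkg/sub/mod.py, where pkg and sub both contain __init__.py, the tail to match is pkg/sub/mod.py, so a running module whose co_filename is D:/dist/pkg/sub/mod.py would match while D:/dist/other/sub/mod.py would not.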
Jan 28 at 10:19 AM
Edited Jan 28 at 10:20 AM
Thank you for your answer.

1) I just had to handle the case where type(obj) has no __repr__, which sometimes happens for Python-wrapped VTK objects (I don't know why even the standard __repr__ doesn't work for these types), because I was constantly getting AttributeError exceptions raised in visualstudio_py_util.py. The original code (around line 175) was:
    def _repr(self, obj, level):
        '''Returns an iterable of the parts in the final repr string.'''
        obj_repr = type(obj).__repr__
        ....
and I added simple catch-all exception handling to avoid the AttributeError exceptions mentioned above:
    def _repr(self, obj, level):
        '''Returns an iterable of the parts in the final repr string.'''
        try:
            obj_repr = type(obj).__repr__
        except Exception:
            # Some wrapped types raise AttributeError here.
            obj_repr = None
        ....
2) That's what I thought too, but strangely, when the .pyc files are loaded instead of the .py files (both are present in the distribution!), I get the "Symbols not loaded" message. I have to regenerate a distribution with only .py files. Since .pyc files are created at execution, I have to regenerate it for each debug session...

3) Yes, we are literally building sys.path ourselves in the custom interpreter, but every path is absolute. I should mention that at the end of initialization only the "root" path of the whole Python script directory layout is present in sys.path, not all of its subdirectories. But I guess PTVS is able to walk through the subdirectories looking for __init__ files?

4) That's what I thought, so I created a Python project with the .py files from the distribution, so its layout should match the module hierarchy (the second Python project, containing the source files, is only for editing (IntelliSense, navigation, ...) and is open in another instance of VS).

I wish I had the time to look at the PTVS code myself to be of more help, but unfortunately I don't :-( (I don't even have time to build PTVS from sources)

Thanks for your help!
Coordinator
Jan 28 at 4:24 PM
When you say that you got a message regarding "Symbols not loaded" - was it a dialog box like this?

Image

Python files themselves don't need any symbols, but for mixed-mode debugging, you need symbols for your interpreter (i.e. python27.pdb). If you build Python yourself, then you need to use the symbols produced by that build.

Absolute paths in sys.path should definitely work on 2.0 RTM. PTVS itself doesn't actually walk anything; the reason why relative paths were a problem is because when Python itself loads a module from a relative path, that path ends up in __file__ and the associated code objects (i.e. if the path entry is foo/bar, and the module is baz, then the path of the file will be foo/bar/baz.py - and we didn't correctly handle that case).
Jan 28 at 5:03 PM
Sorry, that was not very clear. What I meant is that when .pyc files are loaded instead of .py files (I can see which type is loaded by looking in the "Modules" debug window), a breakpoint set in any .py file is not "lit", and the message shown when hovering over it is "Symbols not loaded for this document. The breakpoint will not be hit.". When I manage to get the .py files loaded instead, setting a breakpoint does not work because "The breakpoint failed to bind ...".

The custom interpreter executable is built with debug info (the pdb file is right next to the executable). The Python DLL, python27.dll, lies in the same directory (you guessed right about the version). Indeed, we are building Python ourselves (python-2.7.5). What I did was download the symbols for the stock Python from the exact page you linked in your previous post, place them in the bin dir of my self-compiled Python, and reference that dir in "Options -> Debugging -> Symbols -> Symbols file (.pdb) locations" in VS. I never got the popup you mentioned, and I can see that the Python symbols are loaded in the Modules window.
But maybe you're telling me I have to recompile Python with debug info turned on to have symbols file "produced by that particular build" ? (so that downloading stock symbols is not OK in my case)

I'm pretty sure it has nothing to do with the relative paths issue, because you committed your patch on Oct 17 and I have the November beta build (Nov 15) installed, so it should include your fix...

Maybe I should also mention that our extension modules are compiled as static libs, and all these libs are linked into the custom interpreter. They are then loaded by the custom interpreter at initialization using some initialize_module() calls... I saw this in the Python docs:
The __file__ attribute is not present for C modules that are statically linked into the interpreter; for extension modules loaded dynamically from a shared library, it is the pathname of the shared library file.
Maybe the __file__ attribute is not present in this special case (just an idea...)
Coordinator
Jan 28 at 5:51 PM
Ah, the unbound breakpoint message. That one is displayed by VS and we don't control the text, but it is not really accurate. The only reason why a breakpoint may not bind is because PTVS couldn't find a module with filename matching that specified in the breakpoint. The .py/.pyc difference should not affect this, as code objects should always use .py even when loaded from .pyc - but you can check by printing out sys._getframe().f_code.co_filename, this is what the debugger will be matching filenames against. If there's something funky there (i.e. anything but an absolute or relative path to your .py), please let me know.
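
A small sketch of that check, to be pasted into the module under suspicion. The helper names are hypothetical; only sys._getframe and the f_code.co_filename attribute come from Python itself.

```python
import sys


def current_code_filename():
    """Return the co_filename of the caller's code object."""
    # Frame 1 is the caller of this helper; frame 0 would be the helper itself.
    return sys._getframe(1).f_code.co_filename


def where_am_i():
    # Call this from any function of the module you want to inspect.
    return current_code_filename()


print(where_am_i())
```

The printed value is the file name recorded in the code object at compile time, which can differ from the module's __file__.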

Symbols downloaded from python.org only work for interpreter binaries downloaded from there. Due to the way VC++ and VS handle symbols, they only apply to the particular binary that they were built alongside with - even if you build the same source with the same exact compiler flags, the new binary will not match the old symbols. So if you're building yourself, then you do indeed need to build Python with debug symbols, and use the PDB produced by your build.

Having said that, if the dialog doesn't appear, that would indicate that the symbols for your custom python27.dll are, in fact, loaded correctly, so it's likely that you're already building with symbols, and VS can locate them without adding explicit symbol paths because you're debugging on the same machine that you build on (when VC++ compiles a binary, it embeds the full absolute path to the produced symbol into that binary). If you are seeing Python modules in the Modules window, that definitely indicates that Python debugger has symbols for the interpreter and is active.

Having your extension modules compiled as static libs should not be a problem, unless you're linking them with your custom python27.dll. The debugger basically treats everything inside that DLL as Python interpreter implementation detail, and everything outside as user code (since normally you don't want to be debugging, say, the implementation of operator + for unicode objects). Linking the modules into the .exe that is the entry point, and which loads python27.dll, is fine.

__file__ (or rather, f_code.co_filename) only matters for Python modules, not for ones written in C (debugging for the latter is handled entirely by the VS C/C++ debugger, and it uses PDBs for everything).
Coordinator
Jan 28 at 6:00 PM
Edited Jan 28 at 6:01 PM
This discussion has been copied to a work item. Click here to go to the work item and continue the discussion.

(this is for the __repr__ bug)
Jan 30 at 9:46 AM
Edited Jan 30 at 9:50 AM
You are right about .py/.pyc: it does not matter at all which type is loaded. However, I have another problem: for the debugger to load them, I have to start the debug session twice (I don't know why...), otherwise only symbols from the dll and pyd files are loaded (so the stack contains unresolved frames in the Python layer like "python27.dll!PyRun_Exec..."). I also always use "Attach to Process", because using "Debug" => "Start new instance" on my distri pyproj never loads the .pyc files, no matter how many times I do it...

So, when .pyc files are loaded, the "Modules" window shows, for instance, "mymodule.pyc" in the "name" column, and the absolute (and correct) path to mymodule.pyc in the "path" column (i.e. D:/foo/mydistribution/moduleroot/foodir/mymodule.pyc). If I print sys._getframe().f_code.co_filename in a function of mymodule.py that is in the call stack (obtained via a breakpoint in the C++ layer), I get exactly the correct absolute path to the .py file (D:/foo/mydistribution/moduleroot/foodir/mymodule.py). I also get the expected "[Native to Python]" and "[Python to Native]" transitions in the stack, but Python frames are prefixed with "<unknown>"... (though double-clicking on them opens the correct .py file in VS!). But when I try to set a breakpoint in "module.py", I still get the failed-to-bind message, with the path ('foodir/mymodule.py') in the error message.

You are right about symbols also, they were automatically loaded by VS from the correct place (didn't know pdb files were generated by default when building Python) instead of the one I specified, containing downloaded stock symbols.

I understand that debugging python27.dll is not possible. We don't link extensions in the latter dll anyway, but only on the .exe entry point so no problem on this side...

NB : I could set a breakpoint in a Python module for one mixed-mode debug session yesterday (only on that module though, was not working in others), but I could not reproduce it afterwards in the exact same line of the module, and I have no idea why :(
Coordinator
Jan 30 at 5:17 PM
It seems that I spoke too soon :( The .py/.pyc difference does indeed cause the problem, but only when you attach after the module has been loaded.

The reason has to do with how the module list is filled. Note that "modules" in this context really refers to Python source files, and not to the Python module system - if you execute a source file by any means (e.g. exec), it's still a "module", even though it is not listed in sys.modules.

Now, when a module is loaded while the debugger is attached, the debugger directly intercepts the creation of the code object for the code of that module, and uses information in that object (and specifically, co_filename) to create the VS module object. This works because co_filename is always .py.

However, when you attach to a process that is already running and had loaded some modules, there's no way to enumerate all existing code objects in it. So our fallback is to inspect sys.modules, and look at the __file__ attribute of the objects inside. This breaks down here because __file__ will actually use .pyc for anything other than the entry point module (when using python.exe), and the code that matches breakpoint filenames does not do anything special to match .py vs .pyc. I can reproduce this locally on a very simple project.
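
To make the mismatch concrete, here is a rough sketch of a sys.modules walk together with one conceivable normalization. It is not the PTVS code and not the actual fix; the function names are hypothetical, and the suffix mapping assumes the Python 2 layout where mymodule.pyc sits next to mymodule.py.

```python
import sys


def source_path_for(compiled_or_source_path):
    """Map a .pyc/.pyo path to the .py path next to it (Python 2 layout)."""
    lowered = compiled_or_source_path.lower()
    if lowered.endswith((".pyc", ".pyo")):
        # Dropping the last character turns "x.pyc" into "x.py".
        return compiled_or_source_path[:-1]
    return compiled_or_source_path


def loaded_source_files():
    """Enumerate sys.modules and return {module name: normalized __file__}."""
    result = {}
    for name, module in list(sys.modules.items()):
        # __file__ is absent for modules built into the interpreter.
        path = getattr(module, "__file__", None)
        if path:
            result[name] = source_path_for(path)
    return result
```

Without a step like source_path_for, a breakpoint on D:/foo/mymodule.py is compared against D:/foo/mymodule.pyc and never binds, which is the behavior reported in this thread.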

This is definitely a major bug, and I'll work on getting this fixed ASAP. In the meantime, the suggested workaround is to modify your host process so that it pauses and waits for input before loading Python; so that you can then attach to it with no modules loaded, and then unpause it and let debugger observe all those modules.

This shouldn't be a problem if running the host directly under the debugger. I would like to understand why it doesn't work quite right for you. Can you tell me a little bit more about how you set VS up to run your own custom .exe on F5?
Coordinator
Jan 30 at 5:54 PM
This discussion has been copied to a work item. Click here to go to the work item and continue the discussion.
Coordinator
Jan 30 at 5:57 PM
By the way, you actually can debug the Python DLL when in mixed-mode - it's just not a documented or officially supported feature :) it's mostly there to debug the debugger itself, but it might occasionally be useful if you want to debug your custom changes to the interpreter, or as a convenient way of exploring how it works.

If you want to play with it, here's a .reg file that'll enable it:
Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\11.0\PythonTools]
"PythonDeveloper"=dword:00000001

[HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\11.0Exp\PythonTools]
"PythonDeveloper"=dword:00000001

[HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\12.0\PythonTools]
"PythonDeveloper"=dword:00000001

[HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\12.0Exp\PythonTools]
"PythonDeveloper"=dword:00000001
Then look for the two new commands in the "Python" context menu on the Locals/Autos/Watch window (the same one that has "Show Python View" and "Show C++ View" commands).
Jan 31 at 3:06 PM
Indeed, I was wondering why I had two different behaviors when setting breakpoints, as I tried to explain in my first post: if .pyc files are loaded, I don't get any "failed to bind" popup, only the "unlit" breakpoint icon (the one with the little yellow triangle), and in the call stack I get <unknown> in front of Python frames. If .py files are loaded instead, then when I try to set a breakpoint I get the "failed to bind" popup with the path to the module indicated, and the unlit breakpoint icon is the one with the little red cross. In the call stack, I get the right module name in front of Python frames.

I understand your explanation about the difference between the "Debug launched from VS" and the "Attach to Process" modes for PTVS. Indeed, in sys.modules, I also observe that the __file__ attribute of the module objects refers to the .pyc file.

Regarding your workaround, I recompiled my host (the custom interpreter) with a 15-second Sleep at the very start of its initialization (before loading any extension modules, and before passing the Python entry point module to PyRun_SimpleFile). With this pause, I have plenty of time to attach the debugger before anything happens. But while the debugger attaches correctly, it still does not work: no Python files (.py or .pyc) are loaded by the debugger, although the GUI loads and works as usual. Only two pyd files are loaded after the Sleep (pyexpat.pyd and another one), and there is nothing in the Output window or the Modules window regarding my .py/.pyc files (only dll and pyd files)...

Here is my configuration for the Python environment using my custom interpreter (referred to as "custom.exe"):

Path : D:/foo/distri/bin/custom.exe
Windows Path : D:/foo/distri/bin/custom.exe
Library Path : D:/foo/distri/lib
Architecture : x64
Language Version : 2.7
Path Environment Variable : PATH (It is not very clear in PTVS how to set this one... In every debug session, my custom interpreter displays a warning message about a wrong PATH, indicating in fact that D:/foo/distri is not in the PATH, so I don't really know what PTVS expects as input here or what it does with the PATH)

Python distri files pyproj properties :
Startup File : a python file containing the entry point of the Python layer code (GUI)
Working dir : D:/temp
Interpreter : custom.exe

In debug tab :
  • Standard Python Launcher
  • Enable native code debugging : checked (of course)
  • Nothing else set
Pressing F5 or going to Debug => Start new instance on the pyproj leads to the exact behavior I described above (with Sleep(15) in custom.exe): no .py or .pyc files loaded at all, only dll and pyd. However, it works like a charm in pure Python debug mode (native code debugging unchecked).

Thanks for the reg file, I'll give it a try once I have the time. It could indeed be interesting to see the internals!
Coordinator
Jan 31 at 3:18 PM
bde_fft wrote:
Path Environment Variable : PATH (It is not very clear in PTVS how to set this one... In every debug session, my custom interpreter displays a warning message about a wrong PATH, indicating in fact that D:/foo/distri is not in the PATH, so I don't really know what PTVS expects as input here or what it does with the PATH)
We have a (very) long standing issue to fix/improve this... it should be set to PYTHONPATH or whatever equivalent you are using. I believe if you are calling Py_Main in custom.exe then it will be PYTHONPATH, but otherwise this is something you have to implement yourself. (When you press F5/Ctrl+F5 we will clear and populate this environment variable with your project's search paths.)
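
The "clear and populate" behavior can be pictured with the sketch below. It only illustrates the idea under stated assumptions: build_launch_environment is a hypothetical name, and joining the search paths with os.pathsep is an assumption modeled on how PYTHONPATH is normally formatted, not a description of the PTVS launcher code.

```python
import os


def build_launch_environment(variable_name, search_paths, base_env=None):
    """Return a copy of the environment in which variable_name holds only
    the given search paths (any inherited value is discarded)."""
    env = dict(os.environ if base_env is None else base_env)
    env.pop(variable_name, None)  # clear whatever value was inherited
    if search_paths:
        env[variable_name] = os.pathsep.join(search_paths)
    return env


env = build_launch_environment("PYTHONPATH", ["D:/foo/distri/lib"], {"PYTHONPATH": "old"})
print(env["PYTHONPATH"])
```

Seen this way, naming PATH in that field would mean the PATH handed to the interpreter contains only the project search paths, which would be consistent with the warning about D:/foo/distri missing from PATH.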
Coordinator
Jan 31 at 7:59 PM
The lack of any .py/.pyc in the module list would indicate that Python debugger did not successfully attach (.pyd are present because they're really just DLLs and are handled entirely by the native debugger).

I do wonder whether you may have inadvertently exposed some other, unrelated bug in our attach/init code. Can you tell me a little bit more about your Python load/init sequence, and where the Sleep goes into it? i.e. do you load python.dll manually, or is it just a regular dependency? Is the Sleep before or after Py_Initialize? Do you use Py_NewInterpreter?

As Zooba mentioned above, the "Path Environment Variable" is the name of the variable that PTVS should fill from the project search path before spinning up the interpreter. So for a regular interpreter, you want to put "PYTHONPATH" there.