Debugging an embedded, statically linked interpreter

Aug 20 at 7:39 PM
Does the Python debugger require any special magic inside the interpreter to signal that it can be debugged? I.e., does the debugger look for Python33.dll in the modules list, or something similar? I ask because my application embeds the interpreter statically, so there is no Python33.exe or Python33.dll, just my app. I do build the interpreter with debug symbols for my debug build, so I can actually step into the native Py* function calls, but I haven't yet discovered the recipe that will allow me to hit breakpoints in my Python script. How does the debugger figure out that my process is Python-debuggable (or how can I get it to broadcast that)?
Coordinator
Aug 20 at 7:46 PM
If you're using mixed-mode Python/C++ debugging, then that requires the interpreter to be linked dynamically - it specifically looks for python??.dll or python??_d.dll to find the functions that it needs to intercept for that to work.

If you're using the regular pure Python debugging, then it depends on how you attach. Attaching via Debug -> Attach to Process will also require python??.dll to be loaded. However, if you use remote debugging (ptvsd), then it doesn't matter how your interpreter is loaded - or whether it is even CPython at all - provided that it can load and run the debugger Python code.
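
As a rough illustration of the remote debugging route, the script run by the embedded interpreter could enable attach itself. This is only a sketch under assumptions: it assumes the ptvsd package is importable from the embedded interpreter's sys.path, and that enable_attach takes a secret followed by an address keyword, as in the ptvsd releases shipped with PTVS 2.x (other versions may differ). The helper name, secret, and port are placeholders, not anything from this thread.

```python
import importlib


def start_remote_debugging(secret="my_secret", host="0.0.0.0", port=5678, wait=False):
    """Try to enable ptvsd remote attach; return True on success.

    Assumption: enable_attach(secret, address=(host, port)) as in the
    PTVS 2.x era of ptvsd. Helper name, secret and port are placeholders.
    """
    try:
        ptvsd = importlib.import_module("ptvsd")
    except ImportError:
        # ptvsd is not reachable from this interpreter's sys.path.
        return False
    ptvsd.enable_attach(secret, address=(host, port))
    if wait:
        # Block until a debugger has attached.
        ptvsd.wait_for_attach()
    return True
```

Calling start_remote_debugging(wait=True) near the top of the startup script would pause the script until VS attaches, which may be handy if the breakpoints of interest are hit early.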
Marked as answer by Petezah on 8/20/2014 at 1:56 PM
Aug 20 at 9:13 PM
Okay, good to know. Thanks for the quick answer!

I wish that weren't the case, though; I'd really rather not require the DLL, and it is going to take a bit more effort to get ptvsd to work in my scenario. For a bit more background, I have customized my interpreter to run inside a Windows Store app package, stripping out functionality that was not WACK-friendly. Since it is part of a winmd DLL intended to be used by other projects, I wanted to get rid of the additional DLL dependency, because it gets complicated with non-C++ projects that aren't props-file friendly (i.e., I need to copy the ARM DLL when that target builds, and likewise for Win32, debug, release, and so on).
Coordinator
Aug 20 at 9:18 PM
The problem with the lack of a DLL is that it becomes that much harder to find all the interpreter guts. Mixed-mode could hypothetically still do it because, for the most part, it uses symbols to find things (though occasionally it also uses DLL exports), so adding a configurable parameter telling it where to look would be possible. Regular attach, though, uses the export table only (and hence doesn't need symbols), and there simply wouldn't be an export table in your case.
Aug 20 at 9:33 PM
Ah, interesting. It would be really useful to have a switch like you describe for mixed-mode. I would actually love to be able to F5 my project into mixed Native/Python or Native/Managed/Python debugging. Currently in all cases I will have to launch without the debugger, and attach after the fact. Thanks again for the insight. I'll have to rethink my approach potentially.
Coordinator
Aug 21 at 12:12 AM
Edited Aug 21 at 12:12 AM
We actually almost have everything that you'd need for this. In the Python project properties, on the Debug tab, you can specify your own binary under Interpreter Path; you could make this your C++ project output. It will then be launched on F5, paused, and attached to.

The only problem there is that we assume that it is actually an interpreter, and so the command line that we pass to it will include the startup .py file name. Unfortunately, while you can remove the startup file name on the General tab in project properties, we won't let you start the project if it's not set. It seems like disabling this check when a custom interpreter path is enabled would do the trick.
Aug 26 at 3:21 PM
I spent some time and experimented with that idea after you suggested it. I built a win32 console app that consumed my Python lib (modified slightly, since it is not an app container) and imported and used a py module. Then I made a Python project, and tried to launch the console app as the interpreter; but it refused to launch, saying that it wasn't a valid interpreter.
Coordinator
Aug 26 at 4:51 PM
This sounds strange. I can successfully set Interpreter Path in project properties to, say, C:\Windows\System32\notepad.exe, and it launches (with the startup file open, since that was passed as an argument) when I press F5; with the "Enable native code debugging" box checked, it also attaches to it. If we can launch & debug Notepad like that, it would seem that any other binary should also work.

Can you make sure that Launch mode is set to "Standard Python launcher" in project properties? If that is not it, can you describe your setup in more detail? (A screenshot of project properties would probably be most descriptive here.)
Coordinator
Aug 26 at 5:07 PM
The output from "Tools -> Diagnostic Info..." will also include some details about the currently open project, so you could email that to ptvshelp@microsoft.com (with a reference to this discussion so we know where it came from) as well.
Aug 27 at 9:53 PM
Prompted by your response, I dug deeper to see whether I had made a mistake, and it turned out to be a silly one, though it could also be a bug. I obtained the path to my built interpreter executable using Explorer's "Copy as Path", which sticks quotes around the path. I pasted it in as is, since quotes are usually okay in URIs. Removing the quotes allowed it to launch my program.
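
For illustration only, tolerating a quoted path of the kind "Copy as Path" produces could look something like the following. This is a hypothetical helper, not code from PTVS, and the path in the usage note is made up.

```python
def strip_surrounding_quotes(path):
    """Remove one pair of matching double quotes wrapped around a path."""
    path = path.strip()
    if len(path) >= 2 and path[0] == '"' and path[-1] == '"':
        return path[1:-1]
    return path
```

With this, strip_surrounding_quotes(r'"C:\Build\Debug\MyApp.exe"') gives the bare path, while an unquoted path passes through unchanged.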

On that note, it still didn't give me the mixed Python debug experience. So I temporarily switched my Python build over to be a DLL project instead of a static lib, and got a different problem. However (and this was exciting), I then tried my test Windows Store app, linking in my custom Python build, now a DLL. Since I can't launch the Python mixed debugger from the native project properties, I executed the AppX, used Attach to Process, and saw that it did recognize that there was a Python environment inside the process! Using the mixed debugger, I was able to inspect PyObject* variables using the [Python] node. Really cool. What didn't work, though, was that it wasn't able to find my py module (or associate it with the one in the process), so I couldn't step through Python code.

But I'm starting to wrap my mind around how this works now, and I think I can see why there is a limitation here. I'm guessing for performance reasons, the debugger looks for the Python DLL to avoid searching every module for the right symbols. I think a useful addition would be the ability to specify additional assembly names (possibly in the options).
Coordinator
Aug 28 at 12:15 AM
Yes, it specifically searches for python??.dll or python??_d.dll (and even then for supported versions, so 27, 33 or 34 right now), and only then requests the symbols for that particular module.

If you want to experiment with tweaking that, you can try building your own version of PTVS with changes. The build instructions are really straightforward, and for this you'd only need VS and VS SDK to build the core projects (and ignore all the HPC etc stuff that drags in other dependencies). Here is the code that detects whether a particular loaded module is a Python runtime DLL - it should be fairly trivial to add your .exe name to that list. If that works, I can help you work this into a full-fledged feature where the module name is defined as a property in the project file, and then we'll merge it into the product.
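
To make the idea concrete, here is a small Python sketch of the kind of check described above: match python??.dll or python??_d.dll for the supported versions, and additionally accept extra module names supplied by the user. It is only an illustration of the logic, not the actual PTVS code; the function name and the extra_modules parameter are hypothetical.

```python
import re

# Supported versions named in this thread: 2.7, 3.3 and 3.4.
SUPPORTED_VERSIONS = {"27", "33", "34"}
_PYTHON_DLL = re.compile(r"^python(\d\d)(_d)?\.dll$", re.IGNORECASE)


def detect_python_version(module_name, extra_modules=None):
    """Return a version string such as "33", or None if not a Python runtime.

    extra_modules is a hypothetical mapping {module file name: version} for
    binaries that embed the interpreter statically.
    """
    name = module_name.lower()
    for extra_name, version in (extra_modules or {}).items():
        if name == extra_name.lower():
            return version
    match = _PYTHON_DLL.match(name)
    if match and match.group(1) in SUPPORTED_VERSIONS:
        return match.group(1)
    return None
```

In this sketch, detect_python_version("Test_Windows8.exe", {"Test_Windows8.exe": "33"}) would treat the app's own executable as a 3.3 runtime, which is roughly what a project-file property could feed in.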

We haven't tried AppX debugging with this, but it's probably not really any different. The reason why the code might not be showing up properly is because the files in the project being debugged have paths different from what it has at runtime (because it's deployed to its container before being launched). Still, I did some work specifically to handle such a mismatch, for remoting purposes, so I would be interested in pursuing this further. Can you check what the full paths of the modules in the debugger Modules window (not shown by default, but you can enable it in Debug -> Windows) look like?
Aug 29 at 2:29 PM
I'm still experimenting, but I wanted to post an update for some other questions you had.

I took a look at the modules window (it didn't occur to me that py modules might show up there, which is a neat feature), and here is the operative line:
pytest.py   C:\Users\Peter\Source\Repos\Python\PCbuild\Build\Debug\Win8\Win32\Test_Windows8\AppX\Script/pytest.py   N/A N/A Symbols loaded. C:\Users\Peter\Source\Repos\Python\PCbuild\Build\Debug\Win8\Win32\Test_Windows8\AppX\Script/pytest.py   1               [11336] Test_Windows8.exe   
I thought it was interesting that the slashes are not normalized; but I'm guessing this isn't the problem. Is it possible the debugger doesn't have permission to load the file out of the app container? Still, it is odd that I can open the file from the project directory and set a breakpoint that appears to be active (it lights up when the debugger is running, and seems to indicate it is linked up where it needs to be). Additionally, the debugger actually seems to stop at the breakpoint, but is unable to open the file--I get the "can't find sources/disassembly" page that normally shows up in native debugging when PDBs, etc, are not available. I tried the browse link from there also, and it seemed to do nothing useful.

As for the experiment, I need to dig a little deeper, but adding code to make GetPythonLanguageVersion return a valid version when it hits my module makes some, but not all, of the plumbing work. I can see that it does find symbols and information (interestingly, I saw it successfully pull the static "initialized" variable out of pythonrun.obj), but the Watch window doesn't give me the [Python] node yet. I'll let you know as I find more.
Coordinator
Aug 29 at 7:47 PM
The debugger runs inside VS, so it would have the same access as VS itself. So you can try opening the file from that path in VS editor to see if it has access. I just tried and, indeed, accessing the apps required me to elevate...

Having said that, this path doesn't look like it's coming from the app container to me - it doesn't start with C:\Program Files\WindowsApps, which is where the individual containers are. I wonder if that is actually the problem. Could it be that you end up deploying .pyc files that are compiled with the development paths, but then run from the app container? I'm not quite sure what the implication of such a mismatch would be, but things might break.
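
To show why a precompiled file could carry development paths, here is a minimal sketch: a code object records whatever file name it was given at compile time, and functions defined in it inherit that name. The path below is made up for illustration, and whether the import machinery later adjusts the recorded name depends on how the module is loaded, so treat this as a possible explanation rather than a diagnosis.

```python
source = "def hello():\n    return 'hi'\n"

# The second argument is the file name recorded in the code object. The path
# below is an illustrative development-machine path, not one from the thread.
dev_path = r"C:\dev\project\Script\pytest.py"
code = compile(source, dev_path, "exec")

namespace = {}
exec(code, namespace)

# Both the module-level code and the functions defined in it keep that name,
# regardless of where the compiled code is executed from.
module_file = code.co_filename
function_file = namespace["hello"].__code__.co_filename
```

Printing co_filename for a frame or function inside the running app is one way to see which path the debugger is likely being told about.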

It might also be interesting to look at ModuleManager.FindDocuments - that's the bit that matches filenames (e.g. the one in breakpoint vs the one in the loaded module). It does some filesystem walking there in case the project and the runtime are not off the same directory, to match files regardless. This should cover your case, but placing a breakpoint there and stepping through might provide some clues as to what is going wrong.
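
As a simplified illustration of matching files when the project and the runtime are not in the same directory, the sketch below normalizes slash style and case and then picks the project file that shares the longest trailing path with the runtime path. It is not the actual ModuleManager.FindDocuments logic, just one plausible approach; all function names here are hypothetical.

```python
import ntpath


def _parts(path):
    """Split a Windows path into lower-cased components, ignoring slash style."""
    normalized = ntpath.normcase(ntpath.normpath(path))
    return [p for p in normalized.split("\\") if p]


def common_suffix_length(path_a, path_b):
    """Count how many trailing path components two paths share."""
    count = 0
    for a, b in zip(reversed(_parts(path_a)), reversed(_parts(path_b))):
        if a != b:
            break
        count += 1
    return count


def best_match(runtime_path, project_files):
    """Pick the project file sharing the longest path suffix with runtime_path."""
    scored = [(common_suffix_length(runtime_path, f), f) for f in project_files]
    score, path = max(scored, default=(0, None))
    return path if score > 0 else None
```

Under this scheme a runtime path ending in AppX\Script/pytest.py would match a project file ending in Script\pytest.py even though the roots differ and the slashes are mixed.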

For debugging the statically linked interpreter, the Modules window also serves as a helper indicator that the Python debugger considers itself fully initialized. It won't fill the list with loaded .py files until such time that it can report itself as a distinct runtime to VS, and it won't do so until everything is in place. So if you don't get any Python modules in that list, it means that it's still waiting for something to complete. NativeModuleInstanceLoadedNotification.Handle has some code that handles loading of the main Python interpreter module (which should be your .exe in this case), and the injected debugger helper DLL. Setting the breakpoints on both might help figure out what's going on there.
Coordinator
Aug 29 at 8:37 PM
As far as the path is concerned, when debugging Windows Store Apps they are run out of the AppX folder in the project directory. I don't recall whether this is true for Ctrl+F5 in VS, but it seems unlikely they'd be deployed to an All Users location (LocalAppData is more likely).