Debugging a Python script in an application that embeds the Python interpreter

Editor
Sep 16, 2013 at 1:35 PM
Hi,
I am using Python 3.3.1 in VS 2012 Premium, and I have some questions. I have an application (an exe) that embeds the Python interpreter. I choose a Python script using a file browser, and internally the application runs the script using the PyRun_String function. The scripts usually call APIs that are exposed through .pyd modules to automate the application. The application loads the Python libraries dynamically.
Now, if I want to debug a script, how do I do that? I have tried attaching to the application in Native/Python mode, but it is not working. The script cannot be debugged on its own, because the application sets up a lot of environment state at the time it runs the script. Please help me.
  • Debarshi
Coordinator
Sep 16, 2013 at 5:01 PM
You don't really need to do mixed-mode (Native/Python) unless you intend to debug the C/C++ code in your .pyd modules. You can also attach using Python alone even if it's a binary hosting Python (so long as it's a supported version of CPython - 3.3.1 qualifies).

The problem is PyRun_String. When you use that, the code object that is implicitly created does not have any source filename associated with it, and so we can't map it to the original source at runtime - hence, it will show up as "<string>" in Call Stack, and breakpoints inside it will not work.

If you are loading the source directly from disk files, consider using PyRun_File instead, and supplying it the correct filename. If you must load the code yourself and only have a string with the source to supply, then use Py_CompileString (again, providing the correct filename) followed by PyEval_EvalCode.
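To illustrate why the filename matters, here is a minimal Python-level sketch. It uses the built-ins compile() and exec(), which are only a script-side analogue of Py_CompileString and PyEval_EvalCode, not the C code the host application would contain. The path and the greet function are hypothetical.

```python
# Hypothetical script source and path, used only for illustration.
source = "def greet():\n    return 'hello'\n"
script_path = r"C:\scripts\greet.py"

# Compiled without a real filename: the code object is labelled "<string>",
# so a debugger has nothing to map back to a source file.
anonymous = compile(source, "<string>", "exec")

# Compiled with the real path: co_filename carries it.
named = compile(source, script_path, "exec")

# Running the named code object defines greet(); the function's own code
# object inherits the same co_filename.
namespace = {}
exec(named, namespace)
function_filename = namespace["greet"].__code__.co_filename

print(anonymous.co_filename)
print(named.co_filename)
print(function_filename)
```

The filename passed at compile time is recorded on every code object created from that source, which is what a breakpoint set in the source file has to be matched against.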
Editor
Oct 8, 2013 at 11:22 AM
Thanks a lot.
Editor
Oct 10, 2013 at 1:10 PM
I am still not able to debug. This is not because of PTVS, but because of Python and Windows. Python has a bug on Windows: PyRun_File and Py_CompileString crash if the CRT version of the application is not the same as the one Python was built with (http://bugs.python.org/issue10082). This is creating a huge problem for me, as I am using VS2012 and Python 3.3.2 is built with VS2010. I don't know whether a fix exists; I have gone through many blogs and posts, but no solution works for me. I also used Python's _Py_fopen() function, which returns a FILE*; I can see that it comes from msvcr100, and I pass it directly to PyRun_File, but it still does not work, which surprised me.

Do you have any idea of workaround in this issue? It would be a great help for me.
Editor
Oct 10, 2013 at 1:13 PM
Py_CompileString also crashes for the same reason (the memory dump looks the same as for PyRun_File), but I don't know why, since this API does not take a FILE pointer at all.
Coordinator
Oct 10, 2013 at 7:40 PM
Can you share the callstack for the crash that you're seeing with either one?
Coordinator
Oct 10, 2013 at 7:47 PM
Another option is to install the Windows SDK for Windows 7 and .NET 4.0, which includes the version of MSVC you need.

Once you've installed it, you'll be able to choose this compiler through the Platform Toolset property of your C++ project. Everything should still work in VS 2012 - you don't need to go back to VS 2010 - but you will be using an older version of the compiler that will be missing some features.
Editor
Oct 18, 2013 at 10:03 AM
Edited Oct 18, 2013 at 10:04 AM
Here is the call stack of the crash.
python33.dll!PyUnicode_InternInPlace(_object * * p) Line 14220 C
python33.dll!new_identifier(const char * n, compiling * c) Line 570 C
python33.dll!alias_for_import_name(compiling * c, const _node * n, int store) Line 2808 C
python33.dll!ast_for_import_stmt(compiling * c, const _node * n) Line 2892 C
python33.dll!PyAST_FromNode(const _node * n, PyCompilerFlags * flags, const char * filename, _arena * arena) Line 724 C
python33.dll!PyParser_ASTFromFile(_iobuf * fp, const char * filename, const char * enc, int start, char * ps1, char * ps2, PyCompilerFlags * flags, int * errcode, _arena * arena) Line 2124 C
python33.dll!PyRun_FileExFlags(_iobuf * fp, const char * filename, int start, _object * globals, _object * locals, int closeit, PyCompilerFlags * flags) Line 1931 C
It fails in the PyDict_GetItem() call inside the PyUnicode_InternInPlace() function. Here is the snippet:
/* It might be that the GetItem call fails even
   though the key is present in the dictionary,
   namely when this happens during a stack overflow. */
Py_ALLOW_RECURSION
t = PyDict_GetItem(interned, s);
Py_END_ALLOW_RECURSION
I am not sure why PyDict_GetItem() would fail because of the FILE*. It seems to me that Python is able to parse the provided file successfully.
Editor
Oct 18, 2013 at 1:07 PM
Reported an issue in Python -
http://bugs.python.org/issue19283
Coordinator
Oct 18, 2013 at 5:25 PM
If the wrong kind of FILE (i.e. from a different VC version) is passed, you can expect all kinds of process state corruption if it tries to write to a memory block using an incorrect layout. In the most extreme cases, this can result in some internal heap management structures being overwritten, and then new heap allocations or releases will mysteriously fail sometime later.

OTOH, you've mentioned that you're using _Py_fopen earlier, and that should just work...
Editor
Oct 21, 2013 at 2:06 PM
Finally I am able to call PyRun_File with the FILE* returned by _Py_wfopen. I resolved all Python functions dynamically at run time and it worked. Probably static linking pulls the Py functions from the Python lib, which is built with VS2010, into my DLL, which is built with VS2012, and this creates a set of incompatible code in my DLL (I guess because of changes and optimizations in struct fields). Dynamic loading loads the Python DLLs, and they execute entirely in the VS2010 environment. I am also setting the __file__ attribute before running the file (https://pytools.codeplex.com/discussions/462163). But I am not able to hit breakpoints in the Python source. I am trying to attach to the application (it uses a sub-interpreter) from my Python project. It attaches successfully, but breakpoints are not getting activated. Here is the sequence I am following:
  1. Initialize the interpreter.
  2. Get the thread state.
  3. Create a new sub-interpreter.
  4. Swap the thread state with that of the newly created interpreter.
  5. Import and add the __main__ module.
  6. Get the global dictionary from the imported __main__ module and set the attributes __builtins__ (from PyEval_GetBuiltins) and __file__.
  7. Get a FILE* from _Py_wfopen.
  8. Call PyRun_FileExFlags.
  9. Kill the sub-interpreter using Py_EndInterpreter.
  10. Swap the thread state back to the main thread.
Would you tell me where the problem is?
Coordinator
Oct 21, 2013 at 5:45 PM
Setting __file__ directly is not particularly helpful, since the debugger doesn't actually look at it - what it uses is the co_filename attribute of the code object associated with your functions. It just happens to be a convenient way to check the value because the normal Python import machinery initializes both.

In your scenario, co_filename would be defined by the "filename" parameter that you have passed when calling PyRun_FileExFlags. If you're passing NULL, then co_filename will be unspecified, and breakpoints cannot be hit. Also, if you're passing a relative filename there, you're likely to be hitting https://pytools.codeplex.com/workitem/1981 - note that while this is fixed, the fix is post-2.0, and there isn't a build that includes it yet (but we will be doing regular unstable dev builds from the trunk shortly; building yourself is also fairly easy if you only do it for the core product).
Editor
Oct 22, 2013 at 10:32 AM
Hi,
I am passing the correct filename to PyRun_FileExFlags, and I can see the full file path in inspect.currentframe().f_code.co_filename. I guess this is the file path you are talking about. But I am still not able to hit the breakpoints. It says "Break points will not currently be hit. No symbols have been loaded for this document". For your information, I am customizing site using sitecustomize.py to handle my pyd imports in my own way. I even tried adding the module to sys.modules, but it is not working at all. My debugging process is:
  1. Add a prompt message box to the py script.
  2. Run the script from the exe.
  3. When it prompts, start Python debugging and attach to the exe.
  4. Breakpoints are not getting activated.
I do not understand where the problem is. Is it because of the site customization, or could it be a problem with my debug settings? Do we need to add a search path to the location of sitecustomize.py, change the working directory, or add references?
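For anyone reproducing this check, a small diagnostic along the following lines can be dropped into the script being run by the host. It is only a sketch: the helper name describe_current_code is hypothetical, and it reports what the code object says about itself, not what the debugger resolved.

```python
import inspect
import os

def describe_current_code():
    """Report the co_filename of the code that called this helper."""
    caller = inspect.currentframe().f_back  # frame of the calling code
    filename = caller.f_code.co_filename
    return {
        "co_filename": filename,
        "is_absolute": os.path.isabs(filename),
        "exists_on_disk": os.path.exists(filename),
    }

report = describe_current_code()
for key, value in report.items():
    print(key, "=", value)
```

If co_filename is "<string>", relative, or points at a file that does not exist on disk, the filename passed by the host is the first thing to look at.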
Coordinator
Oct 22, 2013 at 11:27 AM
We don't really get involved in the import machinery, so customizing it should be fine. On attach, sys.modules is inspected to get the list of currently loaded modules, and beyond that the debugger intercepts the creation of new code objects directly, so even the most exotic eval/exec scenarios should work so long as filename is provided correctly.
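As a rough illustration of the sys.modules inspection described above, the following sketch lists the loaded modules that have a source file. It is an approximation written for this thread, not the debugger's actual implementation, and the helper name loaded_source_files is hypothetical.

```python
import sys

def loaded_source_files():
    """Map each loaded module name to its __file__, skipping modules without one."""
    files = {}
    # Copy the items first: imports on other threads may mutate sys.modules.
    for name, module in list(sys.modules.items()):
        path = getattr(module, "__file__", None)
        if path:
            files[name] = path
    return files

for name, path in sorted(loaded_source_files().items()):
    print(name, "->", path)
```

Built-in modules and modules injected into sys.modules without a __file__ are skipped here, which is one reason code run through the PyRun_* functions has to be identified by the co_filename of its code objects instead.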

One thing of note that is not mentioned in the description for #1981 is that it also incorrectly tries to match filenames in a case-sensitive way; so an absolute path that only differs by case would still not match.
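The case-sensitivity problem can be shown with a short sketch. The two paths are hypothetical, and the comparison helper is only an illustration of case-insensitive matching using the standard ntpath module, not the code used in PTVS.

```python
import ntpath

def same_windows_path(a, b):
    """Compare two Windows paths, ignoring case and separator style."""
    return ntpath.normcase(ntpath.normpath(a)) == ntpath.normcase(ntpath.normpath(b))

# Hypothetical values: what the host passed as the filename, and the path
# of the file in which the breakpoint was set.
co_filename = r"c:\projects\embedding\test.py"
breakpoint_file = r"C:\Projects\Embedding\Test.py"

naive_match = co_filename == breakpoint_file            # case-sensitive comparison
normalized_match = same_windows_path(co_filename, breakpoint_file)
is_absolute = ntpath.isabs(co_filename)

print(naive_match, normalized_match, is_absolute)
```

Until a build with the fix is available, passing the filename to PyRun_FileExFlags as an absolute path with exactly the same casing as the file on disk avoids both forms of the mismatch.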

If that is not the problem in your case, then it would probably be most efficient if I can try to directly reproduce the issue with your code, and debug it. Is it possible for you to share your project? If so, please send it to ptvshelp@microsoft.com. Thanks!
Editor
Oct 23, 2013 at 7:28 AM
Edited Oct 23, 2013 at 7:28 AM
Coordinator
Oct 23, 2013 at 6:57 PM
Thank you for providing the project. I tried to reproduce the issue with it, but the breakpoints did get resolved for me when attaching at the point when it says "Enter something:" and waits for input. I'm using stock Python 3.3.2 x64, obtained by running the installer downloaded from http://python.org. My attempted repro steps were as follows:
  1. Add the directory for PDB files for Python 3.3.2 x64 to symbol path in Tools -> Options -> Debugging -> Symbols.
  2. Edit the file path in Embedding.cpp to point to Test.py on my machine (since folder layout is different).
  3. Build Embedding.vcxproj in Debug|x64 configuration.
  4. Run Embedding.exe from command line and wait until it prints "Enter something".
  5. Attach to Embedding.exe from VS, manually selecting "Python" and "Native" code types in the Attach to Process dialog.
  6. Wait until all Python modules are detected (you can see the progress printed out in the Output window, or observe it in the Modules window).
  7. Set a breakpoint on the line with print immediately after input. It lights up.
  8. Type something in the console and press Enter - the breakpoint is hit.
Is there anything in the steps above that is different from the way you're doing it?
Editor
Oct 24, 2013 at 7:50 AM
Hi pminaev,

I am finally able to hit a breakpoint in the Python script. Thank you for your cooperation. PTVS is really a great application from an embedding perspective. The reason the breakpoints were not being hit was that I kept Managed mode enabled while attaching to the process. It works fine now after removing Managed mode.
Coordinator
Oct 24, 2013 at 8:23 AM
This is interesting. So, to clarify - you had Native, Managed and Python all enabled together before, and weren't hitting breakpoints, but when you removed Managed and only left Native+Python, it started to work?
Editor
Oct 24, 2013 at 9:29 AM
Yes, exactly. If I add Managed mode again, so that the selection is Managed+Native+Python, breakpoints are not enabled; but if I remove Managed mode, so that the selection is Native+Python, breakpoints are enabled. This is quite interesting; I don't know the reason.
Coordinator
Oct 25, 2013 at 6:54 PM
Thank you for all the clarifications. It seems that managed/native/Python debugging does not work on VS 2012, but does work on VS 2013 (and we only tried it on the latter). This is because VS 2012 still uses the old monolithic managed/native debugger that does not use the new debugging APIs that permit arbitrary mixing. I'll rename the bug you've created to be more indicative of the nature of the problem; the fix here will be to not allow selecting Python and managed code types together in VS 2012, to prevent confusion.