PTVS 2.0 Debugging speed improvement, revisited


Discussion of ways to actually improve the speed in the PTVS debugger, rather than workarounds. By 'PTVS debugger', I am referring to whatever PTVS component manages the execution of the Python script when launched under the VS IDE (via F5).

I know this issue has been raised before, but I thought a new issue would bring better attention.

The gist of the explanation for the debugging slowness, as I understand it, is that the debugger runs the Python interpreter in trace mode. I don't know the details, but it seems that the debugger is invoked whenever the current line changes.
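To make the cost concrete, here is a minimal sketch of what trace mode means in CPython, using the standard sys.settrace hook. It is only an illustration of the mechanism, not the PTVS trace function; count_trace_events and sample are hypothetical names made up for this example.

```python
import sys
from collections import Counter

def count_trace_events(func):
    """Run func under a trace function and tally the events the interpreter reports."""
    events = Counter()

    def tracer(frame, event, arg):
        events[event] += 1
        return tracer  # returning the tracer keeps per-line tracing on in this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(previous)  # always restore whatever was installed before
    return events

def sample():
    total = 0
    for i in range(3):
        total += i
    return total

events = count_trace_events(sample)
print(dict(events))
```

Every 'line' event is a call into Python code, so a loop body that runs a million times means at least a million extra function calls.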

The reason this is done is so that when an exception is thrown, the debugger can determine, or at least guess, whether there is a handler for the exception. If the exception is not checked in the Exceptions dialog as 'thrown' but is checked as 'user-unhandled', this will determine whether or not to break. BTW, I have posted a separate issue to clarify what 'user-unhandled' means exactly.

I would like to suggest that this can be determined from code analysis rather than from tracing. I believe the object is to know what try ... except blocks are active on the execution stack. Can't you examine the source code as indicated in the exception traceback and find the try block(s) that enclose each point in the traceback?
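As a rough sketch of the kind of static lookup meant here, the standard ast module can report which try blocks enclose a given line. This assumes Python 3.9+ (for end_lineno and ast.unparse) and is not a description of how PTVS does it; enclosing_handlers is a hypothetical helper.

```python
import ast

def enclosing_handlers(source, lineno):
    """Return the exception types named by try blocks whose body covers lineno."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Try):
            continue
        # Only the try body is protected; the except/else/finally parts are not.
        body_start = node.body[0].lineno
        body_end = node.body[-1].end_lineno
        if body_start <= lineno <= body_end:
            for handler in node.handlers:
                if handler.type is None:
                    found.append("<bare except>")
                else:
                    found.append(ast.unparse(handler.type))
    return found

SAMPLE = """\
try:
    x = 1
    try:
        y = int("a")
    except ValueError:
        pass
except KeyError:
    pass
z = 2
"""

print(enclosing_handlers(SAMPLE, 4))
print(enclosing_handlers(SAMPLE, 9))
```

Repeating this for each frame in the traceback would give the set of handlers active on the stack, at least for frames that have source available.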

Or better yet, could you examine the bytecode? That would also cover the cases where no source code is available, which is where exception handlers currently cannot be identified.

Previous discussion about the slowness issue has to do with reducing the time spent handling trace callbacks from the interpreter. However, the sheer number of such callbacks makes the debugger much too slow. That is why I have asked if the trace is really necessary.

An alternative suggestion is to disable trace mode when no exceptions at all are checked as User-unhandled. I would think this would be easy to implement. If you do implement this feature, then you should update the documentation to point out (with emphasis) that unchecking all exceptions will result in considerably faster debugging.

Finally, how can I help implement this? At least, can you point me to where in the PTVS source the logic for handling exception breaks is located, so that I could see how complicated a task it would be?

BTW... The trace seems to go so far as to call back for continuation lines in the source code. The result is that a statement that extends over two or more lines takes longer in debugging than if the statement is put into a single line. Even
statement 1; \
statement 2
runs slower than
statement 1; statement 2
with the result that in order to speed up the debugging of very frequently executed code, I have to make it less readable!
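The continuation-line effect can be checked outside the debugger with a small sketch that counts 'line' trace events for the two spellings. The helper name line_events is made up for this example, and the exact counts may differ between CPython versions; the point is only that the split form should report more events.

```python
import sys

def line_events(source):
    """Count the 'line' trace events raised while executing source."""
    code = compile(source, "<demo>", "exec")
    count = 0

    def tracer(frame, event, arg):
        nonlocal count
        # Only count events that belong to the compiled snippet itself.
        if frame.f_code is code and event == "line":
            count += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        exec(code, {})
    finally:
        sys.settrace(previous)
    return count

one_line = line_events("a = 1; b = 2\n")
two_lines = line_events("a = 1; \\\nb = 2\n")
print(one_line, two_lines)
```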


pminaev wrote Dec 20, 2013 at 8:17 PM

The slowness is indeed because the interpreter is in trace mode, but also because the trace function is itself written in Python (unless you're using mixed-mode), which slows it down even further.

The necessity of a trace func is not just for the sake of exceptions. It's also necessary to allow breaking randomly (Debug -> Pause), for breakpoints, and for all the stepping commands. The problem is that 1) if a particular frame did not have a trace handler enabled, its current line info will be skewed, and 2) Python does not provide any supported means of lazily registering the handler only when needed, because there's no way to walk the other threads' frameobject chains from Python code.

Note also that we are somewhat limited by a desire to have a debugger implementation that works with any Python implementation that uses a CPython-compatible API - that's why all debugger tracefunc logic is itself written in Python. So we can't rely exclusively on implementation details like CPython bytecode, since that won't work in PyPy, Stackless, IronPython etc. Having said that, we could have two different code paths in some places, one CPython-specific and optimized, another generic and slow.

We have discussed some of this before. If we decide to optimize CPython only, then we can get line information from the current instruction pointer based on co_lnotab, and ignore f_lineno completely. And for breakpoints, we can patch bytecode of the code object and insert a call instruction at the desired point that would invoke our hook, instead of comparing line numbers on every step in a tracefunc. The result would be zero-overhead debugging where you only pay for things that you use (i.e. setting a breakpoint would only make the func with that breakpoint slower, and only when it's actually hit).

There are some engineering problems with it - e.g. how to find the code object that corresponds to a given (filename,line) pair that defines a breakpoint? Python does not maintain a list of code objects anywhere in memory, unfortunately, so there's no way to enumerate them all. And enumerating modules and functions within them only gets you so far, because there are numerous ways of creating code objects that are not reachable from global scope in any well-defined way, if at all (nested functions, lambdas, eval etc).
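A small sketch shows both what is possible and where it stops. Nested code objects can be reached through co_consts of a freshly compiled module, so a (filename, line) pair can be mapped to a code object there. That finds a new copy, though, not the live code object already in memory, and it says nothing about code created via eval or exec at run time. The function names are hypothetical.

```python
import dis

def iter_code_objects(code):
    """Yield code and every code object nested in its constants, parents first."""
    yield code
    for const in code.co_consts:
        if isinstance(const, type(code)):
            yield from iter_code_objects(const)

def find_code_for_line(source, filename, lineno):
    """Return the most deeply nested code object that has bytecode on lineno, or None."""
    best = None
    for code in iter_code_objects(compile(source, filename, "exec")):
        lines = {ln for _, ln in dis.findlinestarts(code) if ln is not None}
        if lineno in lines:
            best = code  # nested objects are visited after their parents, so they win
    return best

SAMPLE = """\
def outer():
    def inner():
        return 42
    return inner
"""

print(find_code_for_line(SAMPLE, "<demo>", 3).co_name)
```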

pminaev wrote Dec 20, 2013 at 8:24 PM

As far as code goes, you will probably want to start from this:


In particular, look at handle_exception and report_exception functions, and ExceptionBreakInfo class.

Note that this already does walk the source code to find exception handlers (to determine whether the exception should be considered handled or unhandled). Or rather, it tells VS to do the walk, and then goes over the returned handler list and checks exception types. The entrypoint for the bit on VS side which walks the code is HandleRequestHandlers here:


Zooba wrote Dec 20, 2013 at 8:51 PM

I'll also add that the exception handlers are only searched when an exception is being raised, and the results for each file are cached on the Python side to avoid calling back into VS.

Most of the slowdown is due to calling a function on each line, but as pminaev says, this is unavoidable unless we stop supporting non-CPython interpreters (and quite possibly not all versions, and certainly not custom builds - mixed-mode debugging is limited to 2.7.[4+] and 3.3 for this reason).

Zooba wrote Dec 20, 2013 at 8:53 PM

It would probably also break cross-platform debugging, because right now we can deploy the script to a Linux or MacOS box and it just works. A C extension will probably not.