Debugging is slow in some cases

Sep 7, 2011 at 11:52 AM

Hi,

The difference in speed between debugging and running from the command line seems to be pretty big when using Suds (a SOAP library for Python). Creating a SOAP client (e.g. 'client=Client(url_of_wsdl)') is almost instantaneous when run from the command line, but takes about 50 seconds when running from within Visual Studio.

I understand that debugging will always be somewhat slower than running from the command line, but the difference is quite large here. I also tried to run the same code from within NetBeans 7.0.1, this takes about 3 seconds, which is perfectly acceptable.

Is there anything I can do to speed up VS?

Best regards, Berend Veldkamp

Coordinator
Sep 8, 2011 at 2:09 AM

If you open the Exceptions dialog (found under the Debug menu) and uncheck the topmost Python exceptions item you may see some improvement.

Typically, slow code under debugging is the result of lots of exceptions being thrown and handled. Because we want to break on an unhandled exception before the stack unwinds, we analyse the source files to determine (well, make a pretty good guess) whether there is a handler for it (if we let Python unwind until we're certain that there are no handlers, we've already lost the stack frame and local variables where the exception was caused).
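
For illustration, here's a minimal sketch (not our actual implementation; the names are made up) of how a settrace-based debugger gets to see an exception at the point it is raised, while the frame and its locals are still alive:

import sys

def trace(frame, event, arg):
    if event == 'exception':
        exc_type, exc_value, tb = arg
        # The raising frame and its locals are still alive here; once the
        # exception propagates out of the frame, they are gone.
        print 'raised %s in %s at line %d, locals=%r' % (
            exc_type.__name__, frame.f_code.co_name,
            frame.f_lineno, frame.f_locals)
    return trace

def divide(a, b):
    return a / b

sys.settrace(trace)
try:
    divide(1, 0)
except ZeroDivisionError:
    pass
sys.settrace(None)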

An alternative is running without debugging (Ctrl+F5), but in this case you don't get breakpoints or the other debugging features.

Out of interest, is Suds installed in your site-packages directory? We may be able to provide an option to assume exceptions thrown inside Python\Lib are handled (at least until they reappear at user code).

Sep 8, 2011 at 7:07 AM

Zooba,

Thanks for your fast reply.

None of the Python exceptions is checked, so I don't think that's the problem.

Ctrl+F5 is an option, but that kind of defeats the purpose of using VS in the first place.

I simply installed Suds with Python setuptools; the egg is installed in the site-packages folder.

Coordinator
Sep 8, 2011 at 10:27 AM
bveldkamp wrote:

None of the Python exceptions is checked, so I don't think that's the problem.

Even the 'User-unhandled' column? If that's unchecked we shouldn't be doing anything to slow down debugging, at least if the cause is lots of exceptions.

Sep 8, 2011 at 12:04 PM

So that column was missing (http://vaultofthoughts.net/UserUnhandledCheckboxMissingFromExceptionsWindow.aspx)... After resetting it, I see that all Python exceptions have User-unhandled checked, but they are also disabled, so I cannot uncheck them.

Thanks for your patience.

Coordinator
Sep 8, 2011 at 9:53 PM

This sounds like some sort of configuration issue, which a reinstall (maybe just of Python Tools, possibly of VS) may fix, but I'd prefer to solve it rather than just fall back on that :)

Do you know which set of settings you selected on the first run (General Developer, C++ Developer, etc.)? (If you don't, does your New Project dialog have an "Other Languages" category, and if so, which language isn't in there?) The Just-My-Code setting is off by default if you select C++ settings, but once it's turned on it all works fine (for me, at least).

Can I clarify whether all Python exceptions have User-unhandled checked? There should be five that aren't (AttributeError, GeneratorExit, IndexError, KeyError and StopIteration).

Could you also check that the Software\Microsoft\VisualStudio\10.0_Config\AD7Metrics\Exception\{EC1375B7-E2CE-43E8-BF75-DC638DE1F1F9}\Python Exceptions\State registry value (probably in HKEY_LOCAL_MACHINE, but maybe CURRENT_USER, depending on how you installed it) is 0x4020? Here's a cmd.exe copy-paste for you:

reg query "HKLM\Software\Microsoft\VisualStudio\10.0_Config\AD7Metrics\Exception\{EC1375B7-E2CE-43E8-BF75-DC638DE1F1F9}\Python Exceptions" /v State
reg query "HKCU\Software\Microsoft\VisualStudio\10.0_Config\AD7Metrics\Exception\{EC1375B7-E2CE-43E8-BF75-DC638DE1F1F9}\Python Exceptions" /v State

Assuming all that checks out, we'll probably need to grab some more info about your configuration - John should have a script that can help with that. Meanwhile I'll install suds and see if I can reproduce the perf issues.

Coordinator
Sep 8, 2011 at 10:07 PM
Edited Sep 9, 2011 at 2:53 AM

Actually, here's a simpler question: do you have PTVS 1.0 final installed? I can reproduce everything perfectly with beta 2, including the User-unhandled column being completely disabled.

The performance issue was known with beta 2 and was fixed for RC1, and breaking on user-unhandled exceptions wasn't added until a later version.

The easiest way to check the version is through the Programs and Features dialog. If the product version starts with anything other than "1.0" (beta 2 was 0.8.40406.0, for example), hit uninstall, and vote for this issue about putting the version number in the filename: http://pytools.codeplex.com/workitem/488.

Sep 12, 2011 at 7:06 AM

Zooba,

I fixed it by installing the latest version of PTVS. I had checked the version earlier, but I only looked at Help|About in VS, which claimed 1.0, so I thought I was on the latest version already. The Add/Remove Programs dialog showed 0.8.40406.0.

Thanks again for your fast support.

May 29, 2013 at 8:46 PM
Hi sorry for digging up this old thread,

I just spent a whole day trying to figure out why some ported code was running so much slower in Python than the original in Matlab.
I finally figured out that it was Visual Studio, slowing down my code by a factor of 10 when using F5 compared to Ctrl+F5! (numpy math functions being slower on scalars than their math module equivalents accounted for another factor of 4.)
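
For reference, a quick micro-benchmark of the scalar difference (the exact ratio will vary by machine):

import timeit

# math.exp vs numpy.exp on a plain Python float
t_math = timeit.timeit('math.exp(1.5)', setup='import math', number=1000000)
t_np = timeit.timeit('numpy.exp(1.5)', setup='import numpy', number=1000000)
print 'math.exp: %.3fs, numpy.exp: %.3fs, ratio: x%.1f' % (t_math, t_np, t_np / t_math)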

I have a few questions:
1) Does this sound reasonable? Or do you think I have something misconfigured?
2) I tried turning off the first few exceptions as was suggested by Zooba, but this didn't make any difference. Which exceptions should I focus on switching off?
3) Is there any easy way to toggle on/off the analysis of exceptions in VS, without affecting the breakpoints?

I have the latest versions of VS2012 Ultimate and PyTools 1.5 on Windows 7 x64, and have confirmed that the correct 5 exceptions have User-unhandled unchecked.
My New Project dialog has templates for: VB/C#/C++/F#/SQL Server/Documentation/Python/LightSwitch/Other project types/Modeling Projects.
May 30, 2013 at 3:19 AM
Edited May 30, 2013 at 3:34 AM
Since I haven't gotten a reply yet, here's some somewhat simplified source code reproducing the issue.
On my PC, running with F5 takes 10x longer than running with Ctrl+F5.

Unchecking user-unhandled for ALL python exceptions does not fix the problem, so it feels like it may be a configuration error or bug.
from scipy import constants as spConsts
from scipy.integrate import quad
import numpy as np
import math
import timeit
import os

class semiconductor(object):
    def __init__(self):
        self.Eg=1.424                   # energy bandgap of GaAs [eV]
        self.mc=.0665*spConsts.m_e      # electron effective mass in GaAs [kg]
        self.T=300.0                    # temperature [K]
        self.Etol=20*spConsts.k*self.T
    def concentrationE(self,Fc):
        """ electron concentration"""
        Fc=float(Fc) # numpy scalars are slow, avoid them like the plague. Commenting this line out gives a 5x slowdown
        integrand=lambda E:self.DOS(E,self.mc)*self.fermi(self.Eg+E,Fc)
        integral,err=quad(integrand,0,Fc-self.Eg+self.Etol)
        N=integral*spConsts.e
        return N
              
    def DOS(self,E,m):
        """ Density of states in bulk"""
        rho0=1.0/2/np.pi**2*(2*m/spConsts.hbar**2)**1.5
        try:
            # assume E is scalar
            rho=rho0*math.sqrt(spConsts.e*E)
        except ValueError:
            # assume E<0
            rho=0.0
        except TypeError:
            # assume E is numpy.ndarray
            rho=rho0*np.sqrt(spConsts.e*E*(E>0))
        return rho
        
    def fermi(self,E,F):
        """ Fermi distribution given energy and fermi level [eV] and temperature [K]"""
        try:
            # Assume E and F are both scalars... math.exp(x) is 5x faster than numpy.exp(x) if x is a scalar
            f=1.0/(1.0+math.exp((E-F)*spConsts.e/spConsts.k/self.T))
        except TypeError:
            # Assume E and F are numpy arrays
            f=1.0/(1+np.exp((E-F)*spConsts.e/spConsts.k/self.T))
            f[np.logical_and(self.T==0,E==F)]=1 # plain 'and' would short-circuit on the scalar and mis-index the array
        except OverflowError:
            # the exponential overflowed, so the denominator is effectively infinite
            f=0.0
        except ZeroDivisionError:
            f=1 if E<=F else 0            
        return f  
        
if __name__=="__main__":
    currModule=os.path.split(os.path.splitext(__file__)[0])[-1]
    setup='from %s import semiconductor; obj=semiconductor(); import numpy; Fc=numpy.array([1.5])'%currModule
    n=5
    m=1000
    t0=min(timeit.Timer('obj.concentrationE(Fc)',setup=setup).repeat(n, m))/m
    print 'concentrationE took ' + str(t0*1000) + ' ms'
    ans=raw_input("Press any key to continue...")
Coordinator
May 30, 2013 at 3:22 PM
This is basically the expected behaviour. When we attach our debugger, we use sys.settrace, which is known to slow programs down. The issue discussed earlier in this thread was resolved in PTVS 1.0, so it doesn't really apply.
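
You can get a feel for the raw settrace cost with a no-op trace function (a rough sketch; this measures just the tracing overhead, without any of the per-event work a real debugger does):

import sys
import time

def noop_trace(frame, event, arg):
    return noop_trace  # keep receiving 'line' events in every frame

def busy_loop(n):
    total = 0
    for i in xrange(n):
        total += i
    return total

start = time.clock()
busy_loop(1000000)
plain = time.clock() - start

sys.settrace(noop_trace)
start = time.clock()
busy_loop(1000000)
traced = time.clock() - start
sys.settrace(None)

print 'untraced %.3fs, traced %.3fs, slowdown x%.1f' % (plain, traced, traced / plain)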

That said, unchecking the exceptions that are being thrown should slightly improve performance, though I would be very surprised if it recovered the 10x difference. We do have one known issue with dictionaries, but that doesn't seem to apply to your code.

It's likely to be numpy interacting poorly with the debugger, but we'll need to investigate further to figure out why, and even then it may not be resolvable. (You could also try testing with our 2.0 alpha to see whether it has improved at all. It's been out for over two months now and seems to be pretty stable.)
May 30, 2013 at 3:45 PM
Hi Zooba,

Thanks for your reply!
In this case, the exceptions are basically not doing anything. I think the code should run fine without any of the except blocks; I just put them in for completeness. In that case, the only function called that isn't part of the standard library is scipy.integrate.quad.

In Matlab, the debugger doesn't seem to slow execution down at all. Would it be possible to provide an option in PTVS to use a more lightweight debugger, with a reduced feature set if need be, but with close to (i.e. within 20-30% of) native performance, like Matlab?
Coordinator
May 30, 2013 at 4:05 PM
Matlab has an advantage in that they control their language, and they don't actually debug inside most of the functions. I haven't looked into the specific functions you're using, but we will (at least try to) debug any Python code. We also don't have the advantages of hardware support that languages like C++ use - we have to check every statement as it happens to see whether it has a breakpoint set. Basically, getting close to native Python performance is unlikely to happen.

We do have the option to implement the Python parts of our debugger in C, but we're hesitant to do this because it adds a huge compatibility burden. Currently, our debugger is portable to any Python interpreter (we can even run that part of our debugger on Linux or MacOS), as well as being patchable by users.

However, we are hoping to ship some new debugging support shortly that may work better for you, especially if you only need limited functionality. Keep an eye out for our Beta release...
May 30, 2013 at 5:24 PM
Edited May 30, 2013 at 5:25 PM
I see, thanks for the detailed explanation.

It sounds like there are several places you could focus on to get better performance. You might like to look at the PyDev debugger; judging by the accepted answer in this Stack Overflow thread (from a PyDev developer), they have optimized it quite well:

http://stackoverflow.com/questions/9346622/what-determines-debugger-run-time-performance

Anyway, I'll definitely check the beta as soon as it's out and report back here if there's a big speedup.
Coordinator
May 30, 2013 at 5:41 PM
Note that one of the things this SO comment calls out is that "making a compiled module for the debugger could probably also make it faster". This is indeed correct, and in fact gives a very significant perf boost right away, both because the trace function itself is that much faster, and because trace/profile hooks in CPython are actually optimized for native rather than Python callbacks. Of course, as he notes right away, this makes it work only with Python implementations that support its extensibility API.

Now, as part of the work to implement mixed-mode debugging (https://pytools.codeplex.com/workitem/210), we are actually coming up with a new debugger implementation (only for mixed mode, not for the regular one - but there's nothing precluding you from using it to debug just Python code) that is 100% native code, with the performance increase that entails. It will be limited in other ways, due to various restrictions that dealing with native code places on how we can interact with the Python runtime, but if performance is critical you might find it useful regardless. I'll try to run your code using it and will get back to you with numbers, so that you know what to expect.
May 30, 2013 at 6:31 PM
Yes please, I would appreciate your benchmarks comparing mixed/normal/no debugger with my code, thank you!

Note: Fabio also says in the post below that not tracing files that have no breakpoints gave the biggest speedup in PyDev:
http://pydev.blogspot.jp/2005/10/high-speed-debugger.html
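
If I understand that post right, the idea is roughly this (a sketch with made-up names, not PyDev's actual code): the global trace function returns None for frames whose file has no breakpoints, so Python never generates 'line' events for them:

import sys

# hypothetical breakpoint table: filename -> set of line numbers
BREAKPOINTS = {'myscript.py': set([42])}

def global_trace(frame, event, arg):
    if frame.f_code.co_filename not in BREAKPOINTS:
        # Returning None means this frame gets no local trace function,
        # so its code runs without per-line tracing (only the per-call
        # overhead of this check remains).
        return None
    return local_trace

def local_trace(frame, event, arg):
    lines = BREAKPOINTS.get(frame.f_code.co_filename, ())
    if event == 'line' and frame.f_lineno in lines:
        pass  # a real debugger would suspend the thread here
    return local_trace

sys.settrace(global_trace)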

P.S. Would the mixed-mode debugger work with Cython code at all? Currently, if I set a breakpoint anywhere in a class that calls Cython code from an external .pyx module, I get an error that "The current thread is not currently running code or the call stack could not be obtained". That is, even if the breakpoint is not in the Cython file, debugging still fails.
Coordinator
May 30, 2013 at 7:15 PM
I've got some interesting numbers from this code. With the current code, I'm seeing an x3.75 slowdown with the debugger attached; however, the vast majority of that comes from exception reporting for things like AttributeError and IndexError. If I disable the code that catches exceptions, though, the slowdown is reduced to x1.5 (and you still get breakpoints with that). Unfortunately, I don't see a way to optimize the exception code - the cost here comes not from reporting the exception, but from pausing the process when it is reported (basically, whenever PyErr_SetObject returns). The only thing I can think of here is to provide a switch that disables exception support.
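
For instance, the fermi method from the listing above could do its scalar/array dispatch up front instead of via try/except (a sketch, assuming the same spConsts/np/math imports as the original listing; note that some of the exception traffic here comes from inside scipy itself, so this alone won't eliminate all of it):

import math
import numpy as np
from scipy import constants as spConsts

# drop-in replacement for semiconductor.fermi
def fermi(self, E, F):
    """ Fermi distribution, with the edge cases checked explicitly instead of
    relying on exceptions for control flow"""
    if isinstance(E, np.ndarray) or isinstance(F, np.ndarray):
        f = 1.0/(1.0 + np.exp((E - F)*spConsts.e/spConsts.k/self.T))
        f[np.logical_and(self.T == 0, E == F)] = 1
        return f
    if self.T == 0:
        return 1.0 if E <= F else 0.0
    x = (E - F)*spConsts.e/spConsts.k/self.T
    if x > 709.0:  # math.exp overflows just past exp(709)
        return 0.0
    return 1.0/(1.0 + math.exp(x))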
Coordinator
May 30, 2013 at 7:26 PM
Mixed-mode should work with any kind of native code that can be handled by the VS native debugger, really - which is pretty much anything (if it doesn't have symbols for the called native function, it will still let you step in and will show assembly).

Cython in particular ultimately just produces C, so it should work. Furthermore, if you generate C code using --line-directives, then PDB files produced by the C compiler should map directly to your Cython source code, so that's what you'll be seeing when you step from Python into Cython. It will still show local variables etc as they exist in generated C code, though, and stepping will have C semantics. When using mixed-mode, in C/C++ (and therefore Cython) you will also get Python projections of PyObject and friends for many common built-in Python types, so that should help a little bit.

Dedicated Cython debugging support (e.g. unmangling function and variable names in Call Stack and Locals, hiding all the scaffolding etc) is definitely doable, and it is something that we're considering for future releases, but it won't be there for the first version of mixed-mode debugging. We have a feature request for Cython support as far as editing, Intellisense etc goes (https://pytools.codeplex.com/workitem/542), but it doesn't mention debugging right now, and I think this should be tracked as a separate feature, so feel free to create a new feature request in the tracker!
Coordinator
May 30, 2013 at 9:04 PM
apperception, I've created a feature request for this: https://pytools.codeplex.com/workitem/1228 (disable exception support).
May 31, 2013 at 12:20 AM
pminaev, thanks for the numbers! By x3.75 slowdown with "the current code" I assume you're talking about the mixed-mode debugger. Can you please confirm that you also see ~10x slowdown when using the standard debugger, to exclude differences between our two setups?

As part of the option to disable exception support, could you squeeze more out by doing something like the approach in the link from my last post? It would be nice if the code ran at essentially native speed when this option is enabled, F5 is pressed, and there are no breakpoints.
Coordinator
May 31, 2013 at 12:42 AM
Yes, I do see a significant slowdown with our current debugger, about x10.8; and x3.75 is indeed with the upcoming mixed-mode debugger in its present shape.

I don't think there are considerable gains to be had by implementing the scheme proposed in that SO post for a native debugger - it already has an extremely low overhead for breakpoint checks, especially when you have no active breakpoints (it would then amount to a single null pointer check) - the overhead of the Python bytecode interpreter loop will dwarf that. The reason why the overall overhead is still considerable is that, when a trace function is registered, Python disables certain optimizations that it otherwise has in the aforementioned interpreter loop, and it also needs to start keeping track of line numbers to report them. This can only be fully prevented by not registering a trace function at all, on any frame in the stack - but we need it for breakpoints, even if it just immediately returns after a quick check.

I do have some ideas about how breakpoints could be implemented without relying on trace func to give virtually zero-overhead debugging (even with breakpoints set!), but it's only a vague notion and not a detailed plan, and in any case it would be a major feature requiring considerable development time - so there's no chance of that getting there in this release.
Dec 12, 2013 at 3:53 PM
It's been a few months since this thread was active, and new versions of both VS and PTVS have come out since. I would just like to confirm that I'm seeing a similar slow-debugging issue with VS 2012 and PTVS 2.0 (with numpy in my code as well, though I have a feeling the issue would remain even without numpy; user-unhandled exceptions are all unchecked).
It would be nice for the debugger to speed up a little.