[feedback-request] Remote debugging API design review

Coordinator
Jan 25, 2013 at 10:07 PM
Edited Jan 25, 2013 at 10:21 PM

We are implementing the cross-OS remote debugging feature (http://pytools.codeplex.com/workitem/536) for PTVS 2.0. As in other Python IDEs such as PyDev and Komodo, the code being debugged must be modified so that it can establish a connection with the debugger. We would like to get community feedback on the API that will be used for this.

Here's a minimal code sample demonstrating the necessary modifications:

import ptvsd
ptvsd.enable_attach(secret = 'joshua')
ptvsd.wait_for_attach() # optional
# Your code follows
...
ptvsd.break_into_debugger() # explicit breakpoint
...

After doing the above, you can pick the new "Python remote debugging" transport in the "Attach to Process" dialog in VS, specify the secret and hostname in the Qualifier textbox, and attach.

Note two things here. First of all, unlike PyDev, the direction in which the connection is established is reversed: the debugged program is the server, and the IDE is the client that connects to that server. This allows connecting to the same script from different machines without having to edit it to update the hostname of the IDE debugging server. To ensure that only authorized users can connect, the debugging server requires you to specify a secret value, and only users who provide that value when attaching are allowed to debug the process. This check can be explicitly disabled if desired.

The other difference is that the core API function - enable_attach - registers the necessary trace handlers and launches the debugging server on a background thread, but it does not block program execution. So, after you call it, your script continues to run as usual - except that now you can attach to it at any later time. Furthermore, the debugging server keeps running, which means that you can detach from that process and re-attach at will.
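
To make the "server on a background thread, gated by a secret" idea concrete, here is a minimal sketch using only the standard library. It is not the ptvsd implementation or its wire protocol: the function name enable_attach_sketch, the one-line handshake, and the ACCEPTED/REJECTED replies are all invented for illustration.

```python
import socket
import threading


def enable_attach_sketch(secret, address=("127.0.0.1", 0)):
    """Start a listener on a daemon thread and return immediately.

    Hypothetical stand-in for illustration only; the real debugging
    server speaks its own protocol, which is not shown here.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(address)
    server.listen(1)
    attached = threading.Event()

    def serve():
        # Keep accepting so that a client can detach and re-attach later.
        while True:
            try:
                conn, _ = server.accept()
            except OSError:
                return  # the listening socket was closed
            with conn:
                offered = conn.recv(1024).decode("utf-8", "replace").strip()
                if secret is None or offered == secret:
                    attached.set()
                    conn.sendall(b"ACCEPTED\n")
                else:
                    conn.sendall(b"REJECTED\n")

    threading.Thread(target=serve, daemon=True).start()
    return server, attached  # the caller keeps running right away
```

The call returns as soon as the socket is listening, so the script carries on while the daemon thread waits for clients. Passing secret=None mirrors the "no validation" mode described in the docstring further down.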

For scenarios where it is desirable to wait until a debugger is attached, a separate function - wait_for_attach - is provided. 

For scenarios where the process being debugged is on a machine that is physically on a different network, or on the same network which is not guaranteed to be secure from eavesdropping and MITM attacks, the debugging server supports SSL connections - in this mode, a certificate file and a key file have to be provided to enable_attach (same format as used by the standard ssl module).
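
As a rough illustration of what the certificate and key files feed into, the sketch below builds a server-side TLS context with the standard ssl module. The helper name make_server_context is hypothetical and this is not how ptvsd is known to do it internally. Note also that ssl.wrap_socket, which the docstring below refers to, was removed in Python 3.12; SSLContext is the current equivalent.

```python
import ssl


def make_server_context(certfile, keyfile=None):
    """Build a server-side TLS context from PEM files.

    Hypothetical helper for illustration. certfile and keyfile use the
    same formats the standard ssl module expects.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context


# A listening socket would then be wrapped roughly like this:
#   tls_listener = context.wrap_socket(listener, server_side=True)
```

If either file is missing or unreadable, load_cert_chain raises an error up front, so a misconfigured path fails at startup rather than at attach time.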

Here are the actual function definitions with their docstrings:

def enable_attach(secret, address = ('0.0.0.0', 5678), certfile = None, keyfile = None, redirect_output = True):
    """Enables Python Tools for Visual Studio to attach to this process remotely to debug Python code.

    The secret parameter is used to validate the clients - only those clients providing the valid
    secret will be allowed to connect to this server. On client side, the secret is prepended to
    the Qualifier string, separated from the hostname by '@', e.g.: secret@myhost.cloudapp.net:5678.
    If secret is None, there's no validation, and any client can connect freely.

    The address parameter specifies the interface and port on which the debugging server should listen
    for TCP connections. It is in the same format as used for regular sockets of the AF_INET family,
    i.e. a tuple of (hostname, port). On client side, the server is identified by the Qualifier string
    in the usual hostname:port format, e.g.: myhost.cloudapp.net:5678.

    The certfile parameter is used to enable SSL. If not specified, or if set to None, the connection
    between this program and the debugger will be unsecure, and can be intercepted on the wire.
    If specified, the meaning of this parameter is the same as for ssl.wrap_socket. 

    The keyfile parameter is used together with certfile when SSL is enabled. Its meaning is the same
    as for ssl.wrap_socket.

    The redirect_output parameter specifies whether any output (on both stdout and stderr) produced
    by this program should be sent to the debugger. 

    This function returns immediately after setting up the debugging server, and does not block program
    execution. If you need to block until debugger is attached, call ptvsd.wait_for_attach. The debugger
    can be detached and re-attached multiple times after enable_attach is called.
    """

 

def wait_for_attach():
    """If a PTVS remote debugger is attached, returns immediately. Otherwise, blocks until a remote
    debugger attaches to this process.
    """

 

def break_into_debugger():
    """If a PTVS remote debugger is attached, pauses execution of all threads, and breaks into the
    debugger with current thread as active."""
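
The Qualifier format described in the enable_attach docstring (an optional secret, then '@', then hostname:port) can be split as in the sketch below. The function parse_qualifier is hypothetical, and falling back to port 5678 when none is given is an assumption borrowed from the default address in the signature above. IPv6 literals are not handled.

```python
def parse_qualifier(qualifier, default_port=5678):
    """Split 'secret@host:port' into (secret, host, port).

    Hypothetical helper. The secret is None when there is no '@';
    default_port is an assumed fallback when no ':port' is given.
    """
    secret, at, hostport = qualifier.rpartition("@")
    if not at:
        secret = None
    host, colon, port = hostport.rpartition(":")
    if not colon:
        host, port = hostport, default_port
    return secret, host, int(port)
```

For example, parse_qualifier("secret@myhost.cloudapp.net:5678") yields ("secret", "myhost.cloudapp.net", 5678), while "myhost.cloudapp.net:5678" yields a secret of None.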

 

 


Jan 28, 2013 at 11:43 AM

I think this looks pretty much as expected.

A few points though:

* About threading support, the break_into_debugger() docstring mentions that all threads will be stopped. Perhaps you can clarify this in the documentation for "enable_attach"?

* The name "enable_attach" does exactly what it says, but I think it's a bit counterintuitive, since all existing packages call it "settrace". I strongly advise you to name it that too.

* Perhaps a timeout value would be wise for the wait_for_attach()?

 

All in all, I say go for it! Good job guys!

Coordinator
Jan 28, 2013 at 7:01 PM
mwesterdahl76 wrote:

I think this looks pretty much as expected.

A few points though:

* About threading support, the break_into_debugger() docstring mentions that all threads will be stopped. Perhaps you can clarify this in the documentation for "enable_attach"?

* The name "enable_attach" does exactly what it says, but I think it's a bit counterintuitive, since all existing packages call it "settrace". I strongly advise you to name it that too.

* Perhaps a timeout value would be wise for the wait_for_attach()?

 

All in all, I say go for it! Good job guys!

Thanks for the review!

Regarding naming, I'd prefer to keep the existing function name for the sake of clarity and discoverability for people who haven't used the equivalent feature in other IDEs (and I think that reusing the name of sys.settrace is unfortunate, because the semantics differ a lot). But we can certainly add an alias for the convenience of those used to it.

For threading behavior, now that I think of it, it's definitely worth documenting for enable_attach, if only because there is a non-obvious catch: you can only debug the thread you call it on and any threads started after the call. The debugger won't even see threads that already existed before. This probably won't affect many people, since enabling debugging is normally the very first thing a script does, but someone can always run into it and be surprised.
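
The standard library shows the same kind of limitation: threading.settrace installs a trace function only for threads started after it is called. The sketch below demonstrates that with plain threading. It is an analogy, not a description of how ptvsd registers its handlers, and run_demo is a made-up name.

```python
import threading


def run_demo():
    """Return the names of the threads whose code reached the tracer."""
    traced = set()

    def tracer(frame, event, arg):
        traced.add(threading.current_thread().name)
        return None  # no per-line tracing needed for this demo

    started = threading.Event()
    release = threading.Event()

    def early_work():
        started.set()
        release.wait()  # stay alive until tracing has been enabled

    early = threading.Thread(target=early_work, name="early")
    early.start()
    started.wait()

    # Only threads started from here on pick up the trace function.
    threading.settrace(tracer)
    try:
        late = threading.Thread(target=lambda: None, name="late")
        late.start()
        late.join()
    finally:
        threading.settrace(None)

    release.set()
    early.join()
    return traced
```

Running run_demo() records the "late" thread but not the "early" one, even though "early" is still executing Python code after tracing was switched on.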

A good point about timeout when blocking, and easy to implement, too - definitely worth adding.

Feb 2, 2013 at 7:19 AM
Edited Feb 2, 2013 at 7:19 AM
Thanks for emailing me, I just looked over this. This is great! I have a couple of questions:

1) How was the decision to reverse the usual behavior made? I ask because while making the code being debugged the server has many benefits, it also has downsides: for example, remote servers are usually restricted in the ports they can have open.

2) Would it be possible to support debugging without modification to the code? I'd love to be able to use Visual Studio's click-on-the-sidebar interface to set, disable and clear breakpoints rather than use the ipdb-esque approach of sprinkling set_traces everywhere and having to insert debug-related imports and dependencies into server side code.

If some of this stuff is actually supported already, please disregard. Wonderful work! I've been waiting for progress on this feature request for a long time. :)
Coordinator
Feb 2, 2013 at 8:59 PM
AphexSA wrote:
1) How was the decision to reverse the usual behavior made? I ask because while making the code being debugged the server has many benefits, it also has downsides: for example, remote servers are usually restricted in the ports they can have open.
There were several reasons.

The first one is basic consistency with the existing VS C++/.NET remote debugging story - where you go to Debug -> Attach to Process, and type the address of the machine running msvsmon. We actually extend the same dialog, providing a new Python remote debugging transport alongside standard ones for msvsmon debugging. Obviously, this model requires a server to be running for VS to connect to.

The second is that connecting from VS to the debugged program means that the latter doesn't need to have any details that are specific to the machine running the debugger/IDE, such as its IP address. So you can enable attaching once, and have different people connect from various machines without mucking around with your script every time.

The last one is that the script-as-server model allows attaching to a script that's already running (and is not blocked waiting for a debugger), and supports detaching and re-attaching the debugger at will.

The issue with ports seems like it would apply equally to the machine running the IDE - and then, of course, there's the issue with NAT (and I would expect developer machines to be behind NAT more often than servers). Either way, in the real world, every combination of where ports are open and where they're not can come up. It is certainly possible to allow the script to connect to the IDE, instead of the other way around (indeed, we already do it that way for local debugging - it actually uses the same protocol, and connects using sockets). The tricky part is making the UI look right, since it would have to be our own, done completely from scratch, rather than just integrating into the Attach to Process dialog that's already there.
2) Would it be possible to support debugging without modification to the code? I'd love to be able to use Visual Studio's click-on-the-sidebar interface to set, disable and clear breakpoints rather than use the ipdb-esque approach of sprinkling set_traces everywhere and having to insert debug-related imports and dependencies into server side code.
You don't need to use ptvsd.break_into_debugger for breakpoints when debugging remotely - you can set them in the usual way, by clicking on the left margin (or via menus etc) - and they will light up once you attach; or you can set them after attaching, while the script is running. The function is there mainly to support the scenario where you want to break into the debugger as soon as you attach (usually immediately after wait_for_attach). It is intentionally more generic than that, however, so that it can also be used in other cases, such as when you need a conditional breakpoint.
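
A conditional breakpoint in this style is just an ordinary if around the call. In the sketch below, the break function is passed in as a parameter so the example runs without ptvsd installed. In a real script you would pass ptvsd.break_into_debugger, and total_positive is only a made-up example function.

```python
def total_positive(values, break_into_debugger):
    """Sum the non-negative values, breaking into the debugger on bad input.

    break_into_debugger is any zero-argument callable, for example
    ptvsd.break_into_debugger in a script that has enabled attach.
    """
    total = 0
    for value in values:
        if value < 0:
            # Conditional breakpoint: only stop when the data looks wrong.
            break_into_debugger()
            continue
        total += value
    return total
```

Since the docstring above says break_into_debugger acts only when a remote debugger is attached, leaving such a call in place should be harmless when nobody is debugging.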

We do need some way to enable our trace hooks to get debugging to work, so some way to inject the debugging module is necessary. I suppose it could be done without explicit imports in the script being debugged if we add the ability to use ptvsd module as the startup script, taking various debugging parameters (interface, port, secret etc) as command line arguments, setting up debugging, and then invoking the actual script that you pass to it as the final argument. Is that close to what you had in mind for "debugging without modification to the code"?
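
The launcher idea could look roughly like the sketch below: parse the debugging options, enable attach, then run the target script as __main__. Everything here is hypothetical, including the option names --interface, --port and --secret and the function launch. The enable_attach callable is injected so the sketch runs without ptvsd; a real launcher would pass ptvsd.enable_attach.

```python
import argparse
import runpy
import sys


def launch(argv, enable_attach):
    """Enable attaching, then run the target script as __main__.

    Hypothetical launcher; option names are invented for illustration.
    """
    parser = argparse.ArgumentParser(prog="ptvsd-launcher-sketch")
    parser.add_argument("--interface", default="0.0.0.0")
    parser.add_argument("--port", type=int, default=5678)
    parser.add_argument("--secret", default=None)
    parser.add_argument("script")
    parser.add_argument("script_args", nargs=argparse.REMAINDER)
    args = parser.parse_args(argv)

    enable_attach(args.secret, address=(args.interface, args.port))

    # Make the target script see its own argv, not the launcher's.
    saved_argv = sys.argv
    sys.argv = [args.script] + args.script_args
    try:
        return runpy.run_path(args.script, run_name="__main__")
    finally:
        sys.argv = saved_argv
```

With something like this, the script being debugged needs no debug-related imports. The trade-off is that the way the process is started has to change instead.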