_elementtree triggers a copy.Error that is incorrectly reported as unhandled

description

This bug is for Python 2.x only. A similar problem with an unrelated root cause also exists for Python 3.x in PTVS 2.0 - it is fixed in the current dev builds and will be fixed in 2.1

Just like this poster, I'm getting the "un(shallow)copyable object of type <type 'Element'>" Error thrown when I import from nltk

[pminaev] Original description of problem on StackOverflow:

I am dealing with a very silly error, and wondering if any of you have the same problem. When I try to import pandas using import pandas as pd, I get an error in copy.py. I debugged into the pandas imports, and I found that the copy error is thrown when pandas tries to import this:
from pandas.io.html import read_html 
The exception that is thrown is:
un(shallow)copyable object of type <type 'Element'>
I do not get this error if I just run the code without the PTVS debugger. I am using the Python 2.7 interpreter, pandas version 0.12 (which came with the Python(x,y) 2.7.5.1 distro), and MS Visual Studio 2012.

comments

shalgrim wrote Jan 3 at 10:24 PM

Fwiw, this happens only in VS2013, not VS2012

pminaev wrote Jan 9 at 11:24 PM

I tried to reproduce it, but I am unable to hit this code path. My repro code is:
import pandas as pd
print('ok')
I'm using WinPython - that is basically vanilla Python 2.7.5, with pandas 0.12.0.

When running this code, whether it is done under debugger or not, I don't observe that exception raised at all. I found the library code which throws it - it's standard module copy, and the only place where it raises the exception with this exact error text is the following line (in Lib\copy.py):
Error("un(shallow)copyable object of type %s" % cls)
However, setting a breakpoint on this line and running the code with library debugging enabled indicates that this particular line is never hit at all.

pminaev wrote Jan 9 at 11:25 PM

Can you please try to do the same thing - set a breakpoint at the line that raises the error in copy.py - and then, when it hits that line in your repro, provide the call stack at that point?

shalgrim wrote Jan 10 at 3:59 PM

Sure, here's my call stack (Python 2.7):
    copy in copy line 94    Python
    <exec or eval> line 10  Python
    cElementTree module line 3  Python
    internals module line 21    Python
    __init__ module line 91 Python
    sentence_splitter module line 6 Python
    document module line 2  Python
    danforth_20130919 module line 11    Python
Those bottom three lines are from my code and they're just imports.
The relevant one is sentence_splitter, line 6, where I have:
from nltk.tokenize.punkt import PunktSentenceTokenizer

pminaev wrote Jan 10 at 6:51 PM

Aha, I can see the line of code that raises the exception being hit on your repro! But the actual problem is still not occurring - the exception is not reported (as expected, since it's caught above) and program runs to completion.

Can you tell me more about your environment? I.e. which Python distro are you using, if any, and did you do anything special when registering it with PTVS (like adding it as a custom interpreter)?

shalgrim wrote Jan 13 at 9:11 PM

Hi pminaev,

I'm running Python 2.7.2, just the one you'd download from python.org for win64. I didn't do anything special to register it as a special interpreter. I also have DreamPie installed, but don't know why that would make a difference. As far as VS goes, I have VS2012 and VS2013 integrated shells installed. I don't get this problem on VS2012, just on VS2013. I'm attaching a screenshot of what extensions I have installed.

pminaev wrote Jan 16 at 7:23 PM

It may be another instance of the elusive exception issue that I keep trying to track down and debug...

Can you see if this still repros for you if you try using the most recent PTVS dev build instead of 2.0? There have been some fixes in exception code there - I can't think of why they would affect your repro, but then I can't think of why it would happen in the first place, and if we have already accidentally fixed it it would be good to know.

pminaev wrote Jan 16 at 8:55 PM

By the way, as a temporary workaround, you can disable breaking on this exception type in Debug -> Exceptions dialog (but this will also prevent breaking when a similar exception is genuinely unhandled).

shalgrim wrote Jan 23 at 6:57 PM

Working on installing latest dev build now. Do I need to uninstall the stable version first? How would I do that?

And that workaround doesn't work. There is no copy.Error option in the Debug -> Exceptions dialog, and I've unchecked the 'Break when this Exception type is thrown' option when it breaks there in the debugger and it still always breaks there.

Not too much trouble to hit F5 again, but I'll see what's going on in the latest dev build.

shalgrim wrote Jan 23 at 7:42 PM

I get the same behavior in the latest dev build.

shalgrim wrote Jan 23 at 7:44 PM

Pulled this out of PTVS Diagnostic Info FYI:
Environments:
Microsoft.PythonTools.Interpreter.ConfigurablePythonInterpreterFactoryProvider
Microsoft.PythonTools.Interpreter.CPythonInterpreterFactoryProvider
    Id: 9a7a9026-48c1-4688-9d5d-e5699d47d074
    Factory: Python 64-bit 2.7
    Version: 2.7

pminaev wrote Jan 23 at 7:49 PM

I think you should be able to just install over and it'll uninstall the old one automatically. Either way it's safe to uninstall manually before, since that's what the installer will be doing anyway. You can uninstall via Control Panel -> Programs and Features.

Yup, the Exceptions dialog has only the standard/builtin types listed by default. But you should be able to add your own via the "Add..." button on the right.

pminaev wrote Jan 31 at 7:46 PM

So basically anything that will cause the standard native module _elementtree to be imported will cause this. The quickest way to repro is from xml.etree import cElementTree. It will only repro if the option to debug the Python standard library (in Tools -> Options -> Python Tools -> Debugging) is enabled.

pminaev wrote Jan 31 at 7:46 PM

A more detailed description of the problem:

Python doesn’t have the concept of first-pass and second-pass for exceptions the way most other languages do (where first pass walks the handlers and determines whether there is a matching one, and second pass actually unwinds the stack up until the handler). So at the point where the exception is raised, there’s no way to tell if it’s handled or not without letting it fly and seeing if it lands in a handler. So PTVS tries to figure it out by itself by looking at the call stack, and then parsing the source code corresponding to frames on that stack to see if there are any “except” handlers in places where they would lexically be able to catch the exception, and whether the type would match.
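To make the lexical check described above concrete, here is a minimal sketch of the idea using the standard ast module. It is not PTVS's actual implementation; the function name handler_names_covering and the sample sources are hypothetical, and it assumes Python 3.8 or later (for end_lineno). Given the source text for a frame and the line that frame is currently executing, it lists the exception types named by any try-statement whose body contains that line.

```python
import ast


def handler_names_covering(source, lineno):
    """Return the exception type names caught by try-statements whose
    body lexically contains `lineno` (1-based).

    Hypothetical helper, for illustration only. A bare `except:` is
    reported as 'BaseException'; any type expression that is not a plain
    name (or a tuple of plain names) is reported as '<dynamic>', because
    it cannot be resolved without evaluating code.
    """
    caught = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Try):
            continue
        first = node.body[0].lineno
        last = node.body[-1].end_lineno
        if not (first <= lineno <= last):
            continue  # the line is not inside this try-body
        for handler in node.handlers:
            if handler.type is None:
                caught.append("BaseException")
            elif isinstance(handler.type, ast.Name):
                caught.append(handler.type.id)
            elif isinstance(handler.type, ast.Tuple) and all(
                isinstance(e, ast.Name) for e in handler.type.elts
            ):
                caught.extend(e.id for e in handler.type.elts)
            else:
                caught.append("<dynamic>")
    return caught


# Hypothetical caller source: line 3 is inside the try-body, line 5 is not.
CALLER_SOURCE = """\
def load():
    try:
        helper()
    except ValueError:
        pass
"""

# Hypothetical source whose handler type is an arbitrary expression.
DYNAMIC_SOURCE = """\
try:
    helper()
except type(y):
    pass
"""

print(handler_names_covering(CALLER_SOURCE, 3))   # ['ValueError']
print(handler_names_covering(CALLER_SOURCE, 5))   # []
print(handler_names_covering(DYNAMIC_SOURCE, 2))  # ['<dynamic>']
```

A debugger would run a check like this for each frame on the stack and then compare the names against the raised type. The sketch only works when the source text for the frame is available.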

Now, in this particular case, at the point where the exception is raised in copy.py, there’s no except-block – the expectation is that the caller will handle it. There is an except-block in the caller, but the calling source code does not come from a file, but rather is just a string eval’d by a native module – that’s why you’re seeing that <exec or eval> frame in the stack. This is actually courtesy of the _elementtree module, which is a native C module backing cElementTree; here is the try-except statement in question.

Unfortunately, there’s no way for us to get to that code. Even if it were a pure Python module, by the time the eval’d code runs, all we have is a code object, which does not contain any reference to the source code used to produce this, other than the filename – which is missing in this case. So we don’t see the handler, and assume unhandled for the lack of anything telling otherwise.
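The point about code objects can be illustrated with a small sketch. The string EVALD_SOURCE and the function safe_call are hypothetical stand-ins for whatever a native module passes to exec; they are not the actual code from _elementtree. The handler inside the compiled code works at run time, but the resulting code object only records a placeholder filename, so a tool inspecting the frame has no file to open and parse.

```python
import os

# Hypothetical stand-in for source text that a native module hands to exec.
EVALD_SOURCE = """\
def safe_call(func):
    try:
        return func()
    except ValueError:
        return None
"""

# Compile and run the string the way an embedding module might.
code = compile(EVALD_SOURCE, "<string>", "exec")
namespace = {}
exec(code, namespace)
safe_call = namespace["safe_call"]

# The handler is real: the ValueError raised by int("x") is caught.
result = safe_call(lambda: int("x"))

# But the code object only carries a filename and line numbers, and the
# filename here does not name a file that can be read back.
func_code = safe_call.__code__
filename = func_code.co_filename

print(result)                    # None
print(filename)                  # <string>
print(os.path.isfile(filename))  # normally False: there is no such file
```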

There are some things we can do here to make the experience better. In the absence of source code for a given frame, we could try parsing its bytecode to locate any except-handlers – unfortunately, that would be CPython-specific, and would require a full-fledged bytecode evaluator to get the caught exception type for an arbitrary expression (since it’s perfectly legal to write something like “except type(y)” or even “except foo.bar.baz(blah)”), so this isn’t something that we have the resources to do in this release. It still won’t solve scenarios like Python code throwing and native code catching, though. Ultimately, we can only do so much with present Python exception semantics, which don’t play well with the concept of caught/uncaught at the point of raise.

Short-term, I don’t really have any ideas other than trying to special-case specific instances of this pattern in the standard library where we know that exception is actually handled, e.g. by looking at the list of frames on the stack at the point where it throws. So, in this case, if we see:
copy
<exec or eval>
_elementtree
on the top of the stack, then it’s handled. And it could be an extensible whitelist described by a config file somewhere in the PTVS install so that it could be tailored to your specific needs.
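A rough sketch of what such a whitelist check might look like follows. Everything in it is an assumption made for illustration: the names KNOWN_HANDLED and is_known_handled are hypothetical, the patterns are hard-coded rather than read from a config file, and frames are identified by name only, innermost first. The frame label is spelled as it appears in the call stack reported earlier in this thread.

```python
# Hypothetical whitelist: each entry lists frame names, innermost first,
# for a stack shape where the exception is known to be caught even though
# no except-handler is visible in source.
KNOWN_HANDLED = [
    ("copy", "<exec or eval>", "_elementtree"),
]


def is_known_handled(frame_names, patterns=KNOWN_HANDLED):
    """Return True if the top of the stack matches any whitelisted pattern.

    `frame_names` is a sequence of frame names, innermost frame first.
    """
    top = tuple(frame_names)
    return any(top[:len(pattern)] == pattern for pattern in patterns)


# The top three frames match the pattern, so the exception is treated as handled.
print(is_known_handled(["copy", "<exec or eval>", "_elementtree", "internals"]))  # True

# A copy.Error raised from ordinary user code would still be reported.
print(is_known_handled(["copy", "my_module"]))  # False
```

Matching only the top of the stack keeps the check cheap and keeps the whitelist narrow, so a genuinely unhandled copy.Error raised elsewhere is still reported.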