Cannot scrape pyd file

Feb 27 at 7:01 PM
I read the previous thread about a similar topic, but it does not apply to me. The pyd that cannot be scraped is not made by me, but it is part of the shapely library that comes with python in OSGeo. The problem in the IDE is that it permanently says that Completion DB needs refresh and it's not a permissions problem since I run VS elevated. The ? says
Database at C:\Users\User\AppData\Local\Python Tools\CompletionDB\12.0\2f139a99-a95e-4e5d-bf23-161306b96da0\2.7 does not contain the following modules:
shapely.speedups._speedups
shapely.speedups._speedups
In the AnalysisLog, I get this error:
2014-02-26T22:05:45: Scraping shapely.speedups._speedups
2014-02-26T22:05:45: Command: C:\OSGeo4W\bin\python.exe "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.0\ExtensionScraper.py" scrape shapely.speedups._speedups C:\OSGeo4W\apps\Python27\Lib\site-packages\Shapely-1.2.18-py2.7-win32.egg "C:\Users\User\AppData\Local\Python Tools\CompletionDB\12.0\2f139a99-a95e-4e5d-bf23-161306b96da0\2.7\site-packages_Shapely-1_2_18-py2_7-win32_egg \shapely.speedups._speedups"
2014-02-26T22:05:45: [WARNING] Errors
Traceback (most recent call last):
File "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.0\ExtensionScraper.py", line 66, in <module>
PythonScraper.write_analysis(output_path, analysis)
File "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.0\PythonScraper.py", line 964, in write_analysis
out_file = open(out_filename + '.idb', 'wb')
IOError: [Errno 2] No such file or directory: 'C:\Users\User\AppData\Local\Python Tools\CompletionDB\12.0\2f139a99-a95e-4e5d-bf23-161306b96da0\2.7\site-packages_Shapely-1_2_18-py2_7-win32_egg \shapely.speedups._speedups.idb'
2014-02-26T22:05:45: [ERROR] Failed to scrape shapely.speedups._speedups (Exit code: 1)
In the speedups folder there are these files:
__init__.py
__init__.pyc
_speedups.c
_speedups.py
_speedups.pyc
_speedups.pyd
If I remove _speedups.pyd from this folder everything is fine, but even though I don't think I'll need Intellisense from _speedups.pyd, it really bugs me that I have this error.

I also thought that maybe there's an error with the pyd and I tried to change it with a newer version, but it didn't work. Any help to debug this would be appreciated.
Coordinator
Feb 27 at 7:26 PM
The path we're trying to write the file to has a strange space in it:
...\site-packages_Shapely-1_2_18-py2_7-win32_egg \shapely.speedups._speedups"
                                                ^ here
We take that part of the name from the directory the package is installed into. Can you check whether there is a space at the end of the Shapely directory? If there is, you should be able to fix it immediately by removing the space. We can also filter it out at our end, though that won't help you until our next release.

If there isn't a space in the source path, let me know and I'll see how we may be introducing one.
Feb 27 at 8:43 PM
Actually, I also noticed the space, but there's no space in the actual path
C:\OSGeo4W\apps\Python27\Lib\site-packages\Shapely-1.2.18-py2.7-win32.egg\shapely\speedups

As you can see, there should be 2 things scraped, _speedups.py and _speedups.pyd. The first one gets scraped just fine, only with the pyd there's a problem. I also don't really understand how you can distinguish between these two.
Coordinator
Feb 27 at 9:43 PM
We don't distinguish between the two - _speedups.pyd should take priority and the .py should be skipped (it would be analyzed later in the process, but is already filtered out because of the .pyd). Python also uses the same ordering, so the behaviour should be identical to what really happens.

I'll add a fix to replace spaces with underscores (we already do this for other characters), which should be in our next release (and dev build). Hopefully that will resolve this, though since the original path does not have a space at the end I'm at a bit of a loss to explain where it came from.
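To illustrate the kind of fix I mean, here is a minimal sketch. It is not the actual PTVS code: the function name `completion_db_dir_name` and the exact set of replaced characters are assumptions, chosen only so that the result matches the directory name seen in the log above.

```python
import re

def completion_db_dir_name(relative_path):
    """Hypothetical helper: flatten a library-relative path into one directory name."""
    # Path separators, dots and (with the fix) whitespace all become underscores.
    return re.sub(r"[\\/.\s]", "_", relative_path)

# A trailing space in the source path no longer survives into the directory name.
name = completion_db_dir_name("site-packages\\Shapely-1.2.18-py2.7-win32.egg ")
print(repr(name))
```

With whitespace in the replaced set, a stray trailing space turns into a visible underscore instead of a directory name that Windows handles badly.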
Feb 28 at 6:36 AM
The thing is that if I get rid of the pyd then it works just fine and _speedups.py actually gets scraped. So should I be fine if I just remove it?
I don't think getting rid of the space will solve the problem because I tried to run the script from the command line without the space and it gives me this:

__import__("shapely.speedups._speedups")
Traceback (most recent call last):
File "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.0\ExtensionScraper.py", line 51, in <module>
PythonScraper.write_analysis(output_path, {"members": {}, "doc": "Could not import compiled module"})
File "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.0\PythonScraper.py", line 964, in write_analysis
out_file = open(out_filename + '.idb', 'wb')
IOError: [Errno 2] No such file or directory: 'C:\Users\User\AppData\Local\Python Tools\CompletionDB\12.0\2f139a99-a95e-4e5d-bf23-161306b96da0\2.7\site-packages_Shapely-1_2_18-py2_7-win32_egg\shapely.speedups._speedups.idb'

Which is not very informative. Could it be something wrong with the actual pyd file?
I'm quite new to python and I don't really even understand why I also have a pyc and a pyd. In the original package there's no .py or .pyc, only the .c and the .pyd. Isn't the pyc like an exe and the pyd like a dll? I think the py is only a wrapper that loads the pyd, therefore there's no need to have the py scraped.
The py for sure doesn't correspond to the pyd.
Do you have any idea of what I should investigate further?
Coordinator
Feb 28 at 3:57 PM
There's certainly something wrong with the .pyd file - that's the "Could not import compiled module" message. Unfortunately, we generally can't be more specific than that. Often this is a 32-bit/64-bit issue (between python.exe and the .pyd) but it can be any variety of corruption that prevents Windows from loading the DLL/pyd. A quick search indicates a few known issues with OSGeo4W and shapely that may be worth looking into. Meanwhile, I'll set it up and see if I can figure out what is going wrong.
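If you want to rule out the 32-bit/64-bit case yourself, a rough sketch like the following reads the machine type from the PE header of each file. The helper name `pe_machine` is mine, and it only recognises the two most common machine values; treat it as a quick diagnostic rather than a complete PE parser.

```python
import struct

# Only the two common values are named; anything else is reported as hex.
MACHINE_TYPES = {0x014C: "32-bit (x86)", 0x8664: "64-bit (x64)"}

def pe_machine(path):
    """Return the architecture recorded in a PE file's COFF header."""
    with open(path, "rb") as f:
        header = f.read(64)
        if len(header) < 64 or header[:2] != b"MZ":
            raise ValueError("not a PE file: %r" % path)
        # Offset 0x3C of the DOS header stores where the PE signature starts.
        (pe_offset,) = struct.unpack_from("<I", header, 0x3C)
        f.seek(pe_offset)
        if f.read(4) != b"PE\0\0":
            raise ValueError("missing PE signature: %r" % path)
        # The 2-byte machine field follows the signature directly.
        (machine,) = struct.unpack("<H", f.read(2))
    return MACHINE_TYPES.get(machine, hex(machine))
```

Calling `pe_machine` on C:\OSGeo4W\bin\python.exe and on _speedups.pyd should give the same answer. If it does not, the .pyd cannot be loaded by that interpreter.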

The "No such file or directory" error means that this directory does not exist: C:\Users\User\AppData\Local\Python Tools\CompletionDB\12.0\2f139a99-a95e-4e5d-bf23-161306b96da0\2.7\site-packages_Shapely-1_2_18-py2_7-win32_egg. We normally create this before launching ExtensionScraper.py - there's no code in that script to create it. I can't really suggest anything other than that the directory does not exist. While Windows doesn't support spaces at the end of directory names, there are ways to force it, but Explorer will still hide them. Deleting and recreating the directory may be necessary.
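Since Explorer hides trailing spaces, one way to look for such a directory is to list the parent folder from Python and print the names with `repr`. This is only a sketch, the function name is made up, and I am assuming that `os.listdir` reports the names exactly as they are stored on disk.

```python
import os

def names_with_trailing_whitespace(parent):
    """List entries in `parent` whose names end in whitespace."""
    return [name for name in os.listdir(parent) if name != name.rstrip()]
```

Running it against the ...\CompletionDB\12.0\2f139a99-a95e-4e5d-bf23-161306b96da0\2.7 folder and printing each result with `repr` would make a stray space visible inside the quotes.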

I had one idea about where the space may be coming from. In your C:\OSGeo4W\apps\Python27\Lib\site-packages folder there will be a range of *.pth files. One of these will have a line that specifies the path to Shapely. If this is a full path and there is a space at the end, then that is where we are getting the space from. Removing the space at the end of this line should resolve the issue. (Typically these files use relative paths.) Please let me know if this is the case, though I'll fix the issue anyway.
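If there are a lot of .pth files to go through, a short script can flag the suspicious lines. This is a sketch only: the function name is hypothetical, and it reports every non-empty line that ends in whitespace, whether or not that line is a path.

```python
import glob
import os

def pth_lines_with_trailing_whitespace(site_packages):
    """Yield (file, line number, line) for .pth lines that end in whitespace."""
    for pth in sorted(glob.glob(os.path.join(site_packages, "*.pth"))):
        with open(pth) as f:
            for number, line in enumerate(f, 1):
                entry = line.rstrip("\r\n")  # drop only the line ending
                if entry and entry != entry.rstrip():
                    yield pth, number, entry

if __name__ == "__main__":
    folder = r"C:\OSGeo4W\apps\Python27\Lib\site-packages"
    for pth, number, entry in pth_lines_with_trailing_whitespace(folder):
        # repr() makes the trailing space visible inside the quotes.
        print("%s:%d: %r" % (pth, number, entry))
```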

As for the files, .py is the Python source file. In this case, it only exists to handle the case where the .pyd is missing. A .pyd is a DLL written in C that can be imported in Python. The .c file is probably the source code to the .pyd - Python libraries are often distributed as source code so that users can compile them optimized for their own systems. (On Windows, where there is unlikely to be a compiler available, people will ship them precompiled. On the other hand, libraries are more reliable on Windows, so this can actually work.)

A .pyc file is a temporary file containing the compiled .py file. Compiling is fast, so Python does it when necessary rather than making the developer do it up front, but it's faster to load an already compiled version. If the .pyc exists and is newer than the .py, it will be loaded. Otherwise, the .py is recompiled and the .pyc is updated. (A .pyo file is similar, but gets created when you run python.exe -O or -OO. This will optimize it slightly, but most people don't bother with it.)
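You can see the .py to .pyc step in isolation with the standard `py_compile` module. The file name example.py below is just a placeholder, and the output path is passed explicitly because newer Python versions otherwise put the compiled file in a __pycache__ subfolder instead of next to the source.

```python
import os
import py_compile
import tempfile

# Write a tiny throwaway source file.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "example.py")
with open(source, "w") as f:
    f.write("VALUE = 42\n")

# Compile it by hand; normally Python does this on first import.
compiled = source + "c"  # example.pyc next to example.py
py_compile.compile(source, cfile=compiled, doraise=True)

print(os.path.exists(compiled))
```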
Mar 1 at 8:58 PM
Yes, it was the shapely.pth file. There was an extra space in there. Now everything seems to be fine. Thank you!