PTVS beta stuck at Refreshing DB

Jul 19, 2013 at 8:20 PM
Hi

I installed the beta version of PTVS yesterday, and the database generation seems to be stuck. I have two interpreters (Python 3.3, which is the default, and Python 2.6). The Python 2.6 completion DB was generated quickly. However, the Python 3.3 generation has been running for close to 7 hours, and its progress bar still shows the DB as refreshing.

The Microsoft.PythonTools.Analyzer.exe *32 process currently takes up 600 MB (and rising) and 25% of the CPU. The progress meter in the Python Interpreters window has been stuck at about 80% for a while now. I have already terminated the process once and restarted the database generation.

The Python 3.3 interactive window does show several of the options under IntelliSense. However, the main code window does not show the same options, even for objects of the same type.

Could you let me know what I should do next?

Thanks
Coordinator
Jul 19, 2013 at 11:07 PM
Are you sure the memory usage is rising? If so, it's still going - if the memory usage is steady then we have a serious issue. That said, it should not take that long in any case, so you may have a package installed that is giving us trouble. Have you got anything installed in your site-packages folder for Python 3.3?

The Interactive Window uses a live Python instance, which is why it can find things that aren't in the database yet. To avoid executing (potentially malicious) code while generating the database, we don't run Python, but it does complicate things a little.
Jul 20, 2013 at 5:31 AM
Hi

Thank you for the quick reply. Here is the status update:

1) The memory usage has been varying, but it hasn't been monotonic - for the last few hours, it has been moving up and down near 600 MB. After leaving it running all day, I got a message telling me the DB was corrupted and had to be rebuilt (the peak memory usage had crossed 1 GB before the error - it had been steadily rising). Since I let it rebuild, the memory usage has jumped around between 70 MB and 600 MB. For the last hour the CPU usage has been steady at around 25%. Once again, the progress bar has settled near 80%.

2) In the site-packages folder I have several subfolders: pycache, Cython (and cython_gsl), mysql, numpy, pyximport, and scipy.

I'm not sure if this is relevant, but the PythonTools analyzer is currently running at below_normal priority. Also, I am running VS2012, and I am using a 64-bit computer (and using PTVS 2.0 beta for VS2012). If you would like any further information, please let me know. Since Cython_gsl is not installed in Python 2.6 (which successfully had a DB built), I'll remove it and try again. I'll update the results in the morning.

Thanks again, and keep up the good work.
Coordinator
Jul 20, 2013 at 2:44 PM
Cython_gsl is worth removing - it's the only one I haven't tested recently. If that still doesn't fix it, can I get you to run a test with extra logging? (Requires setting a registry key and uploading a ~100MB text file somewhere.)
Jul 20, 2013 at 4:00 PM
Hi

I have tried removing Cython_gsl, and there has been no change (I removed the GSL file, uninstalled PyTools, reinstalled it, and tried again). The progress is stuck again at about 80%, and after 9 hours the memory usage is about 1.5GB (processor utilization: 25%).

Also, when I said I removed Cython_gsl, I couldn't find the uninstaller, so I deleted the directory from my site packages folder.

Please let me know how to enable the logging. Also, I'm not sure where the database is created (in case I need to delete it, before running the logging test).

I also have the following extensions, apart from PTVS: VsVim and VS Tools for Git. I had disabled them as well, but there was no change (so I re-enabled them).

Thank you
Jul 20, 2013 at 7:46 PM
I am also experiencing a hang with the database refresh and PTVS 2.0 beta. I am using 64 bit 2.7 Python under Windows 7, with VS 2012.

My intellisense is crippled as a result and only sometimes works. Please advise if you have encountered this or know what to do.
Coordinator
Jul 21, 2013 at 12:16 AM
Edited Jul 21, 2013 at 12:19 AM
Deleting the directory from site-packages is normally enough, and certainly enough to prove that Cython_gsl is not the problem here.

The database is created in C:\Users\<yourname>\AppData\Local\Python Tools\<VS version>\<guid>\<Python version>. If you find the right Python version folder you can just delete that one, or you can just delete the whole thing (and regenerate the other databases). There should be no need to delete it, however, and I'd prefer if you didn't because it might fix the problem (which will mean I can't figure out how to fix the other problem... you can delete it later :-) )
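If you just want to see which database folders exist without deleting anything, here is a minimal Python sketch. It assumes the three-level layout described above (VS version, then GUID, then Python version); the function name find_database_dirs is made up for illustration and is not part of PTVS.

```python
import os
from pathlib import Path


def find_database_dirs(root):
    """Return folders three levels below root: <VS version>/<guid>/<Python version>."""
    root = Path(root)
    if not root.is_dir():
        return []
    # One glob component per level of the layout described in the post.
    return sorted(p for p in root.glob("*/*/*") if p.is_dir())


if __name__ == "__main__":
    # On Windows, LOCALAPPDATA normally points at the AppData\Local folder
    # of the current user; on other systems it is usually unset.
    base = os.environ.get("LOCALAPPDATA")
    if base:
        for folder in find_database_dirs(Path(base) / "Python Tools"):
            print(folder)
```

This only reads the directory tree, so it is safe to run while waiting for further instructions.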

To enable the extra logging, first make sure the analyser process is not running. Then you'll have to find or create the following registry key:
HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\12.0\PythonTools\Analysis\StandardLibrary
(You haven't mentioned which version of VS you have, so 12.0 may need to be 10.0 (for VS 2010) or 11.0 (for VS 2012).)

Inside this key, create a string value called LogPath and set it to a path to a file anywhere your user account can access.
(For example, C:\Users\<yourname>\Desktop\Log.csv.)
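If you would rather script this than use regedit, the following is a rough sketch using Python's standard winreg module (Windows only). The key path and the LogPath value name come from the instructions above; the function names are hypothetical, and removing only the LogPath value afterwards (instead of the whole key) is my assumption about what is sufficient.

```python
def log_key_path(vs_version):
    """Build the subkey under HKEY_CURRENT_USER for e.g. vs_version='11.0'."""
    return (
        "Software\\Microsoft\\VisualStudio\\" + vs_version
        + "\\PythonTools\\Analysis\\StandardLibrary"
    )


def set_log_path(log_file, vs_version):
    """Create the key if needed and store log_file in a string value LogPath."""
    import winreg  # standard library, only available on Windows
    with winreg.CreateKey(winreg.HKEY_CURRENT_USER, log_key_path(vs_version)) as key:
        winreg.SetValueEx(key, "LogPath", 0, winreg.REG_SZ, log_file)


def clear_log_path(vs_version):
    """Remove the LogPath value again (assumption: this is enough to stop logging)."""
    import winreg
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, log_key_path(vs_version),
                        0, winreg.KEY_SET_VALUE) as key:
        winreg.DeleteValue(key, "LogPath")


if __name__ == "__main__":
    # Print the key for VS 2012 so it can be checked against regedit.
    print(log_key_path("11.0"))
```

Call set_log_path with the full path of the log file before starting the analysis, and clear_log_path with the same VS version once the log has been collected.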

From VS, start analysing again, and leave it until it gets stuck. The log file could get very large, especially if you leave it stuck for a long time, so try not to let it sit for hours - those packages should take less than 10 minutes to analyse completely. You'll have to kill the process.

Once you've done all this, you're best to compress the file as much as possible and upload it somewhere (preferably somewhere we don't have to create an account to access) - you can send a public-access link directly to ptvshelp@microsoft.com and we'll let you know once we've copied it.

Thanks for doing this - it's a big help towards making the analyser more stable.

Edit: Also, remember to delete that registry key once you're done. Each analysis will overwrite the log file (rather than appending to it...), but logging still adds overhead and wastes disk space.
Coordinator
Jul 21, 2013 at 12:21 AM
bgale - best to start with telling us what version of Python you have, giving a list of what packages you have installed and where you got them from. We test with a lot of popular ones, but not all of them, so it's best if we find out which ones we need to add.

If you want to narrow things down yourself, you can try removing packages until analysis completes.
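Removing packages one at a time works, but halving the set each round needs fewer refreshes. Here is a small Python sketch of that idea. It assumes exactly one package is responsible and that packages do not interact, and analysis_completes is a hypothetical callback you would have to supply yourself (for example by moving folders out of site-packages, refreshing the DB, and reporting the result).

```python
def find_culprit(packages, analysis_completes):
    """Bisect packages, assuming exactly one of them makes the analysis hang.

    analysis_completes(subset) is a hypothetical callback: it should return
    True if a database refresh finishes with only `subset` installed.
    """
    candidates = list(packages)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if analysis_completes(half):
            # The first half is clean, so the culprit is in the rest.
            candidates = candidates[len(half):]
        else:
            candidates = half
    return candidates[0] if candidates else None


if __name__ == "__main__":
    installed = ["Cython", "mysql", "numpy", "pyximport", "scipy"]
    # Stand-in for a real refresh: pretend that scipy is the package that hangs.
    print(find_culprit(installed, lambda subset: "scipy" not in subset))
```

If no package is actually at fault, the function still returns the last remaining candidate, so confirm the result with one final refresh.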
Jul 21, 2013 at 3:28 AM
Hi

Thank you for the information. I am using VS2012, and I sent you the log file links. The subject of the email is "PTVS log files", and comes from "my-discussion-name"@live.com. Please let me know if I can be of further help.

Thanks a lot
Jul 21, 2013 at 5:05 AM
I just noticed that I have a mysql and pyximport subfolder in Python 3.3 as well. I'm not sure if either of these is the cause - I am adding a package at a time and refreshing the database.
Jul 21, 2013 at 4:55 PM
A follow up: Just running the DB refresh with Numpy (my first test case) causes the DB to hang at the 80% mark.
Coordinator
Jul 22, 2013 at 3:53 PM
Sounds like numpy is the issue then. I haven't received the log files yet, so I guess something is going wrong with our mailing list (again...).

Do you know exactly what version of numpy you have installed? And where did you get it from?
Coordinator
Jul 22, 2013 at 5:15 PM
Okay, we've found your logs and I've had a look. Thanks for the detailed notes.

The memory drops you noticed are normal. We analyse each package individually, so when we finish one we can reclaim the memory. It does mean we'll miss cross-package dependencies, but if we don't do this then we run out of memory in seconds.

Now, the last log you included is interesting, because it looks like we've just finished pyximport but haven't started the next one. Analysis has only been running for 24 minutes at that point, so it must have been stuck between packages. It looks like numpy was analysed correctly, so now I'm more interested in your version of scipy (or whichever package is alphabetically after pyximport), since that does not appear anywhere in the log.

Same questions as I just asked for numpy: what version do you have and where did you get it from? (The aim is to reproduce the behavior internally with a debug build. Unfortunately, there's no good reason for the analyzer to be stuck where it is...)
Jul 22, 2013 at 5:53 PM
I ran the DB refresh again - whenever the scipy package was included, the DB refresh hung (it worked with just numpy).
I am using scipy 0.12.0 for python 3.3 (32 bit version, for compatibility with the MinGW compiler). If I recall correctly, I got it from the unofficial binaries website: http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy
Coordinator
Jul 22, 2013 at 6:03 PM
I wonder if this has anything to do with the native module issue (https://pytools.codeplex.com/workitem/731). The problem there was that we mistakenly treated .pyd files as Python source in the analyzer and tried to parse and analyze them. This results in very large AST with lots of garbage and error nodes in it, and significantly increases time to analyze for PyQt. SciPy should also have plenty of native modules, so it may be running into the same issue.
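To get a rough idea of how exposed a given package is to this, one can list its .pyd files by size. The sketch below is only an illustration; the function name is made up, and file size is just a guess at how long a mistaken parse would take.

```python
from pathlib import Path


def native_modules_by_size(package_dir):
    """Return (size_in_bytes, path) pairs for every .pyd under package_dir, largest first."""
    files = [p for p in Path(package_dir).rglob("*.pyd") if p.is_file()]
    pairs = [(p.stat().st_size, p) for p in files]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    import sys
    # Usage: python list_pyd.py C:\Python33\Lib\site-packages\scipy
    if len(sys.argv) > 1:
        for size, path in native_modules_by_size(sys.argv[1]):
            print(size, path)
```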
Coordinator
Jul 22, 2013 at 6:34 PM
Yes, that's exactly the issue. SciPy in particular has some very large native modules, which take a long time to parse. I'll leave mine running so I can see how long it will end up taking.

The best workaround here is to temporarily remove the .pyd files from scipy's directory, run the analysis, then put them back. If you replace them with empty .py files with the same name (apart from the extension) while you run analysis, then we probably won't prompt you to re-run analysis later. You won't get any completions for those files, but it looks like we don't find anything in them anyway.
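The workaround above could be scripted roughly as follows. This is a sketch under my own assumptions, not a supported tool: the function names are invented, stash_dir must be a folder outside the package, and an existing .py file with the same name is left alone instead of being replaced by a stub.

```python
import shutil
from pathlib import Path


def stash_native_modules(package_dir, stash_dir):
    """Move every .pyd under package_dir into stash_dir and leave an empty .py stub.

    Returns a list of (original_path, stashed_path, stub_path_or_None) records
    that restore_native_modules() can undo.
    """
    package_dir, stash_dir = Path(package_dir), Path(stash_dir)
    records = []
    # sorted() collects all matches before any file is moved.
    for pyd in sorted(package_dir.rglob("*.pyd")):
        stashed = stash_dir / pyd.relative_to(package_dir)
        stashed.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(pyd), str(stashed))
        stub = pyd.with_suffix(".py")
        if stub.exists():
            stub = None  # never clobber a real source file of the same name
        else:
            stub.touch()
        records.append((pyd, stashed, stub))
    return records


def restore_native_modules(records):
    """Delete the stubs and move the .pyd files back to where they were."""
    for pyd, stashed, stub in records:
        if stub is not None:
            stub.unlink()
        shutil.move(str(stashed), str(pyd))
```

Run stash_native_modules before starting the analysis and keep the returned records (or the Python session) around, then call restore_native_modules once the refresh has finished.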

This issue has been fixed in our next version. When I ran a test with the latest code the whole thing completed in only a few minutes (that's all of Python 3.3, plus most of the libraries mentioned here, in a debug build with extra logging - we've made some speed improvements :-) ). If you want to get hold of these changes sooner, you can build from source.
Jul 22, 2013 at 6:40 PM
Hi

Thank you for all your effort. I guess I'll try the fix of removing the pyd files, and wait for the announcement of the new release.

Once again, thanks for the great tool and support.
Aug 10, 2013 at 6:26 AM
Hello,

I had a similar problem. The "Refresh DB" button stalled at about 80%. This is with Visual Studio 2012 Update 3, PTVS 2.0 beta, and Python 3.3 x64 on Windows 7.

Numpy was the issue.

Workaround:
  1. Close Visual Studio.
  2. Ensure that the Analyzer process isn't running (Ctrl+Shift+Esc)
  3. Temporarily rename the pyd files in the numpy site-packages directory. Keep this command prompt open.
       cd C:\Python33\Lib\site-packages\numpy
       for /R %f in (*.pyd) do @rename %~f %~nf.pyd.bak
  4. Open Visual Studio. Start the DB Refresh.
  5. Once complete, rename the pyd files back to their original name.
       for /R %f in (*.pyd.bak) do @rename %~f %~nf
Hope this is helpful.
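Step 2 of the workaround above can also be checked from a script instead of Task Manager. The sketch below shells out to the Windows tasklist command and looks for the analyzer's image name; it assumes the CSV output lists the full, untruncated name in the first column, and the helper names are made up.

```python
import csv
import io
import subprocess

ANALYZER = "Microsoft.PythonTools.Analyzer.exe"


def image_names(tasklist_csv):
    """Extract the image-name column from `tasklist /FO CSV /NH` output."""
    return [row[0] for row in csv.reader(io.StringIO(tasklist_csv)) if row]


def analyzer_running():
    """Ask Windows whether the analyzer process is still alive (Windows only)."""
    output = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return any(name.lower() == ANALYZER.lower() for name in image_names(output))
```

Calling analyzer_running() before renaming the files gives a quick yes or no; it does not stop the process for you.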
Coordinator
Aug 12, 2013 at 11:30 PM
Just a quick note that this bug has been fixed internally & will be available in the RC in about 3 weeks. You can also build from sources to pick it up.
Aug 18, 2013 at 1:00 PM
Edited Aug 18, 2013 at 1:01 PM
Hi Ptools,
Any chance you could upload unofficial binaries with this bug fixed and save us the trouble of compiling PTVS?
Thanks!
Hanan.
Coordinator
Aug 25, 2013 at 10:02 PM
Edited Aug 26, 2013 at 4:25 AM
thanans - we're literally going into escrow in the next couple of days. RC bits will be available a week or so after that.

The good news is that we've been working hard with our legal department to convince them to let us make weekly builds available. So sometime post RC, we'll start pushing sources and bits up to CodePlex on a regular basis - no more waiting for beta/RC/RTM bits if you need a fix now, early access, etc.
Sep 11, 2013 at 9:48 PM
Just verified this problem has been fixed in RC1. Thanks for this great tool, PTVS team!
Sep 20, 2013 at 11:52 AM
Same problem with scons for me
Sep 21, 2013 at 2:06 AM
I don't think this problem has been fixed in RC. I installed RC today and think I am running into this issue. Microsoft.PythonTools.Analyzer.exe has been running for about an hour now. It is consuming 50% of my CPU. I have even closed VS. Mem usage is 432MB currently.
Coordinator
Sep 21, 2013 at 1:45 PM
That sounds right, depending on where you got Python from. If you have a lot of packages it will take longer, and we've seen it take a couple of hours for some big distributions (I think that was pythonxy - it's not an issue with the distro, just that it comes with a lot of libraries).

Look in VS at the Python Environments window (View, Other Windows, Python Environments) and see if the progress bar is moving. If so, it's working fine. If not, mouse over the progress bar and tell us what the tooltip says.
Jan 28 at 9:34 PM
Unfortunately this issue has come up for me also. It's not a big problem, as the tools are working, but there isn't much point using Visual Studio if code completion doesn't work; I may as well use a more lightweight text editor.
Coordinator
Jan 28 at 11:22 PM
I assume this is occurring in our final 2.0 release for you and not in beta? Are you able to tell us what packages you have installed (pip list output would be perfect) - it's possible that you have one we haven't tested or one has changed.
Wed at 2:37 PM
I just ran into this issue again with 2.1 RC. The package it is stuck at scanning is kivy (see http://kivy.org). I am using an Anaconda environment with an install of kivy (https://github.com/kivy/kivy/wiki/Using-Kivy-with-an-existing-Python-installation-on-Windows-%2864-or-32-bit%29).

The scan always gets stuck when I restart it. There has been no increase in memory or CPU activity (for hours now).
Coordinator
Wed at 7:00 PM
What versions of kivy and Python are you using? These sorts of issues are generally due to the code in the package, so using another version may be more successful, but without knowing exactly which version you have there's no way I can investigate.
Thu at 3:53 PM
Dear Zooba,

thanks for your quick response! I am using Python 2.7.5 64-bit with kivy 1.8.0 64-bit (in my case installed via the precompiled binary by Christoph Gohlke http://www.lfd.uci.edu/~gohlke/pythonlibs/#kivy).

I am sure you considered it, but is there any chance of adding a 'skip package' option or something alike to work around the scan for packages like that (or a timeout for the scanner)?

Best regards!
Coordinator
Fri at 10:35 PM
I'm still not able to reproduce this, though I'm not using Python 2.7.5 any more (2.7.8 has been released, so if you can upgrade to that version, it might be worth it).

It's possible that there are interactions between packages that I don't have installed - can you send the output of pip list or a list of all the files and folders in your C:\Python27\Lib\site-packages directory?
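If pip is not available in that environment, a short script can produce the directory listing instead. This is a minimal sketch: it assumes the packages live in the interpreter's default "purelib" location, so it should be run with the same Python that PTVS is scanning, and the function name is made up.

```python
import os
import sysconfig


def list_site_packages(directory):
    """Return the sorted names of files and folders directly inside directory."""
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    # "purelib" is where the running interpreter installs pure-Python packages.
    site_packages = sysconfig.get_paths()["purelib"]
    if os.path.isdir(site_packages):
        for name in list_site_packages(site_packages):
            print(name)
```

Redirecting the output to a text file makes it easy to paste into a reply.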

As for the last two suggestions, we don't actually have those options, though there is a way we could expose it (probably not for 2.1 at this stage). It's never going to be an obvious option though, as we like hearing about problems like this. There are hundreds of thousands of packages out there, and we can't test them all, so we rely on this feedback to help us find our bugs :)