
Closed

CPython completion database failing to build

description

My completion database fails to build. It starts indexing, processes a couple of directories, and deals with the standard library successfully, but when it comes to site-packages it hangs for a few minutes and then terminates indexing without any noticeable error message, while the overall progress bar is still at only about 10%. Even the packages that were supposedly indexed and saved still don't have autocompletion, and 'completion DB needs refresh' is as visible as ever.

Win7 64-bit, MSVS 2013 Pro, latest Python Tools, CPython 2.7 64-bit.

The IronPython completion DB builds without trouble. I suppose something in site-packages trips up the indexing, but it'd be nice if the indexer couldn't be fooled by a single nonconforming package and would simply skip that one instead of terminating...

Any way I could access an error log?
Closed Jan 8 at 11:22 PM by Zooba

comments

EHoogendoorn wrote Dec 20, 2013 at 7:06 PM

For the record: repairing both my MSVS 2013 and PyTools installs does not change anything (they were completely fresh installs anyway). Perhaps noteworthy: after parsing and saving the standard library, it happily goes back to parsing the standard library again. Is that expected, or might it be related to what is going on?

Zooba wrote Dec 20, 2013 at 8:43 PM

Your initial analysis is probably spot on. If you open Tools > Python Tools > Diagnostic Info, you'll get the full logs. You can either look for error messages yourself or email it to ptvshelp@microsoft.com. A pip freeze or a listing of your site-packages directory would also help, though there's probably enough info in the log.
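
For example, something along these lines from a command prompt would capture both (the paths and filename are just illustrative; adjust for your install):

    pip freeze > packages.txt
    dir /b C:\Python27\Lib\site-packages >> packages.txt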

Chances are this is an out-of-memory error, which typically means we've encountered some Python code that sends our analyzer into a loop. If so, the only fix will be to remove the package; there's no way for you to diagnose the actual cause, but if you can identify the package and version, I can try to get it fixed or come up with a change on our side that handles it.
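
For illustration only, a hypothetical pattern like this is the kind of thing that can do it: each recursive call wraps the argument in another list, so a naive type inferencer ends up tracking an unbounded tower of list types rather than converging.

    def wrap(x):
        # hypothetical example: inference sees T, list of T, list of list of T, ...
        return wrap([x])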

We don't have any way to skip a package right now, but it should be easy to add, so I'll open an issue for that. In the long term this step will be significantly improved, but right now we're focused on other tasks. We're also largely on holiday at this point, so responses may be slower than usual.

EHoogendoorn wrote Dec 21, 2013 at 12:30 PM

Thanks; you are indeed correct about the OutOfMemoryException. The log does not appear to give me any handle on what to do with it, though: I get the traceback of where things went wrong in the analyzer code, but no apparent indication of the arguments with which this code failed.

...lots of stuff here....
2013-12-19T23:44:29: Analyzing "_pytest.assertion.rewrite"
2013-12-19T23:44:29: Analyzing "_pytest.assertion.util"
2013-12-19T23:44:29: Analyzing "_pytest.assertion"
2013-12-19T23:44:29: Starting analysis of 14894 modules
2013-12-19T23:44:45: [ERROR] Analysis failed
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.Collections.Generic.HashSet`1.SetCapacity(Int32 newSize, Boolean forceNewHashCodes)
   at System.Collections.Generic.HashSet`1.IncreaseCapacity()
   at System.Collections.Generic.HashSet`1.AddIfNotPresent(T value)
   at System.Collections.Generic.HashSet`1.Add(T item)
   at Microsoft.PythonTools.Analysis.HashSetExtensions.AddValue[T](ISet`1& references, T value)
   at Microsoft.PythonTools.Analysis.Values.FunctionInfo.AddReference(Node node, AnalysisUnit unit)
   at Microsoft.PythonTools.Analysis.Analyzer.ExpressionEvaluator.EvaluateName(ExpressionEvaluator ee, Node node)
   at Microsoft.PythonTools.Analysis.Analyzer.ExpressionEvaluator.EvaluateWorker(Node node)
   at Microsoft.PythonTools.Analysis.Analyzer.ExpressionEvaluator.EvaluateCall(ExpressionEvaluator ee, Node node)
   at Microsoft.PythonTools.Analysis.Analyzer.ExpressionEvaluator.EvaluateWorker(Node node)
   at Microsoft.PythonTools.Analysis.Analyzer.DDG.Walk(AssignmentStatement node)
   at Microsoft.PythonTools.Parsing.Ast.AssignmentStatement.Walk(PythonWalker walker)
   at Microsoft.PythonTools.Analysis.Analyzer.DDG.Walk(SuiteStatement node)
   at Microsoft.PythonTools.Parsing.Ast.SuiteStatement.Walk(PythonWalker walker)
   at Microsoft.PythonTools.Parsing.Ast.PythonAst.Walk(PythonWalker walker)
   at Microsoft.PythonTools.Analysis.AnalysisUnit.AnalyzeWorker(DDG ddg, CancellationToken cancel)
   at Microsoft.PythonTools.Analysis.Analyzer.DDG.Analyze(Deque`1 queue, CancellationToken cancel, Action`1 reportQueueSize, Int32 reportQueueInterval)
   at Microsoft.PythonTools.Analysis.PythonAnalyzer.AnalyzeQueuedEntries(CancellationToken cancel)
   at Microsoft.PythonTools.Analysis.PyLibAnalyzer.Analyze()
   at Microsoft.PythonTools.Analysis.PyLibAnalyzer.Main(String[] args)

EHoogendoorn wrote Dec 21, 2013 at 1:07 PM

2013-12-20T11:46:35: [ERROR] Error parsing "twisted.conch.ssh.userauth" "c:\python27\lib\site-packages\twisted\conch\ssh\userauth.py"
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.

There were lots of complaints about twisted, so I removed it, but to little avail. I've deleted the cached directories as well, but rerunning now gives me only this as output in AnalysisLog.txt:
FAIL_STDLIB: (-3) "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.0\Microsoft.PythonTools.Analyzer.exe" /id {9a7a9026-48c1-4688-9d5d-e5699d47d074} /version 2.7 /python C:\Anaconda\python.exe /library C:\Anaconda\lib /outdir "C:\Users\Eelco\AppData\Local\Python Tools\CompletionDB\12.0\9a7a9026-48c1-4688-9d5d-e5699d47d074\2.7" /basedb "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.0\CompletionDB" /log "C:\Users\Eelco\AppData\Local\Python Tools\CompletionDB\12.0\9a7a9026-48c1-4688-9d5d-e5699d47d074\2.7\AnalysisLog.txt" /glog "C:\Users\Eelco\AppData\Local\Python Tools\CompletionDB\12.0\AnalysisLog.txt" /wait ""
This interpreter is already being analyzed.
2013-12-21T14:48:42 START_STDLIB "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.0\Microsoft.PythonTools.Analyzer.exe" /id {9a7a9026-48c1-4688-9d5d-e5699d47d074} /version 2.7 /python C:\Anaconda\python.exe /library C:\Anaconda\lib /outdir "C:\Users\Eelco\AppData\Local\Python Tools\CompletionDB\12.0\9a7a9026-48c1-4688-9d5d-e5699d47d074\2.7" /basedb "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.0\CompletionDB" /log "C:\Users\Eelco\AppData\Local\Python Tools\CompletionDB\12.0\9a7a9026-48c1-4688-9d5d-e5699d47d074\2.7\AnalysisLog.txt" /glog "C:\Users\Eelco\AppData\Local\Python Tools\CompletionDB\12.0\AnalysisLog.txt" /wait ""
2013-12-21T14:53:10 FAIL_STDLIB
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.Text.StringBuilder..ctor(String value, Int32 startIndex, Int32 length, Int32 capacity)
   at Microsoft.PythonTools.Intellisense.Unpickle.FileInput.Read(Int32 size)
   at Microsoft.PythonTools.Intellisense.Unpickle.UnpicklerObject.Read(Int32 size)
   at Microsoft.PythonTools.Intellisense.Unpickle.UnpicklerObject.Load()
   at Microsoft.PythonTools.Interpreter.Default.CPythonModule.EnsureLoaded()
   at Microsoft.PythonTools.Interpreter.Default.CPythonModule.GetMember(IModuleContext context, String name)
   at Microsoft.PythonTools.Analysis.Values.BuiltinNamespace`1.GetMember(Node node, AnalysisUnit unit, String name)
   at Microsoft.PythonTools.Analysis.Values.BuiltinModule.GetMember(Node node, AnalysisUnit unit, String name)
   at Microsoft.PythonTools.Analysis.Values.BuiltinModule.GetModuleMember(Node node, AnalysisUnit unit, String name, Boolean addRef, InterpreterScope linkedScope, String linkedName)
   at Microsoft.PythonTools.Analysis.Analyzer.DDG.WalkFromImportWorker(NameExpression node, IModule userMod, String impName, String newName)
   at Microsoft.PythonTools.Analysis.Analyzer.DDG.Walk(FromImportStatement node)
   at Microsoft.PythonTools.Parsing.Ast.FromImportStatement.Walk(PythonWalker walker)
   at Microsoft.PythonTools.Analysis.Analyzer.DDG.Walk(SuiteStatement node)
   at Microsoft.PythonTools.Parsing.Ast.SuiteStatement.Walk(PythonWalker walker)
   at Microsoft.PythonTools.Analysis.Analyzer.DDG.Walk(IfStatement node)
   at Microsoft.PythonTools.Parsing.Ast.IfStatement.Walk(PythonWalker walker)
   at Microsoft.PythonTools.Analysis.Analyzer.DDG.Walk(SuiteStatement node)
   at Microsoft.PythonTools.Parsing.Ast.SuiteStatement.Walk(PythonWalker walker)
   at Microsoft.PythonTools.Parsing.Ast.PythonAst.Walk(PythonWalker walker)
   at Microsoft.PythonTools.Analysis.AnalysisUnit.AnalyzeWorker(DDG ddg, CancellationToken cancel)
   at Microsoft.PythonTools.Analysis.Analyzer.DDG.Analyze(Deque`1 queue, CancellationToken cancel, Action`1 reportQueueSize, Int32 reportQueueInterval)
   at Microsoft.PythonTools.Analysis.PythonAnalyzer.AnalyzeQueuedEntries(CancellationToken cancel)
   at Microsoft.PythonTools.Analysis.PyLibAnalyzer.Analyze()
   at Microsoft.PythonTools.Analysis.PyLibAnalyzer.Main(String[] args)
"C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.0\Microsoft.PythonTools.Analyzer.exe" /id {9a7a9026-48c1-4688-9d5d-e5699d47d074} /version 2.7 /python C:\Anaconda\python.exe /library C:\Anaconda\lib /outdir "C:\Users\Eelco\AppData\Local\Python Tools\CompletionDB\12.0\9a7a9026-48c1-4688-9d5d-e5699d47d074\2.7" /basedb "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.0\CompletionDB" /log "C:\Users\Eelco\AppData\Local\Python Tools\CompletionDB\12.0\9a7a9026-48c1-4688-9d5d-e5699d47d074\2.7\AnalysisLog.txt" /glog "C:\Users\Eelco\AppData\Local\Python Tools\CompletionDB\12.0\AnalysisLog.txt" /wait ""
Hmm... maybe I should try a clean Python install?

Zooba wrote Dec 21, 2013 at 6:58 PM

It's not easy to diagnose these. The arguments in the stack trace are pretty much irrelevant, since there's a loop near the top and the OOM can happen at any time.

The strange thing here is that I don't think pytest has that many modules, which probably means a path in site-packages has been messed up. Seeing the full list of modules leading up to the error will help.

To keep diagnosing this, have a look at any *.pth files in site-packages and check whether they reference a directory containing those modules. There might be stray __init__.py files under site-packages that make us treat a directory as a package when we should be ignoring it. If this is a custom environment, make sure the library path is "...\Lib" and not "...\site-packages".
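
If it helps, a quick script along these lines will dump every .pth and flag the obvious problem (the site-packages path is an example; adjust it for your install):

    import os

    site = r"C:\Python27\Lib\site-packages"  # example path; adjust for your install

    # Print the contents of every .pth file so stray or self-referencing
    # entries are easy to spot.
    for name in sorted(os.listdir(site)):
        if name.endswith(".pth"):
            print name
            with open(os.path.join(site, name)) as f:
                for line in f:
                    print "    " + line.rstrip()

    # An __init__.py directly under site-packages would make the whole
    # directory look like a single giant package.
    if os.path.exists(os.path.join(site, "__init__.py")):
        print "warning: site-packages itself contains an __init__.py"
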
Otherwise, you can post the list of modules leading up to the crash and I'll give some more specific directions.

EHoogendoorn wrote Dec 21, 2013 at 8:28 PM

Yeah, I had worried about the .pth files; I have two linked Python installs. But I figured that wasn't the problem, since the 'root' Python install, which does not link to any other, failed to index just the same.

However, I noticed that the root Python has a setuptools.pth containing a reference to the very directory it is in. Given my understanding of .pth files this seems pointless, and, I can easily imagine, fatal for code that does not guard against it. Indexing of the dependent Python install still fails, but after removing this .pth the root Python now indexes just fine (ahhh... np.array works as expected; the litmus test for the many wannabe code completion tools in Python).
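
To illustrate (reconstructed from memory, not verbatim), the file amounted to a .pth that adds the very directory it already lives in:

    # C:\Python27\Lib\site-packages\setuptools.pth contained, in effect:
    C:\Python27\Lib\site-packages

so anything that expands .pth entries without remembering which directories it has already visited can chase that entry in circles.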

However, the linking install still does not work. I don't see anything suspicious in the .pth files anymore. There is some black magic going on in matplotlib.pth; maybe that's it?

EHoogendoorn wrote Dec 21, 2013 at 9:15 PM

Unlinking the root Python allows the secondary Python install to index successfully as well.

Subsequently relinking the installs does not give me completion on both, however; it tries to rebuild again, and fails. I suppose the many duplicate packages in the two installs must be tripping it up?

Could I use the virtualenv features to achieve the desired effect, rather than my manual linkage? The reason for this configuration is that both Python distributions contain complementary bundled packages that are near impossible to install manually.

Zooba wrote Dec 21, 2013 at 10:48 PM

Ah yes, I think I fixed the self-referencing .pth file issue a while back (after 2.0). Virtual environments may help you, but we assume they are project-specific, which makes them slightly awkward to share between various projects, if that's what you mean to do.
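
If you do go that route, a virtualenv created with access to a base install's packages might be close to what you want (this assumes the virtualenv package is installed in that base interpreter; the paths are illustrative):

    C:\Python27\python.exe -m virtualenv --system-site-packages C:\envs\shared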

Magic in .pth files is ignored, so some completions may be missed, but it shouldn't break us. What you may need to do is add a .pth file that links to each package independently, rather than getting all of them with one path; a sketch follows below. It could be a pain to maintain, but we don't really have an alternative workaround. Duplicate packages shouldn't be a problem; we filter quite early based on the import name.
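
For example, instead of one line pulling in the other install's whole site-packages, the .pth would carry one line per package directory, for just the packages you care about (the file name and paths here are illustrative):

    # shared-packages.pth
    C:\Anaconda\Lib\site-packages\numpy
    C:\Anaconda\Lib\site-packages\matplotlib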

EHoogendoorn wrote Dec 22, 2013 at 8:10 AM

It surprises me that you say duplicate packages should be no problem. To recap: both installs now index fine when not linked to each other. Merely linking the one working install to the other breaks things. What else could be the problem?

Indeed, I've thought about only linking in the packages of interest. I may give that a try; I'd only have to do it once.

Zooba wrote Dec 22, 2013 at 8:34 PM

They won't be a problem, and the filtering we do probably had a 50/50 chance of preventing your issue.

I'm pretty sure now that your problem is the link going to the site-packages folder rather than to each package. When we grab a path out of a .pth file, we assume all the files within it are related and analyze them as a group. (If directories in site-packages are importable, we assume each one is independent.) This helps keep the analysis relevant without letting it take too long.

When you link to the one folder, we start analyzing all the libraries in there together, where we'd normally do them separately. With fewer files this might be okay, but it seems you've hit the limit. Unfortunately it's not a predictable limit, which is why we can't warn about it or prevent it, though we could probably come up with better reporting. I'm not sure that would have helped you here, though, since we haven't seen your setup before.

If enough modules were duplicated and we saw the "local" ones first, we should have skipped them in the linked library. But that's order-dependent and there are few guarantees here, which is why it didn't help. (Or maybe it did, and you just don't have enough duplicates.)