[MRG] Poincare model keyedvectors#1700
Conversation
…nce and vector_distance_batch
|
Nice catch for the integer division, fixed. |
|
Short summary and rationale for the changes to
The PR also adds missing tests for some of the older |
004b572 to
73ed696
Compare
menshikh-iv
left a comment
There was a problem hiding this comment.
In general - very nice work 💣 🔥
Only several questions (and I'll fix docstrings / PEP8 after your commits).
| @@ -0,0 +1,187 @@ | |||
| #!/usr/bin/env python | |||
There was a problem hiding this comment.
I think better write this code in ipython notebook (or maybe move to docs/notebooks and import from "under the feet").
gensim.models isn't a suitable place for this file.
There was a problem hiding this comment.
I agree gensim.models isn't a good place for it. I don't want to have it in an ipython notebook or docs/notebooks though since that would mean a user can't import and use it, and I think it is definitely useful for a user. Do you think creating a new package poincare in gensim/ would be a good idea? Other models do this too (e.g. topic_coherence)
There was a problem hiding this comment.
I don't think that this is a very good idea, after refactoring, topic_coherence will be moved/renamed in "deep" of gensim.modules (coherence contains only inner/secondary functions, not public API), but in your case, API is public.
I agree about imports (it's really untrivial, how to import from docs/notebooks if you in /randomfolder, only manually with importlib I think.
We have many viz helpers (produced by @parulsethi on GSoC) + now Parul works on very nice viz for topic models. Potentially, we can create the distinct repository (like gensim-data) and move all viz helpers, or, as you suggest, create submodule gensim.viz and move all viz stuff (not only your Poincare viz).
Hard question, I don't know what's better right now.
WDYT @piskvorky @janpom @parulsethi?
There was a problem hiding this comment.
A submodule gensim.viz sounds good to me, keeping in mind we might have future visualizations too. I don't have a good enough perspective on this though, so whatever you decide is okay with me.
There was a problem hiding this comment.
Hi @menshikh-iv , so is it okay if I create a gensim.viz package for this and any future gensim visualizations, and move the poincare visualization there?
There was a problem hiding this comment.
There was a problem hiding this comment.
gensim.viz submodule would be useful for #1616 also in future, and the long code blocks of network graph/dendrogram could also be wrapped up in a function under this module so that those visualizations can be produced simply using the imports.
There was a problem hiding this comment.
Great! Thanks for the feedback @parulsethi @menshikh-iv
Pushed changes with a new gensim.viz package.
| if self.workers > 1: | ||
| raise NotImplementedError("Multi-threaded version not implemented yet") | ||
| # Some divide-by-zero results are handled explicitly | ||
| old_settings = np.seterr(divide='ignore', invalid='ignore') |
There was a problem hiding this comment.
Why? You mean that division by zero is expected and you process this situation in code, I'm correct?
There was a problem hiding this comment.
Yeah, it happens in PoincareBatch. I'm setting it here to avoid repeated calls to np.seterr
|
|
||
| Parameters | ||
| ---------- | ||
| node : str |
There was a problem hiding this comment.
This always str (or int possible too)? This question more global (about all methods) that pass node argument?
There was a problem hiding this comment.
It could be an int too in theory, depending on what the vocab keys are. The most common case is str though. How would you prefer to handle this?
There was a problem hiding this comment.
maybe str or int everywhere?
There was a problem hiding this comment.
Done. Also made some other changes to docstrings for more clarity.
|
Some completely unrelated tests seem to be failing on travis ( |
|
@menshikh-iv we need this resolved & finished -- can you have a look? Cheers. |
This branch includes the changes to
KeyedVectorsandPoincareKeyedVectorsfor thePoincareModel.Includes commits from #1696 , to be reviewed and merged after #1696
TODOs:
KeyedVectormethods likemost_similartoPoincareKeyedVectorsKeyedVectorsintoKeyedVectorsBaseandEuclideanKeyedVectorsfor cleaner class hierarchyPoincareKeyedVectors