Fix phrases load, for backward compatibility#1758
Conversation
|
@menshikh-iv it's ready to validate. |
piskvorky
left a comment
There was a problem hiding this comment.
Thanks for the fix! What is the rationale behind introducing the new class (PhrasesTransformation)?
| """ | ||
|
|
||
| # for python 2 and 3 compatibility. basestring is used to check if model.scoring is a string | ||
| try: |
There was a problem hiding this comment.
Gensim uses the six library to bridge py2/py3.
There was a problem hiding this comment.
ok, note that it was from a previous commit…
| raise ValueError( | ||
| 'failed to load %s model with unknown scoring setting %s' % | ||
| (cls.__name__, model.scoring)) | ||
| # if there is non common_terms attribute, inizialize |
Phrases and Phraser needs the same |
|
@piskvorky fixed. |
menshikh-iv
left a comment
There was a problem hiding this comment.
Thanks for fast fix @alexgarel 🔥, the last fixes remain :)
|
|
||
| with temporary_file("test.pkl") as fpath: | ||
| bigram = Phrases(self.sentences, min_count=1, threshold=1) | ||
| del(bigram.common_terms) |
There was a problem hiding this comment.
better solution - checkout to the previous version, tran+save model and try to load old model here (in your current implementation, we possibly skip other bugs, that can be hidden due to the current bug)
FYI - del isn't a function, this is operator -> no needed brackets.
| if hasattr(model, 'scoring'): | ||
| if isinstance(model.scoring, six.string_types): | ||
| if model.scoring == 'default': | ||
| logger.info( |
There was a problem hiding this comment.
nitpick: we use 140 char limits for code, no need to split this call into several lines
There was a problem hiding this comment.
argh travis failed, it was 120 not 140 @menshikh-iv, you joker ! ;-)
There was a problem hiding this comment.
@alexgarel oh, sorry, my mistake (120 instead of 80, not 140)
| if sys.version_info[0] >= 3: | ||
| unicode = str | ||
|
|
||
| @contextlib.contextmanager |
There was a problem hiding this comment.
This looks useful, please move it to gensim.test.utils https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/test/utils.py
… bug in loading phrases model before scoring). Also moving temporary_file context manager in gensim.test.utils
|
@menshikh-iv this should be ok. Your proposal to test load backward compatibility was right, for I extend it to scoring and found there was a bug in it. |
|
Big thanks for fast fix @alexgarel, I checked with 2.0.0, 1.0.1 - works fine! |
This should fix #1751.
Also it gives backward compatible load for Phraser.