Fix #1196 - Gensim error when loading FastText (#1214)
Travis tests were re-run after the smart_open update.
What is the purpose of adding a new attribute
Thanks @tmylk for the comment. To my understanding, you would use FastText when you want to load both the vec and bin files of a fastText model. If you just want to load the vectors, you can use FastTextKeyedVectors. As you pointed out, you can use By adding it to the FastText class you meant the FastTextKeyedVectors class, right? EDIT: It seems like it should be Also, there is currently no override of Looking forward to your suggestions.
Furthermore, you cannot just use FastTextKeyedVectors without FastText initialization (which needs both vec and bin) as
tmylk
left a comment
Just indent changes requested. Glad it became a comprehensive fix!
Please add a note in the changelog.md as well.
# In vocab, sanity check
self.assertEqual(len(self.test_model.most_similar_cosmul(positive=['the', 'and'], topn=5)), 5)
self.assertEqual(self.test_model.most_similar_cosmul('the'), self.test_model.most_similar_cosmul(positive=['the']))
self.assertEqual(self.test_model.most_similar_cosmul('the'),
# Out of vocab check
self.assertEqual(len(self.test_model.most_similar_cosmul(['night', 'nights'], topn=5)), 5)
self.assertEqual(self.test_model.most_similar_cosmul('nights'), self.test_model.most_similar_cosmul(positive=['nights']))
self.assertEqual(self.test_model.most_similar_cosmul('nights'),
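For readers without the test fixtures at hand, here is a minimal, self-contained sketch of what these assertions exercise. It is not gensim's implementation: `ToyNgramVectors`, `char_ngrams`, and the CRC-based vectors are hypothetical stand-ins, and the 3-to-6 n-gram range is an assumed default. The toy builds every word vector from hashed character n-grams, whereas a trained model stores learned vectors. That is enough to show why an out-of-vocabulary word such as 'nights' can still be queried, and why a bare string and a one-element `positive` list should return the same result.

```python
import math
import zlib


def char_ngrams(word, min_n=3, max_n=6):
    """Character n-grams of '<word>' for every length in min_n..max_n."""
    extended = "<" + word + ">"
    return [extended[i:i + n]
            for n in range(min_n, max_n + 1)
            for i in range(len(extended) - n + 1)]


class ToyNgramVectors:
    """Hypothetical stand-in for a FastText-style model (not gensim's class)."""

    def __init__(self, vocab, dim=8):
        self.vocab = list(vocab)
        self.dim = dim

    def _ngram_vector(self, ngram):
        # Deterministic pseudo-random components in [-1, 1], derived from a CRC.
        return [zlib.crc32(f"{ngram}:{k}".encode("utf-8")) / 0xFFFFFFFF * 2 - 1
                for k in range(self.dim)]

    def word_vec(self, word):
        # Any word, in vocab or not, is the unit-normalised sum of its n-gram vectors.
        total = [0.0] * self.dim
        for ngram in char_ngrams(word):
            for k, value in enumerate(self._ngram_vector(ngram)):
                total[k] += value
        norm = math.sqrt(sum(v * v for v in total)) or 1.0
        return [v / norm for v in total]

    def most_similar_cosmul(self, positive=(), negative=(), topn=10):
        # A bare string is treated as a one-element list.
        if isinstance(positive, str):
            positive = [positive]
        if isinstance(negative, str):
            negative = [negative]
        pos = [self.word_vec(w) for w in positive]
        neg = [self.word_vec(w) for w in negative]
        exclude = set(positive) | set(negative)

        def shifted_cos(u, v):
            # Cosine of two unit vectors, shifted from [-1, 1] into [0, 1].
            return (1 + sum(a * b for a, b in zip(u, v))) / 2

        scored = []
        for word in self.vocab:
            if word in exclude:
                continue  # query words are never returned as their own neighbours
            vec = self.word_vec(word)
            numerator = 1.0
            for p in pos:
                numerator *= shifted_cos(vec, p)
            denominator = 1.0
            for q in neg:
                denominator *= shifted_cos(vec, q)
            # Multiplicative combination; the small constant avoids division by zero.
            scored.append((word, numerator / (denominator + 1e-6)))
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:topn]


model = ToyNgramVectors(['the', 'and', 'night', 'day', 'of', 'in', 'to', 'was'])
in_vocab = model.most_similar_cosmul(positive=['the', 'and'], topn=5)
out_of_vocab = model.most_similar_cosmul(['night', 'nights'], topn=5)
```

The scores here are meaningless because the vectors are hashed rather than trained; only the shape of the results and the string-versus-list equivalence mirror the checks in the diff.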
Thanks for reviewing the code. It seems like the build for Python 2 stalled. Could you rerun it please, @tmylk?
These tests are known to fail occasionally, but this is the first time they have failed consistently. Will disable them in the main branch soon.
Ok. Let me know if I can help with anything.
Gensim can load large fasttext model on Mac
Loading of binary fasttext models is faster
Fasttext vector size is correctly set
Vector_size is also saved when loading just the vector_file, which can be useful if you are not interested in training