Fix OverflowError when loading a large term-document matrix in MatrixMarket format. Fix #1998 #2001
update to latest dev branch
@arlenk please have a look at the error - https://ci.appveyor.com/project/piskvorky/gensim-431bq/build/job/87sjiebimbr726ya#L306 (looks like a problem with Descriptions of reason (python2)
>>> isinstance(2, int)
True
>>> isinstance(2L, int)
False
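For illustration only, here is a minimal sketch of a type check that does not depend on the Python 2 `int`/`long` split shown above. It relies on `numbers.Integral`, which both `int` and Python 2's `long` are registered with. The helper name `is_integer` is hypothetical and is not part of gensim or of this PR.

```python
import numbers


def is_integer(value):
    """Return True for any integer type (int, or long on Python 2).

    Hypothetical helper for illustration; not taken from gensim.
    """
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool):
        return False
    # numbers.Integral covers int on Python 3 and both int and long on Python 2.
    return isinstance(value, numbers.Integral)
```

A test written as `assert is_integer(num_docs)` would then pass whether the value comes back as an `int` or as a Python 2 `long`.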
Unfortunately I can't seem to replicate this on OS X, but I think you are right: the problem is that Python 2 still treats ints and longs differently. I updated the tests to use Hopefully this will fix the issue on the Windows build.
Thanks for the fast fix @arlenk 👍
The Cython mmreader was (incorrectly) assuming that num_docs, num_terms, etc. would fit in a C int.
This version uses long long for all related types.
Related issue from @KartikTaskhuman: #1998
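
As a rough illustration of why a C int is too small here, the sketch below uses the standard `struct` module to check whether a count fits in a 32-bit signed integer versus a 64-bit one. It assumes the typical sizes (32-bit `int`, 64-bit `long long`); the helper names are hypothetical and this is not the mmreader code itself.

```python
import struct

INT32_MAX = 2 ** 31 - 1  # largest value a 32-bit signed C int can hold


def fits_in_c_int(value):
    """True if value fits in a 32-bit signed integer (assumed size of a C int)."""
    try:
        struct.pack("<i", value)  # "<i" is a 4-byte signed integer
    except struct.error:
        return False
    return True


def fits_in_c_long_long(value):
    """True if value fits in a 64-bit signed integer (assumed size of long long)."""
    try:
        struct.pack("<q", value)  # "<q" is an 8-byte signed integer
    except struct.error:
        return False
    return True


if __name__ == "__main__":
    big_count = INT32_MAX + 1  # e.g. a count just past the 32-bit limit
    print(fits_in_c_int(big_count))        # False
    print(fits_in_c_long_long(big_count))  # True
```

Any value from a large matrix header that exceeds `INT32_MAX` fails the first check but passes the second, which is the gap the switch to long long closes.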