Remove outdated bz2 examples from tutorials #1867
>>> id2word = gensim.corpora.Dictionary.load_from_text('wiki_en_wordids.txt')
>>> # load corpus iterator
>>> mm = gensim.corpora.MmCorpus('wiki_en_tfidf.mm')
>>> # mm = gensim.corpora.MmCorpus(bz2.BZ2File('wiki_en_tfidf.mm.bz2')) # use this if you compressed the TFIDF output
Removing this line doesn't sound right -- we still support bz2!
Just remove the (superfluous) bz2.BZ2File wrapper. Ditto below.
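A minimal, standard-library-only sketch of why the wrapper can be superfluous when a reader takes a filename: the reader can choose the decompressor from the file extension itself. `open_maybe_compressed` and `read_lines` are hypothetical helpers written for this illustration, not gensim code, and the sample payload is made up.

```python
import bz2
import os
import tempfile


def open_maybe_compressed(path):
    """Hypothetical helper: open `path` as text, decompressing .bz2 transparently."""
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")


def read_lines(path):
    """Read all lines from a plain or bz2-compressed text file, given only its name."""
    with open_maybe_compressed(path) as handle:
        return [line.rstrip("\n") for line in handle]


# Tiny Matrix Market style payload, invented for this sketch.
SAMPLE = "%%MatrixMarket matrix coordinate real general\n2 2 1\n1 1 0.5\n"

tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, "sample.mm")
bz2_path = plain_path + ".bz2"

with open(plain_path, "wt", encoding="utf-8") as out:
    out.write(SAMPLE)
with bz2.open(bz2_path, "wt", encoding="utf-8") as out:
    out.write(SAMPLE)

# The caller passes a filename in both cases; no bz2.BZ2File wrapper is needed.
plain_lines = read_lines(plain_path)
bz2_lines = read_lines(bz2_path)
```

With this design the caller only changes the filename (for example from `sample.mm` to `sample.mm.bz2`), and the handle is opened and closed entirely inside the reader.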
Why don't we support file-like objects in |
@piskvorky I see code for support

    from gensim.corpora import MmCorpus
    import bz2

    f = bz2.BZ2File("testcorpus.mm.bz2")
    print(f.closed)  # 0
    corpus = MmCorpus(f)
    print(f.closed)  # 1 ???

For this reason, if we try to read from this, we'll receive the exception reported on the mailing list.
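The symptom described above can be reproduced without gensim. The sketch below assumes, purely for illustration, a reader that consumes a caller-supplied handle inside a `with` block. The thread does not show which gensim line is responsible, so `count_lines_and_close` is a hypothetical stand-in and not the library's code.

```python
import bz2
import os
import tempfile


def count_lines_and_close(handle):
    """Hypothetical reader: iterates `handle` inside `with`, which closes it on exit."""
    with handle as lines:
        return sum(1 for _ in lines)


# Create a small compressed file to read from (contents invented for this sketch).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.mm.bz2")
with bz2.open(path, "wb") as out:
    out.write(b"line one\nline two\n")

f = bz2.BZ2File(path)
closed_before = f.closed          # handle is open when the caller passes it in
n_lines = count_lines_and_close(f)
closed_after = f.closed           # the reader's `with` block closed the caller's handle

# Any later read on the same handle now fails.
try:
    f.read()
    second_read_error = None
except ValueError as err:
    second_read_error = err
```

A reader that opens the file from a filename on every pass avoids this, because it never closes a handle that the caller still expects to use.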
|
@piskvorky UPD: I found the reason for this behavior: in this line, we are using This is a bug anyway (because internally we use |
* Revert "Remove outdated `bz2` + `MmCorpus` examples from tutorials (piskvorky#1867)"
  This reverts commit 5342153.
* remove bz2 wrapper
* remove bz2 wrapper[2]
MmReader supports only a `filename` as input (not a `file-like object`), but the old documentation (wiki.rst / dist_lsi.rst) also used a `file-like object` as input. The current PR removes this outdated usage from the examples.
Based on mailing list post