Refactor API reference gensim.topic_coherence. Fix #1669#1714
Conversation
|
|
||
| Returns | ||
| ------- | ||
| float |
| This function calculates the log-conditional-probability measure | ||
| Calculate the log-conditional-probability measure | ||
| which is used by coherence measures such as U_mass. | ||
| This is defined as: m_lc(S_i) = log[(P(W', W*) + e) / P(W*)] |
There was a problem hiding this comment.
Use math here http://www.sphinx-doc.org/en/stable/ext/math.html
| Set to True to include standard deviation. | ||
| with_support : bool | ||
| Set to True to include number of elements in `segment_sims` as a statistic in the results returned. | ||
| segment_sims : iterable |
| with_support : bool | ||
| Set to True to include number of elements in `segment_sims` as a statistic in the results returned. | ||
| segment_sims : iterable | ||
| floating point similarity values to aggregate. |
There was a problem hiding this comment.
No need to describe type in comment (floating point), please move it to type
| @@ -124,7 +124,7 @@ def log_ratio_measure( | |||
| segmented_topics : list of (list of tuples) | |||
| Output from the segmentation module of the segmented topics. | |||
| accumulator: list | |||
| @@ -25,6 +26,7 @@ | |||
| \Bigg \{{\sum_{w_{i} \in W'}^{ } m(w_{i}, w_{j})^{\gamma}}\Bigg \}_{j = 1,...,|W|} | |||
There was a problem hiding this comment.
Use :math: for all formulas (please have a look to sphinx math/latex doc)
| ---------- | ||
| segmented_topics: list of (list of tuples) | ||
| Output from the segmentation module of the segmented topics. | ||
| accumulator: accumulator of word occurrences (see text_analysis module). |
There was a problem hiding this comment.
it isn't a type, you can add a link to concrete class like + links to module
:class:`~gensim.topic_coherence...`
| measure : str | ||
| Direct confirmation measure to be used. Supported values are "nlr" (normalized log ratio). | ||
| gamma: | ||
| Gamma value for computing W', W* vectors; default is 1. |
There was a problem hiding this comment.
:math: + no need to duplicate default value if it already defined OK in the signature (here and everywhere)
|
|
||
| Examples | ||
| -------- | ||
| >>> from gensim.corpora.dictionary import Dictionary |
There was a problem hiding this comment.
So big example, need comments like "what happens here"
|
@CLearERR please merge latest |
984872e to
540021c
Compare
|
Need to resolve #1807 first (because we need Current PR is almost done (will be finished/merged after CI for docs) |
* Add wordnet mammal train file for Poincare notebook (piskvorky#1781) * Adds wordnet mammal train file * Adds link to data file in notebook * Fix tox.ini/setup.cfg configuration (piskvorky#1815) * update according to new pytest_benchmark version * update wheel-storage url * use only twine * Fix docstrings for `gensim.utils` (piskvorky#1797) * Add docstrings in numpy-style fromat * fix PEP8 * remove outdated "hack" (smart_open is core dependency right now) * fix docstrings[1] * remove unused internal class * fix docstrings[2] * fix docstrings[3] * fix docstrings[4] * fix docstrings[5] * fix docstrings[6] * fix docstrings[7] * fix docstrings[8] * add missing `pattern` to doc dependencies * fix docstrings[9] * fix docstrings[10] * Fix docstrings for `gensim.models.rpmodel` (piskvorky#1802) * first attempt to convert few lines into numpy-style doc * added parameters in documentation * more documentation * few corrections * show inheritance and undoc members * show special members * example is executable now * link to the paper added, named parameters * fixed doc * fixed doc * fixed whitespaces * fix docstrings & PEP8 * fix docstrings * fix typo * Fix docstrings for `gensim.models.translation_matrix` (piskvorky#1806) * convert Space class doc to numpy style * fix docstrings[1] * fix docstrings[2] * remove useless load * fix docstrings[3] * add missing import * fix docstrings[4] * Add CircleCI for build documentation. Fix piskvorky#1807 (piskvorky#1822) * init config for circle * change * rm cache, install tox distinctly * fix indentation & command * update venv * add pip-cache * add apt packages for latex * rename * enable latex rendering * remove doc building from Travis * store new doc version * Refactor API reference `gensim.topic_coherence`. Fix piskvorky#1669 (piskvorky#1714) * Refactored aggregation * Micro-Fix for aggregation.py, partially refactored direct_confirmation.py * Partially refactored indirect_confirmation_measure * Some additions * Math attempts * add math extension for sphinx * Minor refactoring * Some refactoring for probability_estimation * Beta-strings * Different additions * Minor changes * text_analysis left * Added example for ContextVectorComputer class * probability_estimation 0.9 * beta_version * Added some examples for text_analysis * text_analysis: corrected example for class UsesDictionary * Final additions for text_analysis.py * fix cross-reference problem * fix pep8 * fix aggregation * fix direct_confirmation_measure * fix types in direct_confirmation_measure * partial fix indirect_confirmation_measure * HotFix for probability_estimation and segmentation * Refactoring for probability_estimation * Changes for indirect_confirmation_measure * Fixed segmentation, partly fixed text_analysis * Add Notes for text_analysis * fix di/ind * fix doc examples in probability_estimation * fix probability_estimation * fix segmentation * fix docstring in probability_estimation * partial fix test_analysis * add latex stuff for docs build * doc fix[1] * doc fix[2] * remove apt install from travis (now doc build in circle) * Fix docstrings for `gensim.models.normmodel` (piskvorky#1805) * First edits * changed bow * Added examples * Final commit of the night * Still struggling with docs * Removed examples but still struggling with documentation * fix docstring * fix docstring[2] * Fix docstrings for `gensim.models.logentropy_model` (piskvorky#1803) * improve and correct documentation of models/logentropy_model * include fixes according to comments * implement fixes suggested * associate methods with examples. * fix minor typos * doc fix * Fix docstrings for `gensim.matutils` (piskvorky#1804) * numpy style documentation on matutils.py * doc fix[1] * doc fix[2] * doc fix[3] * doc fix[4] * doc fix[5] * doc fix[6] * Fix formula in `gensim.summarization.bm25`. Fix piskvorky#1828 (piskvorky#1833) * bm25 scoring function updated * Fixes piskvorky#1828 * Fixes piskvorky#1828 * Fixes piskvorky#1828 * Fixes piskvorky#1828 * Fixes piskvorky#1828 * Fixes piskvorky#1828 , Tests added * Fixes piskvorky#1828 , Tests added * Fixes piskvorky#1828 , Tests Added * Fixes piskvorky#1828 , Tests Added * Fixes piskvorky#1828 , Tests Added * Fixes piskvorky#1828 , Tests Added * Fixes piskvorky#1828 * Refactor tests for `gensim.corpora.WikiCorpus`(piskvorky#1821) * minor style refactoring and comment fixes in accordance to PEP8 * Created test data in legitimate compressed XML format (.xml.bz2) for the WikiCorpus class. * Used the same raw data found for other sources (9 articles). * Added Various wiki markup to test the parsing regural expressions * Added test class for the WikiCorpus source. * Following the same inheritance schema as in the source TestWikiCorpus > TestTextCorpus > CorpusTestCase. * Testing methods are overriden where necessary to reflect logic changes. * All existing functionality is tested (account for markup handling, minimum article length etc) * Fix python 3 compatibility for generator next method * code review corrections * Moved WikiCorpus tests from test/test_wikicorpus.py into its class within the test_corpora.py file. * Adapted all old tests to the new class * Current Test class schema ensures that WikiCorpus also passes tests defined in parents * Deleted test_wikicorpus.py since it is now redundant * Discarded the empty input test for the WikiCorpus since an empty file is not legitimate XML * Added 2 more tests * Fix parameter setting for `FastText.train`. Fix piskvorky#1818 (piskvorky#1837) * bm25 scoring function updated * Fixes piskvorky#1828 * Fixes piskvorky#1828 * Fixes piskvorky#1828 * Fixes piskvorky#1828 * Fixes piskvorky#1828 * Fixes piskvorky#1828 , Tests added * Fixes piskvorky#1828 , Tests added * Fixes piskvorky#1828 , Tests Added * Fixes piskvorky#1828 , Tests Added * Fixes piskvorky#1828 , Tests Added * Fixes piskvorky#1828 , Tests Added * Fixes piskvorky#1828 * Function Parameters corrected , Fixes piskvorky#1818 * add missing params + add supercall * Fix positional params used for `gensim.models.CoherenceModel` in `gensim.models.callbacks` (piskvorky#1823) * add keyword params for call to gensim.models.CoherenceModel as positional arguments for coherence and topn were incorrect due to skipping param for keyed_vectors * Fix PEP8
…into smart_open * 'develop' of https://github.com/RaRe-Technologies/gensim: Fix positional params used for `gensim.models.CoherenceModel` in `gensim.models.callbacks` (piskvorky#1823) Fix parameter setting for `FastText.train`. Fix piskvorky#1818 (piskvorky#1837) Refactor tests for `gensim.corpora.WikiCorpus`(piskvorky#1821) Fix formula in `gensim.summarization.bm25`. Fix piskvorky#1828 (piskvorky#1833) Fix docstrings for `gensim.matutils` (piskvorky#1804) Fix docstrings for `gensim.models.logentropy_model` (piskvorky#1803) Fix docstrings for `gensim.models.normmodel` (piskvorky#1805) Refactor API reference `gensim.topic_coherence`. Fix piskvorky#1669 (piskvorky#1714) Add CircleCI for build documentation. Fix piskvorky#1807 (piskvorky#1822) Fix docstrings for `gensim.models.translation_matrix` (piskvorky#1806) Fix docstrings for `gensim.models.rpmodel` (piskvorky#1802) Fix docstrings for `gensim.utils` (piskvorky#1797) Fix tox.ini/setup.cfg configuration (piskvorky#1815) Add wordnet mammal train file for Poincare notebook (piskvorky#1781)
No description provided.