[MRG] Poincare l2 regularization by jayantj · Pull Request #1734 · piskvorky/gensim

jayantj · 2017-11-22T02:01:36Z

Adds L2 regularization for Poincaré embeddings, which improves results and produces models much closer to those visualized in the original paper. To be merged after #1700.

Without regularization, the model looks like this:

The problem is that most of the nodes end up too close to the boundary. To prevent this, L2 regularization is applied only to the parent node in each training hypernymy relation, which results in the following model:

Also adds cleaner handling of autograd: the import is not conditional, but if autograd cannot be imported, gradient checking is skipped automatically and a warning is issued.
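The autograd handling described above could look roughly like the sketch below. It is only an illustration of the pattern (try the import once at module level, then skip the check with a warning when it failed). The names `AUTOGRAD_PRESENT` and `maybe_check_gradients` are assumptions made for this example, not confirmed names from the PR.

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)

try:
    # autograd is optional: it is only needed to verify hand-written gradients.
    from autograd import grad
    AUTOGRAD_PRESENT = True
except ImportError:
    logger.warning("autograd could not be imported, gradient checking will be skipped")
    AUTOGRAD_PRESENT = False


def maybe_check_gradients(loss_fn, vector, analytic_grad):
    """Hypothetical helper: compare an analytic gradient against autograd's.

    Returns the maximum absolute difference between the two gradients,
    or None if autograd is not installed and the check was skipped.
    """
    if not AUTOGRAD_PRESENT:
        logger.warning("autograd not available, skipping gradient check")
        return None
    auto_grad = grad(loss_fn)(vector)
    return float(np.max(np.abs(auto_grad - analytic_grad)))
```

With this layout, training code can call the helper unconditionally; the absence of autograd only disables the check rather than the training itself.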


janpom · 2017-11-22T08:55:12Z

            Number of threads to use for training the model.
        epsilon : float, optional
            Constant used for clipping embeddings below a norm of one.
+        regularization_coeff : float, optional


Can the L2 regularization be disabled? Setting regularization_coeff = 0.0 should do it, correct? Might be worth mentioning here.

jayantj · 2017-11-23

That's correct. Good point, adding.
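To make the exchange above concrete, here is a minimal sketch of a per-relation loss with an L2 penalty on the parent node only. The exact form of the penalty (coefficient times squared norm) and the function names `poincare_distance` and `relation_loss` are assumptions for illustration; the PR itself only states that L2 regularization is applied to the parent node and that a coefficient of 0.0 disables it.

```python
import numpy as np


def poincare_distance(u, v):
    """Poincare distance between two points strictly inside the unit ball."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_dist / denom)


def relation_loss(child, parent, negatives, regularization_coeff):
    """Softmax loss for one (child, parent) relation plus an L2 penalty on the parent."""
    candidates = [parent] + list(negatives)
    scores = np.array([-poincare_distance(child, c) for c in candidates])
    # Negative log-probability of picking the true parent among all candidates.
    loss = -scores[0] + np.log(np.sum(np.exp(scores)))
    # The penalty pulls only the parent towards the origin;
    # a coefficient of 0.0 removes it entirely.
    penalty = regularization_coeff * np.sum(parent ** 2)
    return loss + penalty
```

Calling `relation_loss(..., regularization_coeff=0.0)` returns the plain softmax loss, which matches the reviewer's reading that a zero coefficient disables the regularization.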


jayantj added 30 commits October 26, 2017 22:13

Initial classes and loading data for poincare model

6afdd22

Initial implementation of training using autograd

a804006

faster negative sampling, bugfix in vector updates

6bd0d4b

allows poincare dist function to be differentiable by autograd

98f94a7

batched gradient descent initial implementation

b727523

minor changes to batch poincare distance computation

1e6aee1

Adds calculation of gradients for poincare model

e286a0b

Correct implementation of clipping of updated vectors

3e28e8b

Fixes error in gradient computation

99a2270

Better messages while training

2e9e31c

Renames PoincareDistance to PoincareExample for clarity

d72cb10

Compares computed gradients to autograd gradients every few iterations

d439501

Avoids doing some numpy computations twice

e1ed24d

Avoids creating copies of numpy vectors

3b2a383

Only calls nan_to_num when gamma has at least one value equal to 1

7d68aae

Simply sets nan gradients to zero instead of nan_to_num

ba82d42

Adds batch-wise implementation of training and gradient computations

71f61d1

Minor correction in clipping

2a5a7fb

Merge branch 'poincare' into poincare_model

0c57aa1

Fixes typo in clip_vectors

9c51609

Prints average loss every few iterations instead of current loss

f22d9b2

Adds weighted negative sampling

7905c8c

Ensures positive edges are not returned by negative sampling

075df25

Poincare model stores node indices in relations instead of node keys

6060e56

Minor renaming; uses node indices for batch training instead of node keys

8ea8f23

Changes shapes of vectors passed to PoincareBatch

b8d77e3

Minor bugfixes related to batch size

0011b93

Corrects implementation of negative sampling for batch training

b52ee2e

Adds option to check gradients in batchwise training

d247384

Checks gradients only every few iterations

8c4f5a3

jayantj added 13 commits November 22, 2017 05:05

Removes most_similar from KeyedVectorsBase

2b982ab

Adds failing tests for words_closer_than and rank for EuclideanKeyedVectors and PoincareKeyedVectors

e31e816

Adds distances method to KeyedVectorsBase and EuclideanKeyedVectors, fixes tests

d73c0e2

Makes default argument for distances immutable

235b643

Uses conditional import for pygtrie in LexicalEntailmentEvaluation

d0b8563

Renames position_in_hierarchy to norm with minor change in behaviour, updates tests

cedd0e1

Renames poincare_distance and poincare_distance_batch to vector_distance and vector_distance_batch

0317189

Forces float division for positive_fraction in _sample_negatives

e693e64

Removes unused method from PoincareKeyedVectors

e931085

Updates report notebook with usage examples of new API methods

3c8d9f2

Minor pep8 fix

73ed696

Adds l2 regularization to poincare model training

c0ec021

Cleaner way to avoid dependency on autograd

7f1f02e

janpom reviewed Nov 22, 2017


jayantj added 13 commits November 23, 2017 06:01

Fixes pep8 issues, unused imports and typo

ee92be9

Adds example of saving and loading model to notebook

46a7efb

Updates docstrings in poincare.py

291dac6

Updates docstring for regularization coefficient in PoincareModel

5992b37

Moves poincare visualization methods to new gensim.viz module

c532e6e

Updates rst files for poincare viz

c506b96

Adds newline at the end of poincare.py in viz package

b4ec393

Adds link to original paper to poincare notebook

a7c3080

Adds l2 regularization to poincare model training

d20d752

Cleaner way to avoid dependency on autograd

96efe5d

Updates docstring for regularization coefficient in PoincareModel

fb56f51

Merge remote-tracking branch 'origin/poincare' into poincare_l2_regularization

861972f

Conflicts: gensim/models/keyedvectors.py gensim/models/poincare.py gensim/viz/__init__.py gensim/viz/poincare.py

Merge remote-tracking branch 'origin/poincare_l2_regularization' into poincare_l2_regularization

9565a5b

menshikh-iv merged commit 4ab63c6 into poincare Dec 4, 2017

jayantj mentioned this pull request Dec 4, 2017

[MRG] Add Poincare model #1757

Merged

menshikh-iv deleted the poincare_l2_regularization branch July 5, 2018 17:03


menshikh-iv merged 192 commits into poincare from poincare_l2_regularization


3 participants
