
proximal gradient method @ group lasso fixed#249

Merged
pavanramkumar merged 7 commits into glm-tools:master from AnchorBlues:_prox_fix
Sep 4, 2018

Conversation

@AnchorBlues
Contributor

In the proximal gradient method, weights whose absolute values are smaller than the threshold must be set to 0.
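The fix can be sketched as follows. This is a minimal illustration of group-wise soft-thresholding, not pyglmnet's actual `_prox` code; the function and variable names here are hypothetical:

```python
import numpy as np

def group_soft_threshold(beta, groups, thresh):
    """Group-lasso proximal step (sketch): each penalized group is
    shrunk by its L2 norm, and any group whose norm falls at or below
    the threshold is set exactly to zero."""
    result = beta.copy()
    for g in np.unique(groups):
        if g == 0:  # convention assumed here: group 0 is unpenalized
            continue
        idx = groups == g
        norm = np.linalg.norm(beta[idx])
        if norm <= thresh:
            result[idx] = 0.0  # the point of the fix: exactly zero
        else:
            result[idx] = (1.0 - thresh / norm) * beta[idx]
    return result

beta = np.array([0.01, 0.02, 3.0, 4.0])
groups = np.array([1, 1, 2, 2])
out = group_soft_threshold(beta, groups, 0.5)
# group 1 (norm ~0.022) is zeroed out; group 2 (norm 5.0) is shrunk
```

Without the explicit zeroing branch, small-norm groups keep tiny nonzero values instead of being exactly zero, which is what this PR corrects.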

@codecov-io

Codecov Report

Merging #249 into master will decrease coverage by 0.04%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #249      +/-   ##
==========================================
- Coverage    56.8%   56.75%   -0.05%     
==========================================
  Files           7        7              
  Lines        1301     1302       +1     
  Branches      261      261              
==========================================
  Hits          739      739              
- Misses        492      495       +3     
+ Partials       70       68       -2
Impacted Files Coverage Δ
pyglmnet/pyglmnet.py 73.33% <0%> (-0.17%) ⬇️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 84869a0...bfdb111. Read the comment docs.

@codecov-io

codecov-io commented Aug 31, 2018

Codecov Report

Merging #249 into master will increase coverage by 1.64%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #249      +/-   ##
==========================================
+ Coverage    56.8%   58.44%   +1.64%     
==========================================
  Files           7        7              
  Lines        1301     1302       +1     
  Branches      261      261              
==========================================
+ Hits          739      761      +22     
+ Misses        492      472      -20     
+ Partials       70       69       -1
Impacted Files Coverage Δ
pyglmnet/pyglmnet.py 78.22% <100%> (+4.72%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 84869a0...5b9668e. Read the comment docs.

@jasmainak
Member

Thanks. Could you add a tiny test?

@jasmainak
Member

closes #235

@AnchorBlues
Contributor Author

AnchorBlues commented Aug 31, 2018

@jasmainak
Sorry, I don't know what kind of test code I should write.
I would be glad if you could show me some example test code.

@jasmainak
Member

Think about how you discovered the bug. The test should basically reproduce that
behavior in some sense. It should fail on master but pass on this branch.

You can add the test in this function. One easy check could be whether the estimated
betas have the same non-zero indices as the simulated ones.
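As a sketch of that support check, with hypothetical coefficient values standing in for the simulated and fitted betas:

```python
import numpy as np

# Hypothetical coefficients: `beta_sim` plays the role of the simulated
# ground truth and `beta_hat` the fitted estimate from the model.
beta_sim = np.array([0.0, 0.0, 1.2, -0.7, 0.0, 2.1])
beta_hat = np.array([0.0, 0.0, 0.9, -0.5, 0.0, 1.8])

# The suggested check: the estimated betas should have the same
# non-zero indices (support) as the simulated ones.
assert np.array_equal(beta_hat != 0, beta_sim != 0)
```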

@AnchorBlues
Contributor Author

@jasmainak
Thank you for your advice.
I added test code to the test_group_lasso function.
Please confirm.

Member

@jasmainak jasmainak left a comment


Overall this looks like a good start! If you can address my comments, I will take another look. Thanks again for the fix.

# create an instance of the GLM class
glm_group = GLM(distr='softplus', alpha=1.)
lams = [0.5, 0.3237394, 0.2096144, 0.13572088, 0.08787639,
0.0568981, 0.03684031, 0.02385332, 0.01544452, 0.01]
Member


Do we really need all these lams for the test? It will make the tests slow. I would use only one lam.

Contributor Author


I understand.
I will choose 0.2 as the single lam value.
With lam == 0.2, I made sure that some groups become zero while others do not.
So I think it will be a good test case.


for lam in lams:
# create an instance of the GLM class
glm_group = GLM(distr='softplus', alpha=alpha, reg_lambda=lam, group=groups)
Member


Please use the same variable names we use in the rest of the code base. Instead of lam, it should be reg_lambda.

Contributor Author


OK, I'll change the variable name lam to reg_lambda.

beta = glm_group.beta_
group_norms = np.abs(beta)
for target_group_idx in unique_group_idxs:
if target_group_idx == 0:
Member


why this?

Contributor Author


Please check the _prox method of the GLM class (https://github.com/glm-tools/pyglmnet/blob/master/pyglmnet/pyglmnet.py),
lines 519-522.
I just followed the same process there.

# in each group, coef must be [all nonzero] or [all zero].
unique_group_idxs = np.unique(groups)
beta = glm_group.beta_
group_norms = np.abs(beta)
Member


This is confusing. You should explicitly initialize an array of zeros.

Contributor Author


OK, I'll fix it.
However, I would point out that the _prox method of the GLM class is written the same way.

@AnchorBlues
Contributor Author

@jasmainak
I found that the last assertion of the test_group_lasso function,

assert (beta[group_norms <= thresh] == 0.0).all()

is not appropriate, because beta[group_norms <= thresh] can be nonzero if the values of result[idx_to_update] drop below thresh on the final iteration of the optimization.
So I removed the assertion from the test code.


target_beta = beta[groups == group_id]
n_nonzero = (target_beta != 0.0).sum()
assert n_nonzero in (len(target_beta), 0)
Member


I like the test as it's compact. But the problem is that it passes on master. Ideally, you want the test to fail on the master branch so the bug does not happen again.

Contributor Author


OK, I added the following assertion to the test code.

assert (beta[groups != 0] == 0.0).any()

I confirmed that this assertion fails on the master branch and passes after the fix.

reg_lambda = 0.2
# create an instance of the GLM class
glm_group = GLM(distr='softplus', alpha=1.)
glm_group = GLM(distr='softplus', alpha=alpha, reg_lambda=reg_lambda, group=groups)
Member


Can you make sure you conform to pep8 conventions? You can add a plugin to your editor or install a command-line checker. Right now, the line exceeds 79 characters.

Contributor Author


I see.
I fixed the test code so that it conforms to pep8 conventions.

Member

@jasmainak jasmainak left a comment


Hopefully the last round of changes. Thanks a lot!

assert n_nonzero in (len(target_beta), 0)

# one of the groups must be [all zero]
assert (beta[groups != 0] == 0.0).any()
Member


Nope, this is not checking that all the elements of a group are zero. You need to do something like:

assert np.any([beta[groups == g].sum() == 0 for g in group_ids[1:]])
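One caveat with a check along these lines, sketched below: `sum() == 0` can also hold when positive and negative coefficients within a group cancel, so summing absolute values is a safer way to assert that every coefficient in some penalized group is exactly zero. The `beta`, `groups`, and `group_ids` values here are hypothetical:

```python
import numpy as np

# Hypothetical fitted coefficients and their group labels.
beta = np.array([0.5, -0.5, 0.0, 0.0, 1.0, 2.0])
groups = np.array([1, 1, 2, 2, 3, 3])
group_ids = np.unique(groups)  # assumed: 0 would label unpenalized terms

# Plain sum() misfires on group 1 (0.5 and -0.5 cancel), while the
# absolute-value sum correctly identifies only group 2 as all-zero.
all_zero = [np.abs(beta[groups == g]).sum() == 0 for g in group_ids]
assert np.any(all_zero)
```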

Member


Also, there needs to be one more empty line after this according to pep8.

Contributor Author


@jasmainak
Thanks.
I fixed the code according to the above advice.
Please confirm.

assert n_nonzero in (len(target_beta), 0)

# one of the groups must be [all zero]
assert np.any([beta[groups == group_id].sum() == 0 \
Member


You don't need the trailing backslash here, but I can live with it.

@jasmainak
Member

@pavanramkumar please merge if you are happy. LGTM

@pavanramkumar
Collaborator

Thanks @AnchorBlues and @jasmainak — great contribution!

@pavanramkumar pavanramkumar merged commit 65bb9f7 into glm-tools:master Sep 4, 2018
@jasmainak
Member

@AnchorBlues looking forward to more contributions and bugfixes. Thanks!

@AnchorBlues
Contributor Author

@jasmainak @pavanramkumar
Thank you for reviewing my fix and merging it.
I am glad to contribute to your package.
I'll continue to use this package for data science.
Thanks!
