proximal gradient method @ group lasso fixed#249
pavanramkumar merged 7 commits into glm-tools:master from AnchorBlues:_prox_fix
Conversation
Codecov Report
@@ Coverage Diff @@
## master #249 +/- ##
==========================================
- Coverage 56.8% 56.75% -0.05%
==========================================
Files 7 7
Lines 1301 1302 +1
Branches 261 261
==========================================
Hits 739 739
- Misses 492 495 +3
+ Partials 70 68 -2
Continue to review full report at Codecov.
Codecov Report
@@ Coverage Diff @@
## master #249 +/- ##
==========================================
+ Coverage 56.8% 58.44% +1.64%
==========================================
Files 7 7
Lines 1301 1302 +1
Branches 261 261
==========================================
+ Hits 739 761 +22
+ Misses 492 472 -20
+ Partials 70 69 -1
Continue to review full report at Codecov.
Thanks. Could you add a tiny test?

closes #235
@jasmainak

Think about how you discovered the bug. The test should basically reproduce that. You can add the test in this function. One easy check perhaps could be whether the estimated
@jasmainak |
jasmainak
left a comment
overall looks like a good start! If you can address my comments, I will take another look. Thanks again for the fix
tests/test_pyglmnet.py
Outdated
# create an instance of the GLM class
glm_group = GLM(distr='softplus', alpha=1.)
lams = [0.5, 0.3237394, 0.2096144, 0.13572088, 0.08787639,
        0.0568981, 0.03684031, 0.02385332, 0.01544452, 0.01]
do we really need all these lams for the test? It will make the tests slow. I would do only one lam
I understand.
I will choose 0.2 as the single lam value.
With lam == 0.2, I made sure that some groups are zero and others are not.
So I think it will be a good test case.
tests/test_pyglmnet.py
Outdated
for lam in lams:
    # create an instance of the GLM class
    glm_group = GLM(distr='softplus', alpha=alpha, reg_lambda=lam, group=groups)
please use the same variable names we use in the rest of the code base. Instead of lam it should be reg_lambda
OK, I'll change the variable name lam to reg_lambda.
tests/test_pyglmnet.py
Outdated
beta = glm_group.beta_
group_norms = np.abs(beta)
for target_group_idx in unique_group_idxs:
    if target_group_idx == 0:
Please check the _prox method of the GLM class (https://github.com/glm-tools/pyglmnet/blob/master/pyglmnet/pyglmnet.py), lines 519-522.
I just wrote the same process.
tests/test_pyglmnet.py
Outdated
# in each group, coef must be [all nonzero] or [all zero].
unique_group_idxs = np.unique(groups)
beta = glm_group.beta_
group_norms = np.abs(beta)
this is confusing. You should explicitly initialize an array of zeros
OK, I'll fix it.
However, I would point out that the _prox method of the GLM class is written the same way.
@jasmainak assert (beta[group_norms <= thresh] == 0.0).all() is not appropriate because there is a possibility that
target_beta = beta[groups == group_id]
n_nonzero = (target_beta != 0.0).sum()
assert n_nonzero in (len(target_beta), 0)
I like the test as it's compact. But the problem is that it passes on master. Ideally, you want the test to fail on master branch so the bug does not happen again
OK, I added the following assertion to the test code:
assert (beta[groups != 0] == 0.0).any()
I made sure that this assertion fails on the master branch and passes after the fix.
tests/test_pyglmnet.py
Outdated
reg_lambda = 0.2
# create an instance of the GLM class
glm_group = GLM(distr='softplus', alpha=1.)
glm_group = GLM(distr='softplus', alpha=alpha, reg_lambda=reg_lambda, group=groups)
can you make sure you conform to pep8 conventions? You can add a plugin to your editor or install a command-line checker. Right now, the line exceeds 79 characters
I see.
I fixed the test code so that it conforms to pep8 conventions.
jasmainak
left a comment
hopefully last round of changes. Thanks a lot!
tests/test_pyglmnet.py
Outdated
assert n_nonzero in (len(target_beta), 0)

# one of the groups must be [all zero]
assert (beta[groups != 0] == 0.0).any()
nope, this is not checking that all the elements of a group are zero. You need to do something like:
assert np.any([beta[groups == g].sum() == 0 for g in group_ids[1:]])
also, there needs to be one more empty line after this according to pep8
@jasmainak
Thanks.
I fixed the code according to the above advice.
Please confirm,
assert n_nonzero in (len(target_beta), 0)

# one of the groups must be [all zero]
assert np.any([beta[groups == group_id].sum() == 0 \
you don't need the trailing backslash here but I can live with it
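Putting the pieces of this review thread together, the agreed-upon test shape is roughly the following sketch. The data and variable names here are illustrative stand-ins (hypothetical, not the actual fitted glm_group.beta_ or groups array from tests/test_pyglmnet.py), and np.all is used in the final check to sidestep the corner case where positive and negative coefficients in a group sum to zero:

```python
import numpy as np

# illustrative coefficients and group labels, standing in for
# glm_group.beta_ and the groups array in the real test
beta = np.array([0.0, 1.2, -0.7, 0.0, 0.0])
groups = np.array([0, 1, 1, 2, 2])
group_ids = np.unique(groups)

# within each labeled group, coefficients must be all nonzero or all zero
for group_id in group_ids[1:]:
    target_beta = beta[groups == group_id]
    n_nonzero = (target_beta != 0.0).sum()
    assert n_nonzero in (len(target_beta), 0)

# at least one group must be entirely zero; this is the assertion that
# fails on the buggy _prox and passes after the fix
assert np.any([np.all(beta[groups == g] == 0.0) for g in group_ids[1:]])
```

The per-group all-or-nothing check is what distinguishes group lasso from plain lasso: regularization acts on whole groups, so a group is either kept intact or eliminated entirely.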
@pavanramkumar please merge if you are happy. LGTM

Thanks @AnchorBlues and @jasmainak — great contribution!

@AnchorBlues looking forward to more contributions and bugfixes. Thanks!
@jasmainak @pavanramkumar
In the proximal gradient method, weights whose absolute values are smaller than the threshold must be 0.