[MRG] FIX problem of learning rate #226
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #226       +/-   ##
===========================================
- Coverage   75.48%   57.05%    -18.43%
===========================================
  Files           4        7         +3
  Lines         673     1311       +638
  Branches      148      263       +115
===========================================
+ Hits          508      748       +240
- Misses        128      494       +366
- Partials       37       69        +32
Continue to review full report at Codecov.
(force-pushed from 9cf9503 to 87bc267)
@pavanramkumar this PR might fix our …
(force-pushed from bfb5f46 to e8dc899)
Note also that in the community crime example, the R^2 for the grid search method matches what we find with the regularization path.
@pavanramkumar with the fix in #249 the group lasso example gives a reasonable R^2. So, it's fixing 2 or 3 examples. The only problem is that it slows down convergence. I don't know off the bat how to solve that; maybe profiling + Cython can help.
@pavanramkumar have you had time to look at this PR? It's related to our discussion today.
@jasmainak I looked at it, but I don't know whether we can use this approach for all the distributions. Also, how much slower is it with line search in each iteration? Can we write a test?
I think line search does not make any assumption about convexity -- so it should apply to all the distributions. Sure, I can add tests and report back some benchmarks.
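To make the idea concrete, a backtracking (Armijo) line search of the kind discussed here can be sketched as follows. This is an illustrative sketch, not pyglmnet's actual implementation; all names and parameters are hypothetical:

```python
import numpy as np

def backtracking_line_search(loss, grad, beta, direction,
                             t0=1.0, alpha=0.5, shrink=0.5, max_iter=50):
    """Shrink the step size t until the Armijo sufficient-decrease
    condition loss(beta + t*d) <= loss(beta) + alpha * t * <grad, d> holds."""
    f0 = loss(beta)
    slope = grad(beta) @ direction  # directional derivative; < 0 for a descent direction
    t = t0
    for _ in range(max_iter):
        if loss(beta + t * direction) <= f0 + alpha * t * slope:
            break
        t *= shrink  # step too large: halve it and retry
    return t

# Toy usage: quadratic loss f(b) = 0.5 * ||b||^2, descending along -grad
loss = lambda b: 0.5 * np.dot(b, b)
grad = lambda b: b
beta = np.array([3.0, -4.0])
t = backtracking_line_search(loss, grad, beta, -grad(beta))
```

The test only requires that the loss is differentiable along the search direction, not that it is convex, which is why it can be applied uniformly across distributions.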
(force-pushed from 2f4eb8e to 70d09a0)
@pavanramkumar I rebased this PR with master. It does solve some issues. See above. But Travis needs to be made happy ...
(force-pushed from 70d09a0 to 1c7989b)
(force-pushed from f426ded to 1c39c8a)
@pavanramkumar here is the script to benchmark. And here is what I get on my computer:

reg_lambda | learning rate | glm.n_iter_ | time (s)
0.50000 | 0.001 | 286 | 9.154
0.32374 | 0.001 | 286 | 9.737
0.20961 | 0.001 | 286 | 8.534
0.13572 | 0.001 | 286 | 9.067
0.08788 | 0.001 | 286 | 8.560
0.05690 | 0.001 | 286 | 9.609
0.03684 | 0.001 | 286 | 9.310
0.02385 | 0.001 | 286 | 11.780
0.01544 | 0.001 | 286 | 8.841
0.01000 | 0.001 | 294 | 8.102
0.50000 | 0.01 | 91 | 2.278
0.32374 | 0.01 | 91 | 2.298
0.20961 | 0.01 | 91 | 2.396
0.13572 | 0.01 | 91 | 2.277
0.08788 | 0.01 | 94 | 2.388
0.05690 | 0.01 | 103 | 2.793
0.03684 | 0.01 | 153 | 3.907
0.02385 | 0.01 | 164 | 4.228
0.01544 | 0.01 | 259 | 6.560
0.01000 | 0.01 | 277 | 6.567
0.50000 | 0.1 | 19 | 0.455
0.32374 | 0.1 | 33 | 0.776
0.20961 | 0.1 | 32 | 1.315
0.13572 | 0.1 | 65 | 2.160
0.08788 | 0.1 | 89 | 2.927
0.05690 | 0.1 | 132 | 5.334
0.03684 | 0.1 | 179 | 5.463
0.02385 | 0.1 | 219 | 6.056
(UserWarning from pyglmnet.py:885: "Reached max number of iterations without convergence.")
0.01544 | 0.1 | 1000 | 35.344
0.01000 | 0.1 | 1000 | 26.787
(RuntimeWarnings from pyglmnet.py: invalid value encountered in true_divide at lines 250 and 252, and in greater / less_equal at lines 652, 100, 101, 117, and 118.)
0.50000 | 3 | 1000 | 25.019
0.32374 | 3 | 1000 | 25.106
0.20961 | 3 | 1000 | 24.828
0.13572 | 3 | 1000 | 25.927
0.08788 | 3 | 1000 | 24.657
0.05690 | 3 | 1000 | 25.544
0.03684 | 3 | 1000 | 35.799
0.02385 | 3 | 1000 | 43.277
0.01544 | 3 | 1000 | 28.077
0.01000 | 3 | 1000 | 26.254
0.50000 | auto | 36 | 3.604
0.32374 | auto | 30 | 3.045
0.20961 | auto | 34 | 3.298
0.13572 | auto | 1000 | 76.629
0.08788 | auto | 1000 | 93.815
0.05690 | auto | 1000 | 128.094
0.03684 | auto | 191 | 24.650
0.02385 | auto | 223 | 27.515
0.01544 | auto | 339 | 42.505
0.01000 | auto | 1000 | 129.048

Not exactly sure if it's beneficial or not. It might be a bit risky to include in the release. But we might want to fix the …
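The benchmark script itself is linked rather than inlined above. A minimal stand-alone harness with the same parameter grid could look like the sketch below; the reg_lambda values in the table are exactly `np.logspace(np.log10(0.5), np.log10(0.01), 10)`, and the `fit` callable is a hypothetical stand-in for fitting an actual pyglmnet GLM:

```python
import time
import numpy as np

def benchmark_grid(fit, reg_lambdas, learning_rates):
    """Time fit(reg_lambda, learning_rate) over the grid and print rows
    in the same format as the table above; fit returns the iteration count."""
    print("reg_lambda | learning rate | n_iter | time (s)")
    for lr in learning_rates:
        for rl in reg_lambdas:
            start = time.time()
            n_iter = fit(rl, lr)
            print("%.5f | %s | %d | %.3f" % (rl, lr, n_iter, time.time() - start))

# The same 10-point log-spaced reg_lambda grid as in the table above
reg_lambdas = np.logspace(np.log10(0.5), np.log10(0.01), 10)

# Dummy fit standing in for something like GLM(...).fit(X, y)
benchmark_grid(lambda rl, lr: 286, reg_lambdas, [0.001, 0.01, 0.1])
```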
closes #65
@pavanramkumar I am not sure if I got the Lipschitz constant for logistic regression right, and we need to dig up the other Lipschitz constants or see how the R code deals with it.
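For reference, for the *mean* logistic log-loss the gradient's Lipschitz constant is bounded by ||X||_2^2 / (4n), because the sigmoid's derivative never exceeds 1/4. A sketch of that bound (my derivation, not pyglmnet code):

```python
import numpy as np

def logistic_lipschitz(X):
    """Upper bound on the Lipschitz constant of the gradient of the
    mean logistic loss. The Hessian is X.T @ D @ X / n with diagonal
    D_ii = sigma(z_i) * (1 - sigma(z_i)) <= 1/4, so its largest
    eigenvalue is at most ||X||_2^2 / (4 * n)."""
    n_samples = X.shape[0]
    spectral_norm = np.linalg.norm(X, ord=2)  # largest singular value of X
    return spectral_norm ** 2 / (4.0 * n_samples)

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
L = logistic_lipschitz(X)
step_size = 1.0 / L  # a safe constant learning rate for gradient descent
```

A constant step of 1/L guarantees descent without any line search, which is presumably why getting these constants right matters per distribution.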
TODO