Releases: saprmarks/dictionary_learning
v0.1.0 (2025-02-12)
Feature
- feat: pypi packaging and auto-release with semantic release (0ff8888)
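With the package now on PyPI, the library can be installed without cloning the repo. Below is a minimal usage sketch, assuming the package is published under the name `dictionary-learning` and exposes `AutoEncoder.from_pretrained` as in the repository; the checkpoint path is a placeholder.

```python
# Install from PyPI (package name assumed; check the project page if it differs):
#   pip install dictionary-learning

import torch

from dictionary_learning import AutoEncoder  # import path assumed from the repo layout

# Load a trained sparse autoencoder checkpoint and run a round trip.
# The path and keyword arguments below are illustrative assumptions.
ae = AutoEncoder.from_pretrained("path/to/ae.pt", device="cpu")
x = torch.randn(8, ae.activation_dim)  # stand-in batch of model activations
f = ae.encode(x)                       # sparse feature activations
x_hat = ae.decode(f)                   # reconstruction of x
```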
Unknown
- Merge pull request #37 from chanind/pypi-package
  feat: pypi packaging and auto-release with semantic release (a711efe)
- Simplify matryoshka loss (43421f5)
- Use torch.split() instead of direct indexing for 25% speedup (505a445)
- Fix matryoshka spelling (aa45bf6)
- Fix incorrect auxk logging name (784a62a)
- Add citation (77f2690)
- Make sure to detach reconstruction before calculating aux loss (db2b564)
- Merge pull request #36 from saprmarks/aux_loss_fixes
  Aux loss fixes, standardize decoder normalization (34eefda)
- Standardize and fix topk auxk loss implementation (0af1971)
- Normalize decoder after optimizer step (200ed3b)
- Remove experimental matryoshka temperature (6c2fcfc)
- Make sure x is on the correct dtype for jumprelu when logging (c697d0f)
- Import trainers from correct relative location for submodule use (8363ff7)
- By default, don't normalize Gated activations during inference (52b0c54)
- Also update context manager for matryoshka threshold (65e7af8)
- Disable autocast for threshold tracking (17aa5d5)
- Add torch autocast to training loop (832f4a3)
- Save state dicts to cpu (3c5a5cd)
- Add an option to pass LR to TopK trainers (8316a44)
- Add April Update Standard Trainer (cfb36ff)
- Merge pull request #35 from saprmarks/code_cleanup
  Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (f19db98)
- Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (9751c57)
- Merge pull request #34 from adamkarvonen/matroyshka
  Add Matryoshka, fix JumpReLU training, modify initialization (92648d4)
- Add a verbose option during training (0ff687b)
- Prevent wandb cuda multiprocessing errors (370272a)
- Log dead features for batch top k SAEs (936a69c)
- Log number of dead features to wandb (77da794)
- Add trainer number to wandb name (3b03b92)
- Add notes (810dbb8)
- Add option to ignore bos tokens (c2fe5b8)
- Fix jumprelu training (ec961ac)
- Use kaiming initialization if specified in paper, fix batch_top_k aux_k_alpha (8eaa8b2)
- Format with ruff (3e31571)
- Add temperature scaling to matryoshka (ceabbc5)
- Norm the correct decoder dimension (5383603)
- Fix loading matryoshkas from_pretrained() (764d4ac)
- Initial matryoshka implementation (8ade55b)
- Make sure we step the learning rate scheduler (1df47d8)
- Merge pull request #33 from saprmarks/lr_scheduling
  Lr scheduling (316dbbe)
- Properly set new parameters in end to end test (e00fd64)
- Standardize learning rate and sparsity schedules (a2d6c43)
- Merge pull request #32 from saprmarks/add_sparsity_warmup
  Add sparsity warmup (a11670f)
- Add sparsity warmup for trainers with a sparsity penalty (911b958)
- Clean up lr decay (e0db40b)
- Track lr decay implementation (f0bb66d)
- Remove leftover variable, update expected results with standard SAE improvements (9687bb9)
- Merge pull request #31 from saprmarks/add_demo
  Add option to normalize dataset, track thresholds for TopK SAEs, Fix Standard SAE (67a7857)
- Also scale topk thresholds when scaling biases (efd76b1)
- Use the correct standard SAE reconstruction loss, initialize W_dec to W_enc.T (8b95ec9)
- Add bias scaling to topk saes (484ca01)
- Fix topk bfloat16 dtype error (488a154)
- Add option to normalize dataset activations (81968f2)
- Remove demo script and graphing notebook (57f451b)
- Track thresholds for topk and batchtopk during training (b5821fd)
- Track threshold for batchtopk, rename for consistency (32d198f)
- Modularize demo script (dcc02f0)
- Begin creation of demo script (712eb98)
- Fix JumpReLU training and loading (552a8c2)
- Ensure activation buffer has the correct dtype (d416eab)
- Merge pull request #30 from adamkarvonen/add_tests
  Add end to end test...