feat: Sparse probing eval based on "Are SAEs useful" paper#79

Merged
adamkarvonen merged 8 commits into adamkarvonen:main from chanind:sae-probes-sparse-probing on Dec 30, 2025

chanind commented Oct 4, 2025

This PR adds a sparse-probing eval called sparse_probing_sae_probes, named to keep it separate from the original sparse-probing eval in SAEBench. The eval is based on the SAE-Probes paper, "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing". The sparse-probing tasks from this paper have two benefits:

  • Lots of datasets: The SAE-Probes paper evaluates on over 140 sparse probing datasets.
  • Cross-validated probing: The SAE-Probes paper optimizes its probing heavily, which gives a stronger, more realistic baseline to compare against.

This benchmark wraps the standalone sae-probes package, putting results in SAEBench format.
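
For readers unfamiliar with the technique, here is a rough, dependency-free sketch of what cross-validated sparse probing can look like. It is not the sae-probes API or the SAEBench integration: every function name, the class-mean-difference selection rule, the L2 grid, and the synthetic data are assumptions made for illustration only.

```python
import math
import random


def top_k_latents(X, y, k):
    """Rank latents by |mean(class 1) - mean(class 0)| and return the top-k indices."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        mean_pos = sum(r[j] for r in pos) / len(pos)
        mean_neg = sum(r[j] for r in neg) / len(neg)
        scores.append(abs(mean_pos - mean_neg))
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]


def fit_logreg(X, y, l2, steps=300, lr=0.5):
    """Logistic regression by batch gradient descent with an L2 penalty on the weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, row))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - label
            for j in range(d):
                gw[j] += err * row[j]
            gb += err
        w = [wi - lr * (g / n + l2 * wi) for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def accuracy(w, b, X, y):
    preds = [1 if b + sum(wi * xi for wi, xi in zip(w, row)) > 0 else 0 for row in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)


def cross_validated_probe(X, y, k, l2_grid=(0.0, 0.01, 0.1), n_folds=4, seed=0):
    """Choose the L2 strength by k-fold CV, then refit on all data.

    Latents are re-selected inside each fold so held-out rows never
    influence which latents the probe is allowed to see.
    """
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    best_l2, best_acc = None, -1.0
    for l2 in l2_grid:
        accs = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in idx if i not in held]
            X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
            latents = top_k_latents(X_tr, y_tr, k)
            w, b = fit_logreg([[r[j] for j in latents] for r in X_tr], y_tr, l2)
            X_te = [[X[i][j] for j in latents] for i in held_out]
            accs.append(accuracy(w, b, X_te, [y[i] for i in held_out]))
        mean_acc = sum(accs) / len(accs)
        if mean_acc > best_acc:
            best_l2, best_acc = l2, mean_acc
    latents = top_k_latents(X, y, k)
    w, b = fit_logreg([[r[j] for j in latents] for r in X], y, best_l2)
    return latents, w, b, best_l2, best_acc


# Synthetic stand-in for SAE activations: 8 non-negative latents, only latent 3 carries the label.
rng = random.Random(0)
y = [i % 2 for i in range(80)]
X = [[abs(rng.gauss(0.0, 0.2)) + (2.0 * label if j == 3 else 0.0) for j in range(8)]
     for label in y]

latents, w, b, best_l2, best_acc = cross_validated_probe(X, y, k=2)
print("selected latents:", latents, "best l2:", best_l2, "cv accuracy:", best_acc)
```

A real run would swap the hand-rolled pieces for a tested library and use actual SAE activations. The point of the sketch is only the shape of the procedure: select a few latents, fit a small probe, and pick hyperparameters on held-out folds.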

chanind marked this pull request as draft October 12, 2025 23:04
chanind marked this pull request as ready for review December 23, 2025 20:58
chanind changed the title from "Sparse probing eval based on "Are SAEs useful" paper" to "feat: Sparse probing eval based on "Are SAEs useful" paper" Dec 23, 2025
chanind marked this pull request as draft December 23, 2025 22:28
chanind marked this pull request as ready for review December 24, 2025 00:39
adamkarvonen merged commit c1a27da into adamkarvonen:main Dec 30, 2025
4 checks passed