Skip to content

Fix fragment match double counting#183

Open
JemmaLDaniel wants to merge 2 commits intofeat-refactor-calibration-featuresfrom
fix-fragment-match-double-counting
Open

Fix fragment match double counting#183
JemmaLDaniel wants to merge 2 commits intofeat-refactor-calibration-featuresfrom
fix-fragment-match-double-counting

Conversation

@JemmaLDaniel
Copy link
Copy Markdown
Collaborator

Fix fragment ion double-counting in peak matching

Summary

Fixes a bug in find_matching_ions where an observed peak that was already matched as part of one theoretical ion's isotopic envelope could be re-matched as the M0 peak of a different theoretical ion. This inflated both ion_matches and ion_match_intensity, producing overly optimistic match rates.

The fix tracks matched observed peak indices in a set and skips any peak that has already been assigned, ensuring each observed peak contributes to at most one theoretical ion match.

Changes

  • winnow/calibration/features/utils.pyfind_matching_ions now maintains a matched_indices set of already-assigned observed peak indices. Before matching an M0 peak, the function checks whether that index has already been claimed (either as a prior M0 or as part of a prior isotopic envelope). Isotopic envelope peaks are also added to the set when matched.
  • tests/calibration/features/test_utils.py — adds targeted test cases that construct spectra where the same observed peak could be matched by multiple theoretical ions, verifying that only the first match is counted.

@JemmaLDaniel JemmaLDaniel self-assigned this Apr 10, 2026
@JemmaLDaniel JemmaLDaniel added the bug Something isn't working label Apr 10, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 10, 2026

Coverage

Coverage Report
FileStmtsMissCoverMissing
__init__.py00100% 
data_types.py40100% 
calibration
   __init__.py00100% 
   calibration_features.py50100% 
   calibrator.py911583%69–70, 72, 106–109, 134–135, 137, 162–163, 167, 194–195
calibration/features
   __init__.py100100% 
   base.py80100% 
   beam.py470100% 
   chimeric.py77198%198
   constants.py10100% 
   fragment_match.py73198%190
   mass_error.py28292%71, 75
   retention_time.py77198%160
   sequence.py190100% 
   token_score.py37197%82
   utils.py130199%215
compat
   __init__.py00100% 
   instanovo.py10640%12, 14–15, 17, 24–25
datasets
   __init__.py00100% 
   calibration_dataset.py1091784%155, 169, 171, 173, 183, 196, 249, 251–252, 258–261, 263–266
   data_loaders.py2701494%23, 189, 220–221, 414, 455, 847, 851, 900, 911, 1023–1024, 1052–1053
   interfaces.py30100% 
   psm_dataset.py250100% 
fdr
   __init__.py00100% 
   base.py581574%81, 85–86, 91, 98–99, 105, 126, 129–130, 135, 137–138, 144, 186
   database_grounded.py28196%52
   nonparametric.py25484%62, 68–69, 72
scripts
   __init__.py00100% 
   main.py1851850%8, 10–13, 16–20, 23–24, 26–28, 32, 39, 44, 47, 53, 55–56, 59, 68, 76, 79, 86, 88–90, 92, 94–99, 102, 104–105, 110, 125, 128, 135–141, 144–145, 148, 161–163, 166, 169, 174, 176–178, 180, 182–183, 186–187, 190, 192–193, 195, 197, 199–200, 202, 205–206, 209–210, 213–214, 217–219, 221, 224, 238–240, 242, 244, 249, 251–253, 255–256, 258–260, 265–266, 268–270, 272, 274, 276–277, 281–284, 286–287, 289–290, 292–293, 295, 298, 312–314, 317, 320, 325, 327–329, 331–333, 335–336, 339–340, 343, 345–346, 348, 350, 352–353, 355, 358–359, 365–366, 369–370, 373–374, 377–378, 386–388, 392, 395, 399, 402, 425, 438–439, 442, 464, 476–477, 480, 505, 518–519, 522, 537, 549–550, 553, 565, 577–578, 581, 596, 608–609
utils
   __init__.py40100% 
   config_formatter.py534024%29, 37–38, 40–42, 44, 55, 58–60, 62–63, 66–69, 72–74, 77–78, 80, 91, 102, 113, 127–128, 130–132, 145–147, 150, 153–154, 157–158, 160
   config_path.py76593%24–26, 117–118
   peptide.py160100% 
TOTAL146930978% 

Tests Skipped Failures Errors Time
320 0 💤 0 ❌ 0 🔥 35.884s ⏱️

@JemmaLDaniel JemmaLDaniel force-pushed the fix-fragment-match-double-counting branch from 8e9a314 to f49db77 Compare April 10, 2026 14:36
@JemmaLDaniel JemmaLDaniel force-pushed the feat-refactor-calibration-features branch from f0af123 to 3a65f83 Compare April 10, 2026 15:41
@JemmaLDaniel JemmaLDaniel force-pushed the fix-fragment-match-double-counting branch from f49db77 to 7b2daac Compare April 10, 2026 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant