Feat isotope-corrected mass error feature #187

Open
JemmaLDaniel wants to merge 3 commits into feat-refactor-calibration-features from feat-isotope-corrected-mass-error-feature

@JemmaLDaniel
Collaborator

Switch mass error feature to isotope-corrected signed ppm

Summary

Changes MassErrorFeature from computing a raw Dalton mass error to computing a signed mass error in parts per million (ppm) with isotope peak correction. This addresses two limitations of the previous implementation:

  1. Isotope peak selection -- when the instrument selects the M+1 or M+2 precursor isotope peak instead of the monoisotopic peak, the old Da error is off by roughly 1.003 Da (M+1) or 2.007 Da (M+2), which encourages the calibrator to penalise correct PSMs. The new implementation evaluates each isotope offset in isotope_error_range (configurable, default (0, 1)) and keeps the signed error with the smallest absolute value.

  2. Size normalisation -- a 0.01 Da error means very different things for a 500 Da peptide (20 ppm) vs a 3000 Da peptide (3.3 ppm). The ppm unit normalises for peptide size, giving the calibrator a scale-invariant error measure.
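To put rough numbers on both points, here is a small sketch that converts a Dalton error to ppm. The helper name `da_to_ppm` is hypothetical and not part of winnow, and the 1500 Da peptide is an arbitrary example.

```python
def da_to_ppm(error_da: float, mass_da: float) -> float:
    """Express an absolute mass error in Da as a relative error in ppm."""
    return error_da / mass_da * 1e6


# The same 0.01 Da error on peptides of different size.
small = da_to_ppm(0.01, 500.0)    # about 20 ppm
large = da_to_ppm(0.01, 3000.0)   # about 3.3 ppm

# An uncorrected M+1 selection (one isotope step, 1.00335 Da)
# on a 1500 Da peptide dwarfs any realistic instrument error.
isotope_miss = da_to_ppm(1.00335, 1500.0)  # about 669 ppm
```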

The formula follows the convention used by InstaNovo:

ppm = (mz_theoretical - (mz_measured - isotope * 1.00335 / z)) / mz_measured * 1e6
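A minimal sketch of how this formula combines with the isotope selection, not the actual `MassErrorFeature` code. The function names are hypothetical, the constant is redefined locally with the value from the formula, and treating `isotope_error_range` as inclusive at both ends is an assumption.

```python
from typing import Tuple

# Local stand-in for CARBON_ISOTOPE_MASS_SHIFT; value taken from the formula above.
CARBON_ISOTOPE_MASS_SHIFT = 1.00335


def signed_ppm_error(
    mz_theoretical: float, mz_measured: float, charge: int, isotope: int = 0
) -> float:
    """Signed ppm error after shifting the measured m/z back by `isotope` peaks."""
    corrected_mz = mz_measured - isotope * CARBON_ISOTOPE_MASS_SHIFT / charge
    return (mz_theoretical - corrected_mz) / mz_measured * 1e6


def best_isotope_ppm_error(
    mz_theoretical: float,
    mz_measured: float,
    charge: int,
    isotope_error_range: Tuple[int, int] = (0, 1),
) -> float:
    """Return the signed ppm error whose absolute value is smallest.

    Assumption: both ends of `isotope_error_range` are inclusive.
    """
    low, high = isotope_error_range
    candidates = [
        signed_ppm_error(mz_theoretical, mz_measured, charge, isotope)
        for isotope in range(low, high + 1)
    ]
    # min(..., key=abs) picks the closest match but keeps its sign.
    return min(candidates, key=abs)


# Precursor picked on the M+1 peak of a doubly charged ion.
m1_selected = best_isotope_ppm_error(500.0, 500.0 + 1.00335 / 2, charge=2)

# Monoisotopic pick, measured slightly high: the error stays negative.
measured_high = best_isotope_ppm_error(500.0, 500.01, charge=2)
```

In the first case the offset of 1 brings the error back to roughly 0 ppm, whereas the uncorrected value would be about -1000 ppm. In the second case the offset of 0 wins and the result is about -20 ppm.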

Changes

  • winnow/calibration/features/mass_error.py:
    • Constructor now accepts isotope_error_range: Tuple[int, int] = (0, 1).
    • Computation works in m/z space instead of mass space.
    • For each isotope offset, computes signed ppm error; selects the offset with the smallest absolute value.
    • Output column renamed from mass_error to mass_error_ppm.
    • Reuses CARBON_ISOTOPE_MASS_SHIFT from constants.py.
  • tests/calibration/features/test_mass_error.py -- rewritten with tests for: exact monoisotopic match (~0 ppm), formula verification, isotope correction selecting the best offset, sign preservation, invalid peptide handling, and custom residue masses.
  • docs/api/features/mass_error.md -- updated to document the new formula, isotope correction, ppm output, and isotope_error_range parameter.
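For readers unfamiliar with the move from mass space to m/z space, the sketch below shows one way a theoretical m/z can be derived from residue masses. It is an illustration only: `theoretical_mz`, `EXAMPLE_RESIDUE_MASSES`, and the NaN return for unknown residues are assumptions, not the behaviour of mass_error.py.

```python
import math
from typing import Dict, Sequence

PROTON_MASS = 1.007276   # Da, monoisotopic
WATER_MASS = 18.010565   # Da, monoisotopic

# Tiny example table; a custom mapping can be passed in its place.
EXAMPLE_RESIDUE_MASSES: Dict[str, float] = {"G": 57.021464, "A": 71.037114}


def theoretical_mz(
    residues: Sequence[str], charge: int, residue_masses: Dict[str, float]
) -> float:
    """Theoretical precursor m/z of a peptide at the given charge.

    Assumption: an unknown residue yields NaN instead of raising.
    """
    try:
        neutral_mass = sum(residue_masses[r] for r in residues) + WATER_MASS
    except KeyError:
        return math.nan
    # Add one proton per charge, then divide by the charge.
    return (neutral_mass + charge * PROTON_MASS) / charge


mz_z1 = theoretical_mz(["G", "A"], 1, EXAMPLE_RESIDUE_MASSES)
mz_z2 = theoretical_mz(["G", "A"], 2, EXAMPLE_RESIDUE_MASSES)
mz_bad = theoretical_mz(["X"], 2, EXAMPLE_RESIDUE_MASSES)
```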

@JemmaLDaniel JemmaLDaniel self-assigned this Apr 10, 2026
@JemmaLDaniel JemmaLDaniel added the enhancement New feature or request label Apr 10, 2026
@github-actions

Coverage

Coverage Report
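| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| __init__.py | 0 | 0 | 100% | |
| data_types.py | 4 | 0 | 100% | |
| calibration | | | | |
| &nbsp;&nbsp;&nbsp;__init__.py | 0 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;calibration_features.py | 5 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;calibrator.py | 91 | 15 | 83% | 69–70, 72, 106–109, 134–135, 137, 162–163, 167, 194–195 |
| calibration/features | | | | |
| &nbsp;&nbsp;&nbsp;__init__.py | 10 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;base.py | 8 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;beam.py | 47 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;chimeric.py | 77 | 1 | 98% | 198 |
| &nbsp;&nbsp;&nbsp;constants.py | 1 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;fragment_match.py | 73 | 1 | 98% | 190 |
| &nbsp;&nbsp;&nbsp;mass_error.py | 43 | 2 | 95% | 85, 89 |
| &nbsp;&nbsp;&nbsp;retention_time.py | 77 | 1 | 98% | 160 |
| &nbsp;&nbsp;&nbsp;sequence.py | 19 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;token_score.py | 37 | 1 | 97% | 82 |
| &nbsp;&nbsp;&nbsp;utils.py | 114 | 2 | 98% | 197, 267 |
| compat | | | | |
| &nbsp;&nbsp;&nbsp;__init__.py | 0 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;instanovo.py | 10 | 6 | 40% | 12, 14–15, 17, 24–25 |
| datasets | | | | |
| &nbsp;&nbsp;&nbsp;__init__.py | 0 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;calibration_dataset.py | 109 | 17 | 84% | 155, 169, 171, 173, 183, 196, 249, 251–252, 258–261, 263–266 |
| &nbsp;&nbsp;&nbsp;data_loaders.py | 270 | 14 | 94% | 23, 189, 220–221, 414, 455, 847, 851, 900, 911, 1023–1024, 1052–1053 |
| &nbsp;&nbsp;&nbsp;interfaces.py | 3 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;psm_dataset.py | 25 | 0 | 100% | |
| fdr | | | | |
| &nbsp;&nbsp;&nbsp;__init__.py | 0 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;base.py | 58 | 15 | 74% | 81, 85–86, 91, 98–99, 105, 126, 129–130, 135, 137–138, 144, 186 |
| &nbsp;&nbsp;&nbsp;database_grounded.py | 28 | 1 | 96% | 52 |
| &nbsp;&nbsp;&nbsp;nonparametric.py | 25 | 4 | 84% | 62, 68–69, 72 |
| scripts | | | | |
| &nbsp;&nbsp;&nbsp;__init__.py | 0 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;main.py | 185 | 185 | 0% | 8, 10–13, 16–20, 23–24, 26–28, 32, 39, 44, 47, 53, 55–56, 59, 68, 76, 79, 86, 88–90, 92, 94–99, 102, 104–105, 110, 125, 128, 135–141, 144–145, 148, 161–163, 166, 169, 174, 176–178, 180, 182–183, 186–187, 190, 192–193, 195, 197, 199–200, 202, 205–206, 209–210, 213–214, 217–219, 221, 224, 238–240, 242, 244, 249, 251–253, 255–256, 258–260, 265–266, 268–270, 272, 274, 276–277, 281–284, 286–287, 289–290, 292–293, 295, 298, 312–314, 317, 320, 325, 327–329, 331–333, 335–336, 339–340, 343, 345–346, 348, 350, 352–353, 355, 358–359, 365–366, 369–370, 373–374, 377–378, 386–388, 392, 395, 399, 402, 425, 438–439, 442, 464, 476–477, 480, 505, 518–519, 522, 537, 549–550, 553, 565, 577–578, 581, 596, 608–609 |
| utils | | | | |
| &nbsp;&nbsp;&nbsp;__init__.py | 4 | 0 | 100% | |
| &nbsp;&nbsp;&nbsp;config_formatter.py | 53 | 40 | 24% | 29, 37–38, 40–42, 44, 55, 58–60, 62–63, 66–69, 72–74, 77–78, 80, 91, 102, 113, 127–128, 130–132, 145–147, 150, 153–154, 157–158, 160 |
| &nbsp;&nbsp;&nbsp;config_path.py | 76 | 5 | 93% | 24–26, 117–118 |
| &nbsp;&nbsp;&nbsp;peptide.py | 16 | 0 | 100% | |
| TOTAL | 1468 | 310 | 78% | |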

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 321 | 0 💤 | 0 ❌ | 0 🔥 | 36.070s ⏱️ |
