Description of feature
Is your feature request related to a problem?
The fastquorum pipeline currently supports duplex sequencing (via CallDuplexConsensusReads) and simplex/standard UMI sequencing (via CallMolecularConsensusReads). However, it does not support CODEC (Concatenating Original Duplex for Error Correction) sequencing, a method reported to achieve 1,000-fold higher accuracy than standard NGS by physically linking the Watson and Crick strands of each molecule (https://pubmed.ncbi.nlm.nih.gov/37106072/).
fgbio v3.0+ includes CallCodecConsensusReads specifically for CODEC data, but fastquorum does not yet use this tool.
Describe the solution you'd like
Add CODEC sequencing support to fastquorum, including:
- New consensus calling pathway: Add conditional logic to invoke fgbio CallCodecConsensusReads when --protocol codec (or a similar parameter) is specified
- Read structure support: Document the expected CODEC read structure (typically 3M2S+T 3M2S+T, i.e. a 3 bp UMI, a 2 bp skip, then template, for both reads)
- GroupReadsByUmi configuration: CODEC requires --strategy adjacency or --strategy identity (NOT paired), per the CallCodecConsensusReads documentation (http://fulcrumgenomics.github.io/fgbio/tools/latest/CallCodecConsensusReads.html)
- CODEC-specific metrics: Consider adding appropriate QC metrics for CODEC consensus generation
- Test data: Add CODEC test data to nf-core/test-datasets (may need to be synthetic/simulated given data availability constraints)
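To make the first and third bullets concrete, here is a minimal sketch of the dispatch and validation logic, written in Python for illustration rather than as pipeline code. The protocol keys (`simplex`, `duplex`, `codec`), the `resolve` helper, the default strategies, and the set of strategies accepted for simplex are all assumptions of this sketch. Only the tool names and the CODEC strategy restriction come from the request above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProtocolPlan:
    consensus_tool: str                   # fgbio tool used for consensus calling
    allowed_strategies: Tuple[str, ...]   # accepted GroupReadsByUmi --strategy values
    default_strategy: str                 # assumed default, not taken from fastquorum


# Hypothetical protocol keys; the simplex strategy list is an assumption.
PLANS = {
    "simplex": ProtocolPlan(
        "CallMolecularConsensusReads", ("adjacency", "identity", "edit"), "adjacency"
    ),
    "duplex": ProtocolPlan("CallDuplexConsensusReads", ("paired",), "paired"),
    # CODEC: adjacency or identity, NOT paired (per the request above).
    "codec": ProtocolPlan(
        "CallCodecConsensusReads", ("adjacency", "identity"), "adjacency"
    ),
}


def resolve(protocol: str, strategy: Optional[str] = None) -> Tuple[str, str]:
    """Return (consensus tool, grouping strategy), or raise ValueError."""
    if protocol not in PLANS:
        raise ValueError(f"unknown protocol: {protocol!r}")
    plan = PLANS[protocol]
    chosen = strategy if strategy is not None else plan.default_strategy
    if chosen not in plan.allowed_strategies:
        raise ValueError(
            f"--strategy {chosen} is not valid for protocol {protocol!r}; "
            f"expected one of {', '.join(plan.allowed_strategies)}"
        )
    return plan.consensus_tool, chosen


if __name__ == "__main__":
    print(resolve("codec"))
    print(resolve("duplex"))
```

Failing early on `--protocol codec` combined with `--strategy paired` would surface the misconfiguration at parameter validation time instead of partway through a run. In the pipeline itself the equivalent check would presumably sit in the parameter validation or workflow logic.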
Implementation notes
Additional context
CODEC sequencing is gaining adoption for high-accuracy applications including:
- Rare mutation detection
- Liquid biopsy analysis
- Microsatellite instability (MSI) detection
- Clonal hematopoiesis analysis
Related resources:
Sources: