
ZFIN-10010: Fix duplicate genome location rows on mapping detail page #1803

Open
rtaylorzfin wants to merge 3 commits into ZFIN:main from rtaylorzfin:zfin-10010


@rtaylorzfin
Collaborator

The mapping detail page showed duplicate rows in the genome browser table due to two issues:

  1. markerAssemblyUpdate.sql inserted NCBILoader and ZFIN rows without checking for existing records, creating true duplicates across runs. Added NOT EXISTS guards to both inserts.

  2. The UCSC loader creates one row per RefSeq accession per gene. These rows appear identical on-screen because the accession isn't displayed. Added deduplication by (entity, chromosome, start, end, source, assembly) in MappingService.sortAndFilterGenomeBrowserLocations().
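
For reviewers, the display-level deduplication in item 2 can be sketched roughly as below. This is a minimal illustration, not the code in MappingService: the `Location` and `DisplayKey` records, their field names, and the `dedupe` helper are all hypothetical, and keeping the first row seen for each key is an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class GenomeLocationDedup {

    // Hypothetical stand-in for one genome browser location row.
    // Field names are assumptions, not the real entity's properties.
    record Location(String entityId, String chromosome, int start, int end,
                    String source, String assembly, String accession) {}

    // Key built only from the fields named in the description.
    // The accession is deliberately left out because it is not displayed.
    record DisplayKey(String entityId, String chromosome, int start, int end,
                      String source, String assembly) {
        static DisplayKey of(Location loc) {
            return new DisplayKey(loc.entityId(), loc.chromosome(), loc.start(),
                                  loc.end(), loc.source(), loc.assembly());
        }
    }

    // Keeps the first row for each display key and preserves input order.
    static List<Location> dedupe(List<Location> locations) {
        Set<DisplayKey> seen = new HashSet<>();
        List<Location> result = new ArrayList<>();
        for (Location loc : locations) {
            // Set.add returns false when the key was already present.
            if (seen.add(DisplayKey.of(loc))) {
                result.add(loc);
            }
        }
        return result;
    }
}
```

With this shape, two UCSC rows that differ only in accession collapse to one, while a row from a different source at the same coordinates is kept.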

Includes a Liquibase cleanup script to remove existing NCBILoader and ZFIN duplicate rows.

Add NOT EXISTS guard to NCBIStartEnd.sql add path to prevent inserting
rows when the gene/accession already has an NCBIStartEndLoader record.

Expand cleanup script to also remove NCBIStartEndLoader duplicates, and
group by all non-PK columns to ensure only truly identical rows are
removed.
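
The "group by all non-PK columns" rule can be illustrated with a small in-memory sketch. It is not the Liquibase script itself: the `Row` and `NonPk` records and their columns are hypothetical, and keeping the row with the lowest id in each group is an assumption about which copy survives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DuplicateRowCleanup {

    // Hypothetical row: id is the primary key, the remaining fields stand in
    // for "all non-PK columns" of the generated location table.
    record Row(long id, String featureId, String chromosome, int start, int end,
               String source, String assembly, String accession) {}

    // Grouping key made of every non-PK column, so only truly identical rows match.
    record NonPk(String featureId, String chromosome, int start, int end,
                 String source, String assembly, String accession) {
        static NonPk of(Row r) {
            return new NonPk(r.featureId(), r.chromosome(), r.start(), r.end(),
                             r.source(), r.assembly(), r.accession());
        }
    }

    // Returns the ids to delete, keeping the lowest id in each group (assumption).
    static List<Long> idsToDelete(List<Row> rows) {
        Map<NonPk, Long> keeper = new HashMap<>();
        List<Long> toDelete = new ArrayList<>();
        for (Row r : rows) {
            NonPk key = NonPk.of(r);
            Long kept = keeper.get(key);
            if (kept == null) {
                keeper.put(key, r.id());      // first row seen for this group
            } else if (r.id() < kept) {
                toDelete.add(kept);           // the earlier keeper loses to a lower id
                keeper.put(key, r.id());
            } else {
                toDelete.add(r.id());         // plain duplicate of the keeper
            }
        }
        Collections.sort(toDelete);
        return toDelete;
    }
}
```

Because the accession is part of the key here, rows that differ only in accession are not treated as duplicates, which matches the intent of removing only identical rows.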
@rtaylorzfin
Collaborator Author

Can we add a uniqueness constraint here?

@rtaylorzfin rtaylorzfin marked this pull request as ready for review April 7, 2026 00:22
…me_location_generated

Clean up duplicate rows in sequence_feature_chromosome_location_generated and add a unique constraint to prevent future duplicates.