
ZFIN-10166: Marker assembly update fix #1743

Open
rtaylorzfin wants to merge 2 commits into ZFIN:main from rtaylorzfin:marker-assembly-update-fix

rtaylorzfin (Collaborator) commented Feb 27, 2026

Improvements to GeneID matching and data consistency (markerAssemblyUpdate.sql):

Regular Expression Updates for GeneID Matching:

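  • Simplified and unified the regular expressions in multiple queries to use regexp_like(..., 'GeneID:' || accession || '(,|$)'). The trailing group requires the accession to be followed by a comma or the end of the string, so a GeneID can no longer match as a prefix of a longer one (for example, GeneID:123 inside GeneID:1234). The previous wildcard patterns were unnecessary and have been removed.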
  • Updated the GFF3 attribute matching to use regexp_like(gff_attributes, 'GeneID:' || db.dblink_acc_num || '(,|;|$)'), improving the specificity of the match and handling both comma and semicolon delimiters.
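
As a rough illustration of why the trailing delimiter group matters, here is a minimal Python sketch that mirrors the GFF3 pattern with the `re` module. It is not the PostgreSQL code from markerAssemblyUpdate.sql: the helper names and the attribute strings are made up for the example, and `re.escape` is an extra precaution that the PR description does not mention.

```python
import re


def gene_id_pattern(accession: str) -> re.Pattern:
    # Mirrors 'GeneID:' || accession || '(,|;|$)' from the description.
    # re.escape is an assumption added here; the SQL concatenates the raw value.
    return re.compile("GeneID:" + re.escape(accession) + "(,|;|$)")


def has_gene_id(attributes: str, accession: str) -> bool:
    # True only when the accession is followed by ',', ';' or end of string.
    return gene_id_pattern(accession).search(attributes) is not None


# Hypothetical attribute strings, not taken from real ZFIN data.
attrs_semicolon = "ID=gene0;Dbxref=GeneID:1234;Name=abc"
attrs_comma = "Dbxref=GeneID:1234,OtherDB:xyz"
attrs_end = "Dbxref=OtherDB:xyz,GeneID:1234"

for attrs in (attrs_semicolon, attrs_comma, attrs_end):
    print(has_gene_id(attrs, "1234"), has_gene_id(attrs, "123"))
```

Without the delimiter group, a plain substring test such as `"GeneID:123" in attrs_semicolon` would succeed, which is the kind of false positive the anchored pattern is meant to rule out.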

Table Usage and Data Consistency:

  • Modified the subquery for checking existing records to use the marker_assembly table instead of sequence_feature_chromosome_location_generated, ensuring consistency with the intended data model and preventing duplicate entries.

More Details

  • The temp_new_gene query was filtering on sfclg (location records) instead
    of marker_assembly, causing genes with existing locations but missing
    assembly records to be skipped. This required a second cleanup pass to
    catch them. Fix the upstream filter to check marker_assembly directly,
    simplify regex patterns, and remove the now-unnecessary second pass.
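
The effect of filtering on the wrong table can be sketched with a toy model. The example below uses an in-memory SQLite database from Python; the table layouts, the `gene_id` column, and the `location` table (standing in for sequence_feature_chromosome_location_generated) are hypothetical simplifications, not the actual ZFIN schema or the query in markerAssemblyUpdate.sql.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE gene (gene_id TEXT PRIMARY KEY);
    CREATE TABLE location (gene_id TEXT);         -- stand-in for sfclg
    CREATE TABLE marker_assembly (gene_id TEXT);  -- hypothetical layout
    INSERT INTO gene VALUES ('g1'), ('g2'), ('g3');
    INSERT INTO location VALUES ('g1'), ('g2');   -- g2 has a location ...
    INSERT INTO marker_assembly VALUES ('g1');    -- ... but no assembly record
    """
)


def genes_missing_from(table: str) -> list[str]:
    # Only two fixed table names are allowed, so the f-string is safe here.
    if table not in ("location", "marker_assembly"):
        raise ValueError(f"unexpected table: {table}")
    rows = conn.execute(
        f"""
        SELECT g.gene_id FROM gene g
        WHERE NOT EXISTS (SELECT 1 FROM {table} t WHERE t.gene_id = g.gene_id)
        ORDER BY g.gene_id
        """
    ).fetchall()
    return [row[0] for row in rows]


old_candidates = genes_missing_from("location")         # filter on locations
new_candidates = genes_missing_from("marker_assembly")  # filter on assemblies
print(old_candidates, new_candidates)
```

In this toy data, the location-based filter never selects `g2`, because it already has a location row, while the marker_assembly-based filter picks it up in the first pass. That is the situation the second cleanup pass previously had to handle.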


rtaylorzfin changed the title from "Marker assembly update fix" to "ZFIN-10166: Marker assembly update fix" on Mar 3, 2026
rtaylorzfin reopened this on Mar 5, 2026
rtaylorzfin marked this pull request as draft on March 5, 2026 18:08
rtaylorzfin marked this pull request as ready for review on March 31, 2026 22:23