pileup reloads BAI index #1377

@suhrig

Hi,

I use the pileup() function in a loop to generate pileups over a number of genomic ranges. pysam appears to reload the BAM index on every call to pileup(), even though I'm using the same AlignmentFile object. Is this intended behavior? It creates considerable overhead.

Steps to reproduce:

  1. Create a minimal example Python script:

         #!/usr/bin/env python3
         import pysam
         bam = pysam.AlignmentFile("input.bam", "rb")
         for i in range(100):
             for pileupcolumn in bam.pileup("1", 1000, 1010):
                 pass
         bam.close()

  2. Run the script with strace to log file access:

         strace -o strace.log ./example.py

  3. Count how often it opened the BAI file:

         grep -c 'openat.*[.]bai' strace.log

Result: 101 times
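
In case it helps to repeat the count from a script rather than with grep, here is a minimal Python sketch of the counting step. It assumes the log has the usual one-syscall-per-line strace layout; `count_bai_opens` is a hypothetical helper name of my own, not part of pysam.

```python
import re

# Same pattern as the grep call in the counting step above.
BAI_OPEN = re.compile(r"openat.*[.]bai")


def count_bai_opens(lines):
    """Count the lines that match the pattern, like `grep -c` does.

    `lines` can be any iterable of strings, for example an open file
    handle on the strace log.
    """
    return sum(1 for line in lines if BAI_OPEN.search(line))
```

To use it on the log from the strace step, something like `with open("strace.log") as log: print(count_bai_opens(log))` should do.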
