-
Notifications
You must be signed in to change notification settings - Fork 8
Open
Description
See also here
The following code from ctxcore leads to the perverse situation where setting auc_threshold to 1 will lead to an assertion error: if auc_threshold were 1, rank_cutoff will be equal to total_genes, but rank_threshold has been set to total_genes - 1. So rank_threshold can at most be total_genes - 1, but then in the final line, rank_cutoff is decremented by `. It seems that there was a double attempt to fix the 0- vs 1-indexing discrepancy between python and R, leading to an off-by-one error...
def derive_rank_cutoff(
auc_threshold: float, total_genes: int, rank_threshold: Optional[int] = None
) -> int:
"""
Get rank cutoff.
:param auc_threshold: The fraction of the ranked genome to take into account for
the calculation of the Area Under the recovery Curve.
:param total_genes: The total number of genes ranked.
:param rank_threshold: The total number of ranked genes to take into account when
creating a recovery curve.
:return Rank cutoff.
"""
if not rank_threshold:
rank_threshold = total_genes - 1
assert (
0 < rank_threshold < total_genes
), f"Rank threshold must be an integer between 1 and {total_genes:d}."
assert (
0.0 < auc_threshold <= 1.0
), "AUC threshold must be a fraction between 0.0 and 1.0."
# In the R implementation the cutoff is rounded.
rank_cutoff = int(round(auc_threshold * total_genes))
assert 0 < rank_cutoff <= rank_threshold, (
f"An AUC threshold of {auc_threshold:f} corresponds to {rank_cutoff:d} top "
f"ranked genes/regions in the database. Please increase the rank threshold "
"or decrease the AUC threshold."
)
# Make sure we have exactly the same AUC values as the R-SCENIC pipeline.
# In the latter the rank threshold is not included in AUC calculation.
rank_cutoff -= 1
return rank_cutoff
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels