scFM-eval

scFM-eval is a unified, reproducible computational framework for deploying, running, and evaluating single-cell foundation models (scFMs).
It is built on Nextflow DSL2 and provides standardized execution, containerized environments, and automated embedding inference across multiple scFM methods.

[2026.03.04] We released the fine-tuning implementation, primarily designed for data with discrete labels.

[2026.01.13] We released the few-shot learning implementation, primarily designed for data with discrete labels, and fixed several minor bugs in scPRINT deployment.


System Requirements

  • OS: Linux (linux/amd64)
  • GPU: NVIDIA GPU required
    • NVIDIA driver ≥ 525
  • Container runtime:
    • Docker or
    • Apptainer (formerly Singularity)
  • Nextflow:
    • Tested with Nextflow ≥ 25.10.0
    • Any version supporting DSL2 should work

Installation

1. Install Nextflow

Please follow the official instructions:
👉 https://github.com/nextflow-io/nextflow

After installation, verify:

nextflow -v

2. Download scFM-eval

git clone https://github.com/Svvord/scFM-eval.git

First-Time Setup (Required Once)

Step 1. Choose Your Container Backend

Open nextflow.config and select one container runtime:

  • Apptainer (Default; no changes needed unless you modified it before)

  • Singularity

singularity {
    enabled = true
    ...
}
docker {
    enabled = false
    ...
}
apptainer {
    enabled = false
    ...
}
  • Docker
docker {
    enabled = true
    ...
}
apptainer {
    enabled = false
    ...
}
singularity {
    enabled = false
    ...
}

⚠️ This only needs to be done once. Subsequent runs require no further configuration.


Step 2. Download Model Checkpoints

Pretrained model weights must be downloaded once before first use.

We provide a helper script download_model_weights.nf to fetch official checkpoints and place them in the correct directory structure.

Example: Download weights for scGPT

nextflow download_model_weights.nf --method scgpt

📌 Important notes:

  • You only need to download model weights once
  • Downloaded weights are cached locally and reused automatically
  • You may also manually place weights if you follow the same directory structure

Directory Structure Example (scGPT)

data/
└── model_weights/
    └── scGPT/
        └── scGPT_human/
  • The default scGPT version is scGPT_human
  • To specify this version explicitly in later runs:
--model "scGPT/scGPT_human"
  • If no version is specified, the framework will use the default pretrained model

Step 3. Input Data Preparation

scFM-eval accepts AnnData (.h5ad) files as the standard input format.

Required Data Format

  • Expression matrix:

    • Raw count matrix (not log-normalized)
    • Stored in adata.X
    • Must contain the full transcriptome
      • Do not subset to highly variable genes (HVGs)
  • Gene metadata (adata.var):

    • var.index should primarily use HGNC gene symbols
      • This is required for the majority of genes
    • Genes without an official HGNC symbol:
      • May use their Ensembl gene ID as a fallback identifier
      • This ensures all genes remain represented with a valid token
      • Most scFM methods rely on token-based gene matching and can accommodate this behavior
    • Required columns:
      • gene_symbol: gene identifier used by the model
        • HGNC gene symbol when available
        • Ensembl gene ID used as a fallback when no HGNC symbol exists
      • ensembl_id: corresponding Ensembl gene ID
  • Cell metadata (adata.obs):

    • Must contain a column named:
      • barcode: unique cell barcode identifier
    • In most cases, barcode can be a copy of adata.obs_names
    • All cell identifiers must be unique
      • If needed, ensure uniqueness by calling:
        adata.obs_names_make_unique()
      • Then populate:
        adata.obs["barcode"] = adata.obs_names
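The gene-token fallback and barcode-uniqueness rules above can be sketched in plain Python. This is an illustrative sketch, not the pipeline's code; the suffixing convention mirrors AnnData's obs_names_make_unique(), which keeps the first occurrence of a name and appends "-1", "-2", … to repeats.

```python
from collections import Counter

def make_unique(names):
    # Mimic AnnData's obs_names_make_unique(): the first occurrence of a
    # name is kept as-is; repeats get "-1", "-2", ... appended.
    seen = Counter()
    unique = []
    for name in names:
        unique.append(f"{name}-{seen[name]}" if seen[name] else name)
        seen[name] += 1
    return unique

def choose_gene_token(hgnc_symbol, ensembl_id):
    # Prefer the HGNC symbol; fall back to the Ensembl gene ID when no
    # official symbol exists, so every gene keeps a valid token.
    return hgnc_symbol if hgnc_symbol else ensembl_id

barcodes = make_unique(["AAACCTG", "AAACCTG", "TTTGGTT"])
```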

Data Preprocessing Policy

scFM-eval performs minimal preprocessing by design.

  • Users are expected to perform their own data quality control (QC) prior to input,
    such as:

    • Filtering low-quality cells
    • Doublet removal (optional)
  • Do NOT perform HVG selection

    • All scFM methods in this framework expect the full gene expression profile
    • Subsetting to HVGs may lead to:
      • Incompatible model inputs
      • Silent gene dropping
      • Degraded or misleading embeddings
  • Input data must preserve raw counts across the full transcriptome

Please refer to the provided example dataset:

data/demo/colon_1000.h5ad
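As a minimal illustration of the kind of upstream QC expected, the sketch below drops cells whose total count falls below a threshold. The 500-count cutoff is an arbitrary example value; in practice you would use scanpy's sc.pp.filter_cells or your own QC pipeline.

```python
def filter_low_count_cells(total_counts_per_cell, min_counts=500):
    # Keep the indices of cells whose total UMI count meets the threshold;
    # the default threshold here is an arbitrary example value.
    return [i for i, total in enumerate(total_counts_per_cell) if total >= min_counts]

kept = filter_low_count_cells([1200, 80, 950, 40])
```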

First Run Notes (Important)

  • On the first execution of a method, Nextflow will automatically:

    • Pull the corresponding container image
    • Cache the image and model weights locally
  • This initial run may take longer

  • No additional setup is needed once caching is complete


Embedding Inference (Zero-shot)

Embedding inference can be performed with a single command.

We provide a small demo dataset:

data/demo/colon_1000.h5ad

Example Command

nextflow embed_by_scfm.nf \
  --method scgpt \
  --data data/demo/colon_1000.h5ad

Required arguments:

  • --method: scFM method name (e.g. scgpt)
  • --data: input dataset in .h5ad format

Output

Results are written to:

results/embedding/<method_name>/
  • Embeddings are stored as .h5ad files
  • The embedding matrix can be accessed via:
adata = sc.read_h5ad("results/embedding/scgpt/colon_1000.h5ad")
embeddings = adata.X
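Once loaded, the embedding matrix can be used like any dense cell-by-dimension matrix. A common downstream step is comparing cells by cosine similarity; the sketch below shows the computation in plain Python (in practice you would vectorize this with NumPy or scikit-learn).

```python
import math

def cosine_similarity(u, v):
    # Cosine similarity between two embedding vectors: the dot product
    # normalized by both vector norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```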

Few-shot Learning

Few-shot learning and label inference can be performed with a single command.

We provide a small demo dataset consisting of a support set and a query set:

data/demo/liver_1shot_support.h5ad
data/demo/liver_1shot_query.h5ad

Step 1: Fit Prototypes (Support Set)

To fit class prototypes, set the mode to fit and provide the support dataset.

nextflow fewshot_by_scfm.nf \
  --method scgpt \
  --mode fit \
  --support data/demo/liver_1shot_support.h5ad

This will generate a prototype file (.npz) saved to:

results/fewshot/fitted_prototypes/<method_name>/

The generated .npz file contains the fitted class prototypes derived from the support set.
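Conceptually, a class prototype is the mean embedding of the support cells sharing a label (the standard prototypical-network rule). The sketch below shows the idea in plain Python; it is not the pipeline's implementation, which operates on model embeddings and stores the result in .npz format.

```python
def fit_prototypes(embeddings, labels):
    # Average the embedding vectors of each class to obtain its prototype.
    sums, counts = {}, {}
    for vec, lab in zip(embeddings, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + x for s, x in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

prototypes = fit_prototypes([[0.0, 0.0], [2.0, 2.0], [4.0, 0.0]],
                            ["B cell", "B cell", "T cell"])
```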

Step 2: Infer Labels (Query Set)

To infer labels for a query dataset using the fitted prototypes, set the mode to infer and provide:

  • the query dataset
  • the path to the fitted prototype file
nextflow fewshot_by_scfm.nf \
  --method scgpt \
  --mode infer \
  --query data/demo/liver_1shot_query.h5ad \
  --fitted results/fewshot/fitted_prototypes/scgpt/liver_1shot_support.npz

Inference results are written to:

results/fewshot/inference/<method_name>/
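Inference then assigns each query cell the label of its nearest prototype. The distance metric used internally may differ per method; the sketch below assumes Euclidean distance purely for illustration.

```python
import math

def predict_label(query_vec, prototypes):
    # Return the label of the prototype closest to the query embedding
    # (Euclidean distance; an assumption made for illustration).
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return min(prototypes, key=lambda lab: dist(query_vec, prototypes[lab]))

prototypes = {"B cell": [1.0, 1.0], "T cell": [4.0, 0.0]}
```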

One-step Fit + Inference

You can also provide both the support and query datasets in a single command, which will automatically perform prototype fitting followed by inference:

nextflow fewshot_by_scfm.nf \
  --method scgpt \
  --support data/demo/liver_1shot_support.h5ad \
  --query data/demo/liver_1shot_query.h5ad

Notes

  • Few-shot learning is designed for datasets with discrete label types
  • The support dataset must contain ground-truth labels
  • The query dataset does not require labels and will be annotated during inference
  • By default, labels are read from adata.obs['cell_type']; this can be overridden using the --label_key option

Fine-tuning

Fine-tuning and label prediction can also be performed with a single command.

In the example below, we reuse colon_1000.h5ad as the training dataset. It contains cell-type labels in adata.obs['cell_type']. We also provide colon_50.h5ad as a small test dataset.

data/demo/colon_1000.h5ad
data/demo/colon_50.h5ad

Step 1: Fine-tune Model

To fine-tune a model, set the mode to fit and provide the training dataset.

nextflow finetune_by_scfm.nf \
  --method scgpt \
  --mode fit \
  --train data/demo/colon_1000.h5ad

The fine-tuned model weights are exposed via a symlink and saved by default to:

results/finetune/finetuned_models/<method_name>/<train_data_id>/

You can then use this fine-tuned model for label prediction.

Step 2: Predict labels

To predict labels, set the mode to pred and provide:

  • the directory containing the fine-tuned weights
  • the test dataset
nextflow finetune_by_scfm.nf \
  --method scgpt \
  --mode pred \
  --fitted results/finetune/finetuned_models/scGPT/colon_1000 \
  --test data/demo/colon_50.h5ad 

Prediction results are written to:

results/finetune/prediction/<method_name>/

One-step Fine-tune + Predict

You can also provide both the training and test datasets in a single command, which will automatically perform fine-tuning followed by prediction:

nextflow finetune_by_scfm.nf \
  --method scgpt \
  --train data/demo/colon_1000.h5ad \
  --test data/demo/colon_50.h5ad

Notes

  1. Some methods support only zero-shot embeddings. For these methods, we attach a task-agnostic post-hoc classifier, and the fine-tuning process actually optimizes this appended model.
     • If the fine-tuned weight directory contains only a posthoc_classifier/ folder, --fitted should point to: results/finetune/finetuned_models/<method_name>/<train_data_id>/posthoc_classifier/
     • CELLama is a special case: it supports fine-tuning the backbone model but does not support native prediction/training in our pipeline, so we fine-tune both the backbone and the post-hoc classifier. In this case, still pass the same --fitted directory as in the scGPT example above, even though it may also contain a posthoc_classifier/ folder.

  2. By default, labels are read from adata.obs["cell_type"]. Any discrete label field can be used in this workflow. To specify the label column, use --finetune_label_key.

  3. You can adjust the number of fine-tuning epochs and the training batch size depending on your GPU resources using: --finetune_epoch and --finetune_batch_size. We provide method-specific default finetune_epoch values (based on the original authors' fine-tuning recipes), so we generally do not recommend changing them unless you have a clear purpose.

Supported Methods & Environments

| Method              | Container                    | Model Version | Notes                                              |
|---------------------|------------------------------|---------------|----------------------------------------------------|
| Cell2Sentence (C2S) | housy17/c2s:latest           | v1.2.0        | New method (zero/few-shot done; fine-tune pending) |
| CELLama             | housy17/cellama:latest       | v0.1.0        |                                                    |
| CellFM              | housy17/cellfm:latest        | 5054a2a       |                                                    |
| CellPLM             | housy17/cellplm:latest       | v0.1.0        |                                                    |
| Geneformer          | housy17/geneformer:latest    | v0.1.0        |                                                    |
| GenePT              | housy17/genept:latest        | 3602699       |                                                    |
| LangCell            | housy17/langcell:latest      | 69e41ef       |                                                    |
| scBERT              | housy17/scbert:latest        | v1.0.0        |                                                    |
| scCello             | housy17/sccello:latest       | 767585b       |                                                    |
| scFoundation        | housy17/scfoundation:latest  | 397631c       |                                                    |
| scGPT               | housy17/scgpt:latest         | v0.2.4        |                                                    |
| SCimilarity         | housy17/scsimilarity:latest  | v0.4.1        |                                                    |
| scPRINT             | housy17/scprint:latest       | v2.3.5        |                                                    |
| UCE                 | housy17/uce:latest           | 8227a65       |                                                    |

📌 This table will be expanded as more models and configurations are added.


Tutorials & Documentation

A detailed tutorial covering:

  • Advanced parameters
  • Batch size and resource control
  • Few-shot workflows
  • Fine-tuning workflows
  • Benchmark evaluation

👉 Tutorial link: (coming soon)


Citation

If this framework or any of the tools provided here are useful for your research, please cite our work; it helps us a lot.

Siyu Hou, Penghui Yang, Wenjing Ma, Jade Xiaoqing Wang and Xiang Zhou (2026). A unified framework enables accessible deployment and comprehensive benchmarking of single-cell foundation models.

@article{hou2026unified,
  title = {A unified framework enables accessible deployment and comprehensive benchmarking of single-cell foundation models},
  author = {Hou, Siyu and Yang, Penghui and Ma, Wenjing and Wang, Jade Xiaoqing and Zhou, Xiang},
  year = {2026},
  publisher = {Cold Spring Harbor Laboratory},
  journal = {bioRxiv}
}
