scFocus is a reinforcement-learning-based method for analyzing lineage branching in low-dimensional single-cell embeddings. It combines branch probabilities with unsupervised structure in the latent space to help characterize continuous cell-state transitions and related visualization workflows.
- Python >= 3.9
- Required packages:
scanpy>=1.10.4,torch>=1.13.1,joblib>=1.2.0,tqdm>=4.64.1,streamlit>=1.24.0
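To see which of these packages are already present before installing, a small standard-library check along the following lines can help. This is only a sketch: `installed_versions` is a hypothetical helper, not part of scFocus, and it reports the installed version without comparing it against the minimum.

```python
from importlib import metadata

# Minimum versions copied from the requirements list above.
REQUIRED = {
    "scanpy": "1.10.4",
    "torch": "1.13.1",
    "joblib": "1.2.0",
    "tqdm": "4.64.1",
    "streamlit": "1.24.0",
}


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version string, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


# Print one line per requirement; the comparison itself is left to the reader.
for name, version in installed_versions().items():
    status = version if version is not None else "not installed"
    print(f"{name:<10} required >= {REQUIRED[name]:<8} found: {status}")
```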
Install from PyPI:
pip install scfocus
Or install from source:
git clone https://github.com/PeterPonyu/scfocus.git
cd scfocus
pip install -e .
Basic example to get started with scFocus:
import scanpy as sc
import scfocus
# Load your single-cell data
adata = sc.read_h5ad('your_data.h5ad')
# Preprocess: normalize, log-transform, and compute PCA
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata, n_top_genes=2000)
# Copy the subset so later steps work on real data rather than a view
adata = adata[:, adata.var.highly_variable].copy()
sc.pp.pca(adata)
# Compute UMAP embedding
sc.pp.neighbors(adata, n_neighbors=15)
sc.tl.umap(adata)
# Run scFocus analysis
embedding = adata.obsm['X_umap']
focus = scfocus.focus(embedding, n=6, pct_samples=0.01)
focus.meta_focusing(n=3)
focus.merge_fp2()
# Add focus probabilities to your AnnData object
adata.obsm['focus_probs'] = focus.mfp[0]
for i in range(focus.mfp[0].shape[1]):
adata.obs[f'Fate_{i}'] = focus.mfp[0][:, i]
# Visualize results
sc.pl.umap(adata, color=[f'Fate_{i}' for i in range(focus.mfp[0].shape[1])])
The focus class accepts the following key parameters:
- f (array-like): Latent space of the original data (e.g., UMAP or t-SNE coordinates)
- n (int, default=8): Number of parallel agents/branches to identify
- pct_samples (float, default=0.125): Fraction of samples used in each training step (0.125 means 12.5%)
- max_steps (int, default=5): Maximum steps per training episode
- num_episodes (int, default=1000): Number of training episodes
- hidden_dim (int, default=128): Hidden layer dimension for neural networks
- res (float, default=0.05): Resolution for merging similar focus patterns
For a complete list of parameters and their descriptions, see the API documentation.
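As an illustration of how these parameters might be collected before constructing the object, the sketch below merges user overrides into the defaults listed above. `focus_kwargs` is a hypothetical helper, and both the range checks and the idea that every listed parameter is accepted as a keyword argument are assumptions; the API documentation is the authority on the actual signature.

```python
# Defaults copied from the parameter list above.
FOCUS_DEFAULTS = {
    "n": 8,
    "pct_samples": 0.125,
    "max_steps": 5,
    "num_episodes": 1000,
    "hidden_dim": 128,
    "res": 0.05,
}


def focus_kwargs(**overrides):
    """Hypothetical helper: merge overrides into the documented defaults."""
    unknown = set(overrides) - set(FOCUS_DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
    params = {**FOCUS_DEFAULTS, **overrides}
    # Assumed sanity checks; scFocus itself may accept a wider range.
    if not 0 < params["pct_samples"] <= 1:
        raise ValueError("pct_samples should be a fraction in (0, 1]")
    for name in ("n", "max_steps", "num_episodes", "hidden_dim"):
        if not isinstance(params[name], int) or params[name] < 1:
            raise ValueError(f"{name} should be a positive integer")
    return params


# Same settings as the basic example, with the remaining values left at default.
params = focus_kwargs(n=6, pct_samples=0.01)

# Assumed usage, mirroring the basic example:
# focus = scfocus.focus(embedding, **params)
```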
Tutorials and API documentation are available at https://scfocus.readthedocs.io/en/latest/, including:
- Notebooks for different datasets
- Step-by-step tutorials
- API reference
scFocus provides an interactive web interface for data analysis.
Access the hosted version at scfocus.streamlit.app.
Launch the local web interface:
scfocus ui
- Upload Data: Support for .h5ad files or 10x Genomics format (matrix.mtx, features.tsv, barcodes.tsv)
- Configure Parameters:
  - Number of highly variable genes (200-5000, default: 2000)
  - Number of neighbors for UMAP (2-50, default: 15)
  - Minimum distance for UMAP (0.0-2.0, default: 0.5)
  - Number of branches (2-10, default: 6)
- Process: Click "Process" to run the analysis pipeline
- Visualize: View UMAP plots colored by cell fate probabilities
- Download: Export processed data as an .h5ad file
Example datasets are available in the data/ folder of the repository.
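Before uploading, it can be useful to confirm that a dataset matches one of the two supported layouts. The sketch below does this with the standard library only. `detect_input_format` is a hypothetical helper rather than part of scFocus, and it checks only for the three uncompressed 10x file names mentioned above.

```python
from pathlib import Path

# File names of the 10x Genomics layout, as listed in the upload step.
TENX_FILES = ("matrix.mtx", "features.tsv", "barcodes.tsv")


def detect_input_format(path):
    """Hypothetical helper: return 'h5ad' or '10x', or raise ValueError."""
    p = Path(path)
    if p.is_file() and p.suffix == ".h5ad":
        return "h5ad"
    if p.is_dir():
        missing = [name for name in TENX_FILES if not (p / name).exists()]
        if not missing:
            return "10x"
        raise ValueError(f"10x folder is missing: {', '.join(missing)}")
    raise ValueError(f"Unsupported input: {p}")
```

In a script, the result could then select between scanpy's `sc.read_h5ad` and `sc.read_10x_mtx` loaders.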
# Launch web interface
scfocus ui
# Additional CLI commands may be added in future releases
The typical scFocus workflow consists of:
- Preprocessing: Normalize and log-transform the data, select highly variable genes
- Dimensionality Reduction: Compute PCA and UMAP/t-SNE embeddings
- scFocus Analysis: Apply reinforcement learning to identify lineage branches
- Merge Patterns: Consolidate similar focus patterns
- Visualization: Display cell fate probabilities and branch assignments
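The last step mentions branch assignments alongside fate probabilities. One simple way to derive a discrete assignment, shown here purely as an assumption and not necessarily how scFocus does it, is to give each cell the fate with the highest probability. The sketch uses plain Python lists and made-up numbers so that it runs without any dependencies.

```python
def assign_branches(probs):
    """Return, for each cell (row), the index of its highest-probability fate."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Toy probabilities for three cells and three fates (illustrative values only).
toy_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]

labels = assign_branches(toy_probs)
print(labels)  # [0, 2, 1]
```

If `focus.mfp[0]` is a NumPy array, as its use in the basic example suggests, `focus.mfp[0].argmax(axis=1)` would give the same kind of labels directly.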
Issue: ModuleNotFoundError: No module named 'torch'
- Solution: Install PyTorch:
pip install "torch>=1.13.1"
Issue: CUDA out of memory error
- Solution: The algorithm automatically uses CPU if GPU is unavailable. For large datasets, consider reducing n (number of agents) or pct_samples.
Issue: Streamlit command not found
- Solution: Install streamlit:
pip install "streamlit>=1.24.0"
Issue: Analysis is slow
- Solution:
  - Reduce num_episodes (default 1000) for faster results
  - Decrease n (number of agents) to reduce computational load
  - Use GPU if available
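To find out whether a GPU is visible to PyTorch at all, a quick check such as the following can be run before starting a long analysis. `pick_device` is a hypothetical helper written for this README. It only reports what PyTorch sees and does not change how scFocus selects its device.

```python
def pick_device():
    """Return 'cuda' if PyTorch is installed and sees a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to query; report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"PyTorch would run on: {pick_device()}")
```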
For additional help:
- Check the documentation
- Open an issue on GitHub
- Review example notebooks in the documentation
# Clone the repository
git clone https://github.com/PeterPonyu/scfocus.git
cd scfocus
# Install in development mode
pip install -e .
# Install additional development dependencies
pip install -r requirements.txt
The documentation is built using Sphinx:
# Install Sphinx and dependencies
pip install sphinx sphinx-rtd-theme
# Build the documentation
cd source
make html
The documentation will be built in build/html/.
Chen, C., Fu, Z., Yang, J., Chen, H., Huang, J., Qin, S., Wang, C., & Hu, X. (2025). scFocus: Detecting Branching Probabilities in Single-cell Data with SAC. Computational and Structural Biotechnology Journal. https://doi.org/10.1016/j.csbj.2025.04.036
