- Ability to reorganize datasets
- Allow doing something like `reorganize(ligerObj, variable = "somethingNotDataset")`, resulting in a new liger object with a different ligerDataset grouping.
- Ability to do downstream analysis on H5 data
- Pseudo-bulk should be easy because we are just aggregating cells.
- Wilcoxon might be a bit harder because ranks are calculated per gene, but the H5 sparse data is stored column-major. Might need to find a fast on-disk transposition method.
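
As a rough illustration of why pseudo-bulk aggregation suits on-disk data, the sketch below sums counts per group while visiting a gene-by-cell matrix a few columns at a time, so only one chunk needs to be held at once. It uses base R only. The toy matrix, the `group` labels, and `chunkSize` are made up for illustration, and plain in-memory subsetting stands in for an actual H5 chunk read. This is not rliger's implementation.

```r
# Toy gene-by-cell count matrix (3 genes x 6 cells), filled column by column.
# It stands in for raw counts that would really live in an H5 file.
counts <- matrix(c(1, 0, 2,
                   0, 3, 1,
                   4, 0, 0,
                   1, 1, 1,
                   0, 2, 5,
                   2, 0, 1),
                 nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6)))

# Hypothetical grouping of cells (e.g. cluster or sample labels)
group <- factor(c("A", "A", "B", "B", "A", "B"))

# Number of cells processed per step (assumed value, for illustration only)
chunkSize <- 2

# Accumulator: one pseudo-bulk column per group
pseudoBulk <- matrix(0, nrow = nrow(counts), ncol = nlevels(group),
                     dimnames = list(rownames(counts), levels(group)))

for (start in seq(1, ncol(counts), by = chunkSize)) {
  idx <- start:min(start + chunkSize - 1, ncol(counts))
  # In a real on-disk setting, this would be a read of selected columns
  chunk <- counts[, idx, drop = FALSE]
  # rowsum() sums rows by group, so transpose to treat cells as rows
  partial <- t(rowsum(t(chunk), as.character(group[idx])))
  cols <- colnames(partial)
  pseudoBulk[, cols] <- pseudoBulk[, cols, drop = FALSE] + partial
}

print(pseudoBulk)
```

Because each chunk is a set of whole columns (cells), this access pattern matches column-major storage. A per-gene rank needs every cell's value for one gene at once, which is why the Wilcoxon case is harder.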
- Improved scalability of downstream analysis and visualization
- Reduced the need for pre-calculated normalized data when performing the Wilcoxon test and producing feature expression plots
- Normalized data will only be calculated on the fly from raw data and the pre-stored size factor (`obj$nUMI`).
- Added `theme_axis_shortArrow()` for tidy dimensional reduction plot axis theme.
- Migrating to patchwork for multi-plot layout. Mainly for the ease of alignment, legend collection, and subplot extraction.
- Added naive GSEA analysis on factor gene loading (W) to test if any known gene sets (e.g. cell cycle) are enriched in any factor. Implemented in `factorGSEA()`.
- Added dense data loading support for H5AD files
- Optimized obs metadata parsing for H5AD files
- Fixed ggplot2 color picking when coloring by logical value
- Fixed H5AD file layer detecting bug
- Fixed some other minor bugs
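
The on-the-fly normalization noted in this group of changes can be pictured with a small base R sketch: instead of storing a full normalized matrix, one gene's values are derived from raw counts and a per-cell size factor only when needed. The sketch assumes normalization is a plain division by each cell's total count. `normalizeGene()` is a hypothetical helper written for this example, not an rliger function, and the local `nUMI` vector stands in for the stored `obj$nUMI`.

```r
# Toy raw counts: 3 genes x 4 cells, filled column by column
raw <- matrix(c(2, 0, 2,
                1, 1, 0,
                0, 5, 5,
                3, 0, 1),
              nrow = 3,
              dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))

# Per-cell total counts, computed once and kept alongside the raw data
# (stands in for the pre-stored size factor)
nUMI <- colSums(raw)

# Hypothetical helper: normalize a single gene on demand.
# Assumes normalized value = raw count / total count of that cell.
normalizeGene <- function(raw, nUMI, gene) {
  raw[gene, ] / nUMI
}

# Only the requested gene is ever normalized
expr <- normalizeGene(raw, nUMI, "gene2")
print(expr)
```

The trade-off is a small amount of repeated arithmetic per request in exchange for not keeping a second matrix the same size as the raw data.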
- Implemented highly efficient on-disk iNMF that scales to a million cells using slightly more time than in-memory version, requiring only laptop-level memory.
- Added 10X H5 data and H5AD loading function that loads the data into a regular dgCMatrix in memory or a DelayedArray representation backed on disk; the latter is used for the on-disk iNMF implementation.
- Added `selectBatchHVG()` which implements another HVG selection strategy, credit to SCIB
- Adding `suggestK()` back with new methodology
- Clarified optimal `runGOEnrich()` workflow and added fold enrichment metric in the returned result
- Fixed important bug in online iNMF scenario 2
- Fixed multiple problems related to ATAC analysis
- Fixed Wilcoxon rank-sum test bug when using ATAC peak counts
- Fixed gene coordinate parsing bug from BED file
- Optimized peak parsing speed
- Added `centroidAlign()` for new cell factor loading alignment method
- Added `plotProportionBox()` for visualizing compositional analysis
- Added `plotClusterGeneViolin()` for visualizing gene expression in clusters
- Added `plotBarcodeRank()` for basic QC visualization
- Added `plotPairwiseDEGHeatmap()` for visualizing pairwise DEG results
- Added `plotGODot()` for visualizing GO enrichment results
- Added `calcNMI()` for evaluating clustering results against ground truth
- Added `ligerToH5AD()` allowing reticulate/Python-free export of liger object to H5AD format. This is presented in extension source code (i.e. not loaded with `library(rliger)`).
- Added organism support in `runGeneralQC()` and refined hemoglobin gene matching regex pattern.
- Optimized DE test memory usage scalability for both pseudo-bulk method and Wilcoxon test
- Optimized `plotProportionPie()` by adding argument `circleColors`
- Optimized `plotVolcano()` text annotation positioning and gene highlighting logic.
- Optimized visualization function additional argument documentation
- Changed `runMarkerDEG()` and `runPairwiseDEG()` default method from `"wilcoxon"` to `"pseudoBulk"`
- Fixed `runMarkerDEG(method = "pseudobulk")` bug in assigning pseudo-replicates, and optimized error/warning signaling.
- Fixed bug in `calcAlignment()`, `subsetMemLigerDataset()`, `cellMeta()`
- Fixed bug in old version updating functions
- Fixed wrong UINMF aborting criteria
- Fixed example/test skipping criteria for non-existing dependencies
- Fixed file access issue when checking on CRAN
- Updated installed data file `system.file("extdata/ctrl.h5", "extdata/stim.h5")` to be of standard 10X H5 format
- Updated `quantileNorm()` automatic reference selection according to #297
- Other minor fixes (including #308)
- Added `ligerDataset` class for per-dataset information storage, with inheritance for specific modalities
- Added a number of plotting functions with clear function names and useful functionality
- Added Leiden clustering method, now as default rather than Louvain
- Added pseudo-bulk DEG method
- Added DEG analysis with one-vs-rest marker detection in `runMarkerDEG()` and pairwise comparison in `runPairwiseDEG()`
- Added gene name pattern for expression percentage QC metric
- Added native Seurat object support for the core integration workflow
- Added a documentation website built with pkgdown
- Added new iNMF variant method, consensus iNMF (c-iNMF), in `runCINMF()`. Not stable.
- Added GO enrichment downstream analysis in `runGOEnrich()`
- Changed `liger` object class structure
- Moved iNMF (previously `optimizeALS()`), UINMF (previously `optimizeALS(unshared = TRUE)`) and online iNMF (previously `online_iNMF()`) implementation to new package RcppPlanc with vastly improved performance. Now wrapped in `runINMF()`, `runUINMF()` and `runOnlineINMF()` respectively, and all can be invoked with `runIntegration()`.
- Updated H5AD support to match up with Python anndata package 0.8.0 specs
- Renamed many function/argument names to follow camelCase style; original names are still available but issue deprecation warnings
- Allow setting mito pattern in `getMitoProportion()` #271
- Fix efficiency issue when taking the log of norm.data (e.g. `runWilcoxon`)
- Add runnable examples to all exported functions when possible
- Fix typo in online_iNMF matrix initialization
- Adapt to Seurat5
- Other minor fixes