vignettes/articles/LIGER_Functions.Rmd
THIS VIGNETTE IS UNDER ACTIVE REVISION AND IS BEING UPDATED WITH scCustomize V3.0.0 FUNCTIONS, BUT IS NOT YET COMPLETE.
While the majority of scCustomize’s current functionality is for Seurat objects, there are a number of functions for working with LIGER objects too.
For this tutorial, I will be using microglia data from Marsh et al., 2022 (Nature Neuroscience), which I have converted to a LIGER object.
# Load Packages
library(ggplot2)
library(dplyr)
library(magrittr)
library(patchwork)
library(Seurat)
library(rliger)
library(scCustomize)
library(qs)
# read object
marsh_micro <- qread("assets/marsh_2020_micro.qs")
# Convert to liger
marsh_liger <- as.liger(object = marsh_micro, datasetVar = "orig.ident")
The Welch lab has recently released a major update to the rliger package (v2.0.0+), which substantially changes the object structure and brings major performance improvements over prior versions. The majority of liger functions in scCustomize v2.1.0 have been updated to work with both old and new liger objects. However, a number of newer functions will be exclusive to rliger 2.0.0+.
This vignette will be using rliger 2.0.0.
scCustomize contains a new LIGER plotting function, DimPlot_LIGER(), that both adds functionality not present in the LIGER package and provides a plotting function that is easier to use for those more familiar with Seurat’s DimPlot().
DimPlot_LIGER(liger_object = marsh_liger)
As you can see, we get a plot that looks very similar to those generated by DimPlot_scCustom() or Seurat::DimPlot().
DimPlot_LIGER() adds a number of options not present in liger plotting and modifies the naming/functionality of others to be more in line with Seurat for ease of use.
DimPlot_LIGER(liger_object = marsh_liger, combination = TRUE)
You can specify a different color palette for each plot.
clu_color <- ColorBlind_Pal()
sample_color <- DiscretePalette_scCustomize(num_colors = 12, palette = "alphabet")
DimPlot_LIGER(liger_object = marsh_liger, combination = TRUE, colors_use_cluster = clu_color, colors_use_meta = sample_color)
DimPlot_LIGER() supports grouping and splitting plots by metadata variables or cluster and will return a patchwork layout.
DimPlot_LIGER(liger_object = marsh_liger, group_by = "Transcription_Method", colors_use_meta = JCO_Four())
DimPlot_LIGER(liger_object = marsh_liger, group_by = "Cluster", split_by = "Transcription_Method",
label = FALSE)
DimPlot_LIGER() supports many different modifications of cluster labels. NOTE: labeling is only supported when plotting by cluster.
- label: logical. Whether to label clusters on the plot.
- label_size: changes label text size.
- label_repel: logical. Whether to repel labels.
- label_box: logical. Whether to place a box around each label.
- label_color: changes label text color.
DimPlot_LIGER(liger_object = marsh_liger)
DimPlot_LIGER(liger_object = marsh_liger, label = FALSE)
DimPlot_LIGER(liger_object = marsh_liger, label_repel = TRUE)
DimPlot_LIGER(liger_object = marsh_liger, label_repel = TRUE, label_box = TRUE, label_size = 3)
DimPlot_LIGER() also supports point rasterization using the scattermore package. Axes, labels, and legends will remain in vector form.
DimPlot_LIGER(liger_object = marsh_liger, raster = FALSE, label = FALSE)
DimPlot_LIGER(liger_object = marsh_liger, raster = TRUE, label = FALSE)
If you would like to restore the ggplot2 default color scheme but maintain the rest of the aesthetics of DimPlot_LIGER(), simply set the parameter ggplot_default_colors = TRUE.
DimPlot_LIGER(liger_object = marsh_liger, ggplot_default_colors = TRUE)
NOTE: For this example, as “marsh_liger” was not originally a LIGER-analyzed object, it has no factors. So I ran a basic LIGER pipeline (with no specific parameters) to demonstrate the functionality of Top_Genes_Factor.
scCustomize contains the function plotFactors_scCustom(), which expands the functionality and customizes the visualizations of LIGER’s plotFactors() function.
plotFactors_scCustom() returns two different plots for each iNMF factor. The first is a plot of the factor loadings for each cell, grouped by dataset. The second is a visualization of the factor loadings on the dimensionality reduction coordinates.
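# Assumes returned_plots holds the plot lists returned by plotFactors_scCustom() with return_plots = TRUE (that call is not shown here)
# Pull the two plots for factor 6 and display them side by side
p1 <- returned_plots[[1]][[6]]
p2 <- returned_plots[[2]][[6]]
wrap_plots(p1, p2, ncol = 2)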
plotFactors_scCustom() has a number of additional parameters that expand the functionality of plotFactors().
- save_plots and/or return_plots: logical parameters to either return plots to an assigned variable in the environment or save the plots to PDF (or both).
- file_path and file_name: parameters that are required only when saving plots to PDF.
- raster: whether or not to rasterize the points in both plots. Significantly reduces PDF file size and plotting time.
plotFactors_scCustom() also has a number of additional parameters that significantly expand the visualization options of plotFactors().
- colors_use_factors: can be used to color the datasets in the factor loading plot. Must be a vector with length equal to or greater than the number of datasets in the object. By default uses the “varibow” palette from DiscretePalette_scCustomize instead of the default ggplot2 hue palette.
- ggplot_default_colors: can be specified if you would like to return this plot to using the ggplot2/hue palette.
- colors_use_dimreduc: can be used to define colors for the dimensionality reduction plot. Can be either a vector of two colors used to create a gradient or an already defined gradient.
- order: whether or not to reorder the points so that higher loading points are plotted on top of lower loading cells (avoids cell hiding). Default is FALSE.
Unfortunately, the liger/rliger package does not properly store the name of the dimensionality reduction technique used and labels all plots with tSNE1/tSNE2 or dr1/dr2. plotFactors_scCustom() solves this by providing an extra parameter, reduction_label, that can be used to set the labels. The first time you use plotFactors_scCustom() in an R session it will print a one-time message alerting you to the fact that this parameter exists and is set to “UMAP” by default (as it seems to be the most commonly used method currently).
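To make the idea behind the order parameter concrete, here is a minimal base R sketch. It is not the internal code of plotFactors_scCustom(); the data frame toy_cells and its values are made up for illustration. It only shows the general trick of sorting cells by loading so that the highest-loading cells come last and are therefore drawn on top by plotting layers that draw rows in order.

```r
# Toy data: hypothetical cell names and factor loadings (not taken from marsh_liger)
toy_cells <- data.frame(
  cell = c("cell_a", "cell_b", "cell_c", "cell_d"),
  loading = c(0.9, 0.0, 0.4, 0.1),
  stringsAsFactors = FALSE
)

# Sort rows by increasing loading so the highest-loading cells are last;
# layers that draw points in row order will then place them on top
toy_ordered <- toy_cells[order(toy_cells$loading), ]

toy_ordered$cell
```

With unsorted data, a high-loading cell can end up hidden underneath a cluster of low-loading cells that happen to be drawn later.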
After plotting with plotFactors_scCustom(), it can sometimes be helpful to pull a list of the top loading genes for a particular factor. Top_Genes_Factor() provides a quick way to extract the list of genes that load highly on a particular iNMF factor.
NOTE: This function simply extracts top loading genes in order from
given factor. It does not assess whether that number of genes is
significant, etc.
top20_factor6 <- Top_Genes_Factor(liger_object = marsh_liger, liger_factor = 6, num_genes = 20)
head(top20_factor6, 5)
## [1] "Junb" "Jun" "Klf2" "Klf6" "Btg2"
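As a rough illustration of what "top loading genes in order" means, the following base R sketch ranks genes by their loading on one factor. It is an assumption-laden toy, not the implementation of Top_Genes_Factor(): the matrix toy_loadings, its values, and the helper toy_top_genes() are all hypothetical.

```r
# Hypothetical loading matrix: rows are factors, columns are genes
toy_loadings <- matrix(
  c(0.1, 2.0, 0.5, 1.2,
    3.0, 0.2, 0.9, 0.0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Factor1", "Factor2"),
                  c("GeneA", "GeneB", "GeneC", "GeneD"))
)

# Hypothetical helper: names of the n genes with the largest loading on one factor
toy_top_genes <- function(loadings, factor_index, n) {
  ranked <- sort(loadings[factor_index, ], decreasing = TRUE)
  names(ranked)[seq_len(min(n, length(ranked)))]
}

toy_top_genes(toy_loadings, factor_index = 1, n = 2)
```

As with the real function, this ranking says nothing about whether the selected genes are statistically meaningful; it only sorts them.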
For more information on adding cell QC metrics to LIGER objects, see the Object QC Vignette.
By default, when selecting highly variable genes using rliger::selectGenes(), LIGER performs gene selection on each dataset in the object separately and then takes either the union or the intersection of those lists (default is union). However, sometimes for analysis you may want to select highly variable genes from all data combined.
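The difference between the union and intersection options can be sketched with plain character vectors. The gene lists below are invented for illustration, and this is only the set logic, not the gene selection code used by rliger.

```r
# Hypothetical per-dataset variable gene lists
toy_var_genes <- list(
  dataset1 = c("P2ry12", "Fcrls", "Junb"),
  dataset2 = c("Fcrls", "Junb", "Klf2")
)

# Union: genes selected in at least one dataset
toy_union <- Reduce(union, toy_var_genes)

# Intersection: genes selected in every dataset
toy_intersection <- Reduce(intersect, toy_var_genes)

toy_union
toy_intersection
```

Either way, each dataset is scored on its own first, which is different from scoring variability once on the combined data.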
scCustomize provides the function Variable_Features_ALL_LIGER(), which does just that and returns the variable gene list to the @var.genes slot of the original LIGER object.
NOTE: This function creates a temporary LIGER object by merging all
of the matrices present in original object and then running variable
gene selection. Therefore, while running the function it will use
significant memory resources (equivalent to size of original
object).
# Select highly variable genes based on threshold
marsh_liger <- Variable_Features_ALL_LIGER(liger_object = marsh_liger, var.thresh = 0.3)
# Select based on specific number of variable genes
marsh_liger <- Variable_Features_ALL_LIGER(liger_object = marsh_liger, num_genes = 2000)
As discussed in the helpers vignette, the Feature_Present() function can be used to check whether a list of genes is present in a LIGER object.
By default, this will query all datasets in the object to create a unified gene list and check the input gene list against that list.
gene_input_list <- c("P2ry12", "Fcrls", "Aif1")
genes_present <- Feature_Present(data = marsh_liger, features = gene_input_list)
However, you can check individual datasets within the object by supplying those matrices to the data parameter.
gene_input_list <- c("P2ry12", "Fcrls", "Aif1")
genes_present <- Feature_Present(data = marsh_liger@datasets[[1]], features = gene_input_list)
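The two modes above (checking against a unified gene list versus a single dataset) can be sketched in base R. This is a simplified illustration with made-up gene names per dataset; it does not reproduce the return structure or any extra checks of Feature_Present().

```r
# Hypothetical gene names for two datasets in one object
toy_dataset_genes <- list(
  dataset1 = c("P2ry12", "Fcrls", "Tmem119"),
  dataset2 = c("Fcrls", "Aif1", "Cx3cr1")
)

toy_query <- c("P2ry12", "Aif1", "Gfap")

# Unified gene list across all datasets
toy_all_genes <- unique(unlist(toy_dataset_genes, use.names = FALSE))

# Genes found in, and missing from, the unified list
toy_found <- intersect(toy_query, toy_all_genes)
toy_missing <- setdiff(toy_query, toy_all_genes)

# Same check restricted to a single dataset
toy_found_ds1 <- intersect(toy_query, toy_dataset_genes$dataset1)
```

A gene can be present in the unified list yet absent from an individual dataset, which is why the per-dataset check is sometimes worth running.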