Get a randomly downsampled set of cell barcodes with an equal number of cells for each identity class. The result can be returned either as a list (one entry per identity class) or as a single vector of barcodes.
Random_Cells_Downsample(
seurat_object,
num_cells,
group.by = NULL,
return_list = FALSE,
allow_lower = FALSE,
seed = 123
)
Seurat object
number of cells per identity class to use in downsampling. This value must be less than or equal to the size of the identity class with the fewest cells, unless allow_lower = TRUE. Alternatively, set to "min" to use the largest number of barcodes that every class can supply, i.e. the size of the smallest group.
The ident to use to group cells. Default is NULL, which uses the current active.ident.
logical, whether or not to return the results as list instead of vector, default is FALSE.
logical, whether to keep all cells of an identity class that contains fewer than num_cells cells, default is FALSE. If FALSE, an error is reported when num_cells is too high. If TRUE, identity classes with more than num_cells cells are downsampled to that value, and those with fewer than num_cells cells are not downsampled.
random seed to use for downsampling. Default is 123.
either a vector or list of cell barcodes
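The argument descriptions above can be illustrated with a small base-R sketch. This is not the package's implementation: `downsample_barcodes_sketch` and its `idents` input (a named factor whose names are barcodes) are hypothetical stand-ins for a Seurat object, used only to show how `num_cells`, `"min"`, `allow_lower`, `return_list`, and `seed` are described as interacting.

```r
# Hypothetical helper, NOT the package's code: a base-R sketch of even
# per-identity downsampling as described by the arguments above.
downsample_barcodes_sketch <- function(idents, num_cells, return_list = FALSE,
                                       allow_lower = FALSE, seed = 123) {
  # `idents` is assumed to be a named factor: names are barcodes,
  # values are identity classes (unused factor levels are dropped).
  barcodes_by_ident <- split(names(idents), idents, drop = TRUE)
  group_sizes <- lengths(barcodes_by_ident)

  # "min" means: use the size of the smallest identity class.
  if (identical(num_cells, "min")) {
    num_cells <- min(group_sizes)
  }

  # Without allow_lower, asking for more cells than the smallest class has
  # is treated as an error.
  if (!allow_lower && num_cells > min(group_sizes)) {
    stop("num_cells (", num_cells, ") exceeds the smallest identity class (",
         min(group_sizes), " cells).")
  }

  # Note: set.seed() changes the global RNG state.
  set.seed(seed)
  sampled <- lapply(barcodes_by_ident, function(barcodes) {
    if (length(barcodes) <= num_cells) {
      barcodes                          # small class: keep every cell
    } else {
      sample(barcodes, size = num_cells) # large class: sample without replacement
    }
  })

  if (return_list) sampled else unlist(sampled, use.names = FALSE)
}

# Toy input: three identity classes with 30, 12, and 5 cells.
toy_idents <- factor(rep(c("A", "B", "C"), times = c(30, 12, 5)))
names(toy_idents) <- paste0("cell_", seq_along(toy_idents))

# "min" gives 5 cells per class, 15 barcodes in total.
even_cells <- downsample_barcodes_sketch(toy_idents, num_cells = "min")
length(even_cells)

# allow_lower = TRUE caps A and B at 10 cells and keeps all 5 cells of C.
capped <- downsample_barcodes_sketch(toy_idents, num_cells = 10,
                                     return_list = TRUE, allow_lower = TRUE)
lengths(capped)
```

With `allow_lower = FALSE`, the same call with `num_cells = 10` stops with an error in this sketch, because class C has only 5 cells.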
library(Seurat)
# return vector of barcodes
random_cells <- Random_Cells_Downsample(seurat_object = pbmc_small, num_cells = 10)
head(random_cells)
#> [1] "GTCATACTTCGCCT" "GCGCATCTTGCTCC" "TACTCTGAATCGAC" "GAACCTGATGAACC"
#> [5] "AATGTTGACAGTCA" "GGCATATGCTTATC"
# return list
random_cells_list <- Random_Cells_Downsample(seurat_object = pbmc_small, return_list = TRUE,
num_cells = 10)
head(random_cells_list)
#> [[1]]
#> [1] "GTCATACTTCGCCT" "GCGCATCTTGCTCC" "TACTCTGAATCGAC" "GAACCTGATGAACC"
#> [5] "AATGTTGACAGTCA" "GGCATATGCTTATC" "CTTCATGACCGAAT" "CTAAACCTCTGACA"
#> [9] "AGTCAGACTGCACA" "TAGGGACTGAACTC"
#>
#> [[2]]
#> [1] "AAGCGACTTTGACG" "ATTGTAGATTCCCG" "ATTACCTGCCTTAT" "ATACCACTCTAAGC"
#> [5] "TCCACTCTGAGCTT" "GCGCACGACTTTAC" "CCATCCGATTCGCC" "AAATTCGAATCACG"
#> [9] "CATCAGGATGCACA" "ACGTGATGCCATGA"
#>
#> [[3]]
#> [1] "GCGTAAACACGGTT" "CATGAGACACGGGA" "TTGCATTGAGCTAC" "AGGTCATGAGTGTC"
#> [5] "GATAGAGAAGGGTG" "GTAAGCACTCATTC" "TACGCCACTCCGAA" "GCTCCATGAGAAGT"
#> [9] "TACATCACGCTAAC" "CATATAGACTAAGC"
#>
# return max total number of cells (setting `num_cells = "min"`)
random_cells_max <- Random_Cells_Downsample(seurat_object = pbmc_small, num_cells = "min")
#> The number of cells was set to "min", returning 19 cells per identity class
#> (equal to size of smallest identity class(es): "2").
#> ✔ Total of 57 cells across whole object.