Get a randomly downsampled set of cell barcodes with an equal number of cells for each identity class. Results can be returned either as a list (one entry per identity class) or as a single vector of barcodes.

Random_Cells_Downsample(
  seurat_object,
  num_cells,
  group.by = NULL,
  return_list = FALSE,
  allow_lower = FALSE,
  seed = 123
)

Arguments

seurat_object

Seurat object

num_cells

number of cells per identity class to use in downsampling. This value must be less than or equal to the size of the identity class with the fewest cells, unless allow_lower is TRUE. Alternatively, set to "min" to use the size of the smallest identity class, which is the largest value that still gives every class the same number of barcodes.

group.by

The ident to use to group cells. Default is NULL, which uses the current active.ident.

return_list

logical, whether to return the results as a list instead of a vector. Default is FALSE.

allow_lower

logical, whether to allow identity classes that contain fewer than num_cells cells. Default is FALSE. If FALSE, an error is reported when num_cells exceeds the size of the smallest identity class. If TRUE, identity classes with more than num_cells cells are downsampled to num_cells, and classes with fewer cells are kept in full.

seed

random seed to use for downsampling. Default is 123.
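The way the arguments above interact can be illustrated with a minimal base R sketch. This is not the package's implementation: `downsample_barcodes()` is a hypothetical helper, and the barcodes and group labels are made-up toy data. It only assumes the behavior described in the argument list (per-group sampling, the "min" shortcut, the allow_lower switch, and a fixed seed).

```r
# Hypothetical helper for illustration only; not part of the package.
downsample_barcodes <- function(barcodes, groups, num_cells,
                                return_list = FALSE, allow_lower = FALSE,
                                seed = 123) {
  # Split barcodes into one character vector per identity class
  by_group <- split(barcodes, groups)
  group_sizes <- lengths(by_group)

  # "min" means: use the size of the smallest group
  if (identical(num_cells, "min")) {
    num_cells <- min(group_sizes)
  }

  # Without allow_lower, every group must contain at least num_cells cells
  if (!allow_lower && num_cells > min(group_sizes)) {
    stop("num_cells (", num_cells, ") exceeds the smallest group size (",
         min(group_sizes), "); lower it or set allow_lower = TRUE.")
  }

  # Fix the RNG state so repeated calls return the same barcodes
  # (note: this sketch sets the global seed as a side effect)
  set.seed(seed)
  sampled <- lapply(by_group, function(cells) {
    # Groups at or below num_cells are kept whole; larger ones are sampled
    if (length(cells) <= num_cells) cells else sample(cells, size = num_cells)
  })

  if (return_list) sampled else unlist(sampled, use.names = FALSE)
}

# Toy data: three groups of 36, 25, and 19 made-up barcodes
barcodes <- paste0("cell_", seq_len(80))
groups <- rep(c("0", "1", "2"), times = c(36, 25, 19))

# "min" gives 19 barcodes per group, 57 in total
even <- downsample_barcodes(barcodes, groups, num_cells = "min")
length(even)

# allow_lower = TRUE caps the largest group at 30 and keeps the others whole
uneven <- downsample_barcodes(barcodes, groups, num_cells = 30,
                              allow_lower = TRUE, return_list = TRUE)
lengths(uneven)
```

With allow_lower = FALSE, the second call would stop with an error, because 30 is larger than the smallest group (19 cells).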

Value

either a vector or list of cell barcodes

Examples

library(Seurat)

# return vector of barcodes
random_cells <- Random_Cells_Downsample(seurat_object = pbmc_small, num_cells = 10)
head(random_cells)
#> [1] "GTCATACTTCGCCT" "GCGCATCTTGCTCC" "TACTCTGAATCGAC" "GAACCTGATGAACC"
#> [5] "AATGTTGACAGTCA" "GGCATATGCTTATC"

# return list
random_cells_list <- Random_Cells_Downsample(seurat_object = pbmc_small, return_list = TRUE,
num_cells = 10)
head(random_cells_list)
#> [[1]]
#>  [1] "GTCATACTTCGCCT" "GCGCATCTTGCTCC" "TACTCTGAATCGAC" "GAACCTGATGAACC"
#>  [5] "AATGTTGACAGTCA" "GGCATATGCTTATC" "CTTCATGACCGAAT" "CTAAACCTCTGACA"
#>  [9] "AGTCAGACTGCACA" "TAGGGACTGAACTC"
#> 
#> [[2]]
#>  [1] "AAGCGACTTTGACG" "ATTGTAGATTCCCG" "ATTACCTGCCTTAT" "ATACCACTCTAAGC"
#>  [5] "TCCACTCTGAGCTT" "GCGCACGACTTTAC" "CCATCCGATTCGCC" "AAATTCGAATCACG"
#>  [9] "CATCAGGATGCACA" "ACGTGATGCCATGA"
#> 
#> [[3]]
#>  [1] "GCGTAAACACGGTT" "CATGAGACACGGGA" "TTGCATTGAGCTAC" "AGGTCATGAGTGTC"
#>  [5] "GATAGAGAAGGGTG" "GTAAGCACTCATTC" "TACGCCACTCCGAA" "GCTCCATGAGAAGT"
#>  [9] "TACATCACGCTAAC" "CATATAGACTAAGC"
#> 

# return max total number of cells (setting `num_cells = "min"`)
random_cells_max <- Random_Cells_Downsample(seurat_object = pbmc_small, num_cells = "min")
#> The number of cells was set to "min", returning 19 cells per identity class
#> (equal to size of smallest identity class(es): "2").
#>  Total of 57 cells across whole object.