This function will compute a UMAP embedding and add it to an ArchRProject.

addUMAP(
  ArchRProj = NULL,
  reducedDims = "IterativeLSI",
  name = "UMAP",
  nNeighbors = 40,
  minDist = 0.4,
  metric = "cosine",
  dimsToUse = NULL,
  scaleDims = NULL,
  corCutOff = 0.75,
  sampleCells = NULL,
  outlierQuantile = 0.9,
  saveModel = TRUE,
  verbose = TRUE,
  seed = 1,
  force = FALSE,
  threads = 1,
  ...
)

Arguments

ArchRProj

An ArchRProject object.

reducedDims

The name of the reducedDims object (e.g. "IterativeLSI") to use from the designated ArchRProject.

name

The name for the UMAP embedding to store in the given ArchRProject object.

nNeighbors

An integer specifying the number of nearest neighbors to use when computing the UMAP. This argument is passed to n_neighbors in uwot::umap().

minDist

A number that determines how tightly the UMAP is allowed to pack points together. This argument is passed to min_dist in uwot::umap(). For more info on this see https://jlmelville.github.io/uwot/abparams.html.

metric

A string specifying the distance metric used in the reducedDims space when computing the UMAP (for example "cosine"). This argument is passed to metric in uwot::umap().

dimsToUse

A vector containing the dimensions from the reducedDims object to use in computing the embedding.

scaleDims

A boolean value that indicates whether to z-score the reduced dimensions for each cell. This is useful for minimizing the contribution of strong biases (dominating early PCs) and lowly abundant populations. However, this may lead to stronger sample-specific biases since it is over-weighting latent PCs. If set to NULL this will scale the dimensions based on the value of scaleDims when the reducedDims were originally created during dimensionality reduction. This idea was introduced by Timothy Stuart.
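The per-cell z-scoring described above can be illustrated with a small base R sketch. This is not ArchR's internal code; the matrix `lsi` and the helper `row_zscore()` are hypothetical names used only for illustration.

```r
# Toy reduced-dimension matrix: 5 cells (rows) x 4 dimensions (columns).
# `lsi` is simulated data, not output from ArchR.
set.seed(1)
lsi <- matrix(
  rnorm(20, mean = 3), nrow = 5,
  dimnames = list(paste0("cell", 1:5), paste0("LSI", 1:4))
)

# Hypothetical helper: z-score each cell (row) across its dimensions.
# The length-nrow vectors recycle down the columns, so every entry in
# row i is centred and scaled by row i's own mean and standard deviation.
row_zscore <- function(m) {
  (m - rowMeans(m)) / apply(m, 1, sd)
}

lsi_scaled <- row_zscore(lsi)
```

After scaling, every cell has mean 0 and standard deviation 1 across its dimensions, so no single dimension dominates purely because of its magnitude.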

corCutOff

A numeric cutoff for the correlation of each dimension to the sequencing depth. If the dimension has a correlation to sequencing depth that is greater than the corCutOff, it will be excluded from analysis.
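As a rough illustration of this filter, the base R sketch below drops dimensions that track sequencing depth too closely. It runs on simulated data and is not ArchR's implementation; using the absolute correlation and log10-transformed depth are assumptions made for the example, and all variable names are hypothetical.

```r
set.seed(1)
n_cells <- 200

# Simulated per-cell sequencing depth (assumption: log-normal).
depth <- rlnorm(n_cells, meanlog = 9)

# Toy reduced dimensions: LSI1 is built to follow depth, LSI2/LSI3 are noise.
lsi <- cbind(
  LSI1 = log10(depth) + rnorm(n_cells, sd = 0.1),
  LSI2 = rnorm(n_cells),
  LSI3 = rnorm(n_cells)
)

cor_cutoff <- 0.75

# Correlation of each dimension with log10 depth (absolute value: assumption).
cor_to_depth <- abs(cor(lsi, log10(depth)))[, 1]

# Keep only the dimensions whose correlation does not exceed the cutoff.
dims_to_use <- which(cor_to_depth <= cor_cutoff)
lsi_filtered <- lsi[, dims_to_use, drop = FALSE]
```

Here only LSI1 is removed, because it was constructed to be driven by depth.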

sampleCells

An integer specifying the number of cells to subsample before computing the UMAP embedding. The remaining cells that were not subsampled are then projected into that embedding using uwot::umap_transform(). This reduces run time and memory usage but can lower the overall quality of the UMAP embedding. Recommended only for extremely large numbers of cells.
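The subsample-then-project idea can be sketched directly with uwot. This is a simplified illustration that assumes the uwot package is installed; `umap_with_subsample()` is a hypothetical helper, not an ArchR function, and it omits the outlier filtering controlled by outlierQuantile.

```r
# Hypothetical helper: fit UMAP on a random subset of rows, then project
# the remaining rows into the fitted embedding.
# Assumes n_sample < nrow(mat) and n_neighbors < n_sample.
umap_with_subsample <- function(mat, n_sample, n_neighbors = 15,
                                min_dist = 0.4, metric = "cosine", seed = 1) {
  set.seed(seed)
  idx <- sample(nrow(mat), n_sample)

  # ret_model = TRUE keeps what umap_transform() needs for later projection.
  model <- uwot::umap(
    mat[idx, , drop = FALSE],
    n_neighbors = n_neighbors, min_dist = min_dist,
    metric = metric, ret_model = TRUE
  )

  # Project the held-out rows into the existing embedding.
  projected <- uwot::umap_transform(mat[-idx, , drop = FALSE], model)

  # Reassemble the coordinates in the original row order.
  emb <- matrix(
    NA_real_, nrow = nrow(mat), ncol = 2,
    dimnames = list(rownames(mat), c("UMAP1", "UMAP2"))
  )
  emb[idx, ] <- model$embedding
  emb[-idx, ] <- projected
  emb
}

# Usage on simulated data (300 "cells" x 10 dimensions):
# mat <- matrix(rnorm(3000), nrow = 300)
# emb <- umap_with_subsample(mat, n_sample = 200)
```

The projected rows never influence the layout, which is why heavy subsampling can place some of them poorly.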

outlierQuantile

A number between 0 and 1 specifying the distance quantile in the subsampled cells (see sampleCells) used to filter poor-quality re-projections. This is necessary because heavy subsampling produces many outliers.

saveModel

A boolean value indicating whether or not to save the UMAP model in an RDS file for downstream usage such as projection of data into the UMAP embedding.

verbose

A boolean value that indicates whether to print UMAP output.

seed

A number to be used as the seed for random number generation. It is recommended to keep track of the seed used so that you can reproduce results downstream.

force

A boolean value that indicates whether to overwrite the relevant data in the ArchRProject object if the embedding indicated by name already exists.

threads

The number of threads to be used for parallel computing. The default is 1 because setting this too high can cause C stack usage errors.

...

Additional parameters to pass to uwot::umap().
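A typical call might look like the sketch below. It assumes ArchR is installed and that `proj` is an existing ArchRProject that already contains an "IterativeLSI" reducedDims object; `recompute_umap()` is a hypothetical wrapper written for this example, not part of ArchR. The argument values simply repeat the defaults from the signature above.

```r
# Hypothetical wrapper (not part of ArchR): recompute the "UMAP" embedding
# on an existing ArchRProject with an explicit seed for reproducibility.
recompute_umap <- function(proj, seed = 1) {
  ArchR::addUMAP(
    ArchRProj = proj,
    reducedDims = "IterativeLSI",
    name = "UMAP",
    nNeighbors = 40,
    minDist = 0.4,
    metric = "cosine",
    seed = seed,
    force = TRUE  # overwrite an embedding that already uses this name
  )
}

# Usage, assuming `proj` already holds an "IterativeLSI" reducedDims object:
# proj <- recompute_umap(proj)
# head(ArchR::getEmbedding(proj, embedding = "UMAP", returnDF = TRUE))
```

Because addUMAP() returns the updated project, the result has to be assigned back to `proj` for the embedding to be kept.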