This function computes imputation weights that describe each cell as a linear combination of many cells, based on a MAGIC diffusion matrix.

addImputeWeights(
  ArchRProj = NULL,
  reducedDims = "IterativeLSI",
  dimsToUse = NULL,
  scaleDims = NULL,
  corCutOff = 0.75,
  td = 3,
  ka = 4,
  sampleCells = 5000,
  nRep = 2,
  k = 15,
  epsilon = 1,
  useHdf5 = TRUE,
  randomSuffix = FALSE,
  threads = getArchRThreads(),
  seed = 1,
  verbose = TRUE,
  logFile = createLogFile("addImputeWeights")
)

Arguments

ArchRProj

An ArchRProject object.

reducedDims

The name of the reducedDims object (e.g. "IterativeLSI") to retrieve from the designated ArchRProject.

dimsToUse

A vector containing the dimensions from the reducedDims object to use.

scaleDims

A boolean that indicates whether to z-score the reduced dimensions for each cell. This is useful for minimizing the contribution of strong biases (dominating early PCs) and lowly abundant populations. However, this may lead to stronger sample-specific biases since it is over-weighting latent PCs. If set to NULL, the dimensions are scaled according to the value of scaleDims that was used when the reducedDims were originally created during dimensionality reduction. This idea was introduced by Timothy Stuart.
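
As an illustration of what per-cell z-scoring means, the following base R sketch scales a toy cell x dimension matrix so that every cell (row) has mean 0 and standard deviation 1 across its dimensions. The matrix `reduced` and the helper `scale_per_cell` are hypothetical names made up for this example; this is not ArchR's internal code.

```r
# Toy reduced-dimension matrix: 4 cells (rows) x 5 dimensions (columns).
set.seed(1)
reduced <- matrix(
  rnorm(20, mean = 3), nrow = 4,
  dimnames = list(paste0("cell", 1:4), paste0("LSI", 1:5))
)

# z-score each cell (row) across its dimensions:
# subtract the row mean, then divide by the row standard deviation.
# (hypothetical helper, not an ArchR function)
scale_per_cell <- function(m) {
  centered <- m - rowMeans(m)
  centered / apply(m, 1, sd)
}

scaled <- scale_per_cell(reduced)

rowMeans(scaled)      # numerically zero for every cell
apply(scaled, 1, sd)  # one for every cell
```

After scaling, no single dimension can dominate a cell's profile simply because it has a larger numeric range, which is the effect described above.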

corCutOff

A numeric cutoff for the correlation of each dimension to the sequencing depth. If the dimension has a correlation to sequencing depth that is greater than the corCutOff, it will be excluded.
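
The filtering rule can be sketched in base R as follows. The simulated `depth` vector, the toy `reduced` matrix, the use of log10 depth, and the use of the absolute correlation are all assumptions made for this illustration; ArchR's own computation may differ in these details.

```r
set.seed(1)
n_cells <- 200

# Hypothetical per-cell sequencing depth.
depth <- rlnorm(n_cells, meanlog = 9)

# Toy reduced dimensions: LSI1 tracks depth closely, LSI2 and LSI3 do not.
reduced <- cbind(
  LSI1 = log10(depth) + rnorm(n_cells, sd = 0.05),
  LSI2 = rnorm(n_cells),
  LSI3 = rnorm(n_cells)
)

cor_cutoff <- 0.75  # same value as the corCutOff default

# Correlation of every dimension with depth (absolute value is an assumption).
cor_to_depth <- abs(cor(reduced, log10(depth)))[, 1]

# Keep only the dimensions that do not exceed the cutoff.
dims_kept <- names(cor_to_depth)[cor_to_depth <= cor_cutoff]
dims_kept
```

In this toy data only LSI1 is strongly tied to depth, so it is the dimension that gets dropped.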

td

The diffusion time parameter determines the number of smoothing iterations to be performed (see MAGIC from van Dijk et al Cell 2018).

ka

The k-nearest neighbors autotune parameter, used to equalize the effective number of neighbors for each cell, thereby diminishing the effect of differences in density (see MAGIC from van Dijk et al Cell 2018).

sampleCells

The number of cells to sub-sample to compute an imputation block. An imputation block is a cell x cell matrix that describes the linear combination used to impute numerical values within these cells. ArchR creates many blocks to keep this cell x cell matrix sparse and limit memory use.
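
To see why blocks save memory, the sketch below partitions a toy set of cells into random blocks and compares the number of matrix entries needed. The variable names and the shuffle-then-cut partitioning are assumptions for illustration only and are not taken from ArchR's source.

```r
set.seed(1)
n_cells <- 12
sample_cells <- 5  # toy stand-in for sampleCells
cell_ids <- paste0("cell", seq_len(n_cells))

# Shuffle the cells, then cut them into consecutive blocks
# of at most sample_cells cells each.
shuffled <- sample(cell_ids)
block_id <- ceiling(seq_along(shuffled) / sample_cells)
blocks <- split(shuffled, block_id)

# Each block would get its own (block size x block size) weight matrix,
# so storage grows with the sum of squared block sizes
# instead of n_cells squared.
block_sizes <- lengths(blocks)
entries_blocked <- sum(block_sizes^2)  # 5^2 + 5^2 + 2^2 = 54
entries_full <- n_cells^2              # 12^2 = 144
```

Because weights are only defined within a block, a cell is imputed from cells in its own block, which is why the block size should stay large enough to contain suitable neighbors.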

nRep

An integer representing the number of imputation replicates to create when cells are downsampled to a very low number.

k

The number of nearest neighbors for smoothing to use for MAGIC (see MAGIC from van Dijk et al Cell 2018).

epsilon

The value for the standard deviation of the kernel for MAGIC (see MAGIC from van Dijk et al Cell 2018).
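
The roles of td, ka, k and epsilon can be pictured with a simplified MAGIC-style diffusion in base R. This is a minimal sketch under stated assumptions (dense matrices, Euclidean distances on toy data, small parameter values, kernel applied before symmetrization); it is not ArchR's or MAGIC's actual implementation.

```r
set.seed(1)
n <- 30
coords <- matrix(rnorm(n * 2), ncol = 2)  # toy reduced dimensions

# Small values chosen for the toy data, not the function defaults.
k <- 5; ka <- 2; epsilon <- 1; td <- 3

d <- as.matrix(dist(coords))  # pairwise Euclidean distances

affinity <- matrix(0, n, n)
for (i in seq_len(n)) {
  nn <- order(d[i, ])[2:(k + 1)]  # k nearest neighbours, skipping the cell itself
  sigma <- d[i, nn[ka]]           # adaptive scale: distance to the ka-th neighbour
  # Gaussian kernel on the rescaled distances, with width epsilon.
  affinity[i, nn] <- exp(-(d[i, nn] / sigma)^2 / epsilon^2)
}

affinity <- affinity + t(affinity)      # symmetrize
markov <- affinity / rowSums(affinity)  # normalize so each row sums to 1

# Diffuse for td steps: weights = markov multiplied by itself td times.
weights <- diag(n)
for (step in seq_len(td)) {
  weights <- weights %*% markov
}

# Imputation: each cell becomes a weighted average of other cells' values.
values <- rnorm(n)
imputed <- as.vector(weights %*% values)
```

Larger td spreads each cell's weights over more distant neighbors (more smoothing), while dividing by the distance to the ka-th neighbor gives cells in sparse and dense regions a comparable effective neighborhood.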

useHdf5

A boolean value that indicates whether HDF5 format should be used to store the imputation weights.

randomSuffix

A boolean value that indicates whether a random suffix should be appended to the saved imputation weights HDF5 files.

threads

The number of threads to be used for parallel computing.

seed

A number to be used as the seed for random number generation. It is recommended to keep track of the seed used so that you can reproduce results downstream.
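
The effect of a fixed seed on random sub-sampling can be checked with a short base R sketch. The helper `draw_cells` is a hypothetical stand-in for any step that samples cells at random; it is not part of ArchR.

```r
# Hypothetical helper: draw a random subset of cell indices for a given seed.
draw_cells <- function(seed, n_cells = 100, n_sample = 10) {
  set.seed(seed)
  sample(seq_len(n_cells), n_sample)
}

first  <- draw_cells(seed = 1)
second <- draw_cells(seed = 1)

# The same seed reproduces exactly the same subset of cells.
identical(first, second)
```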

verbose

A boolean value indicating whether to use verbose output during execution of this function. Can be set to FALSE for a cleaner output.

logFile

The path to a file to be used for logging ArchR output.
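
A typical call might look like the sketch below. It assumes that ArchR is installed and that an ArchRProject named `proj` (a hypothetical object name) already exists and contains a reducedDims object called "IterativeLSI"; the code is guarded so that it is skipped when either assumption does not hold. The argument values simply repeat the defaults from the usage above.

```r
# Only run when ArchR is installed and a project object `proj` exists.
archr_available <- requireNamespace("ArchR", quietly = TRUE)

if (archr_available && exists("proj")) {
  # Compute imputation weights from the LSI dimensions.
  proj <- ArchR::addImputeWeights(
    ArchRProj = proj,
    reducedDims = "IterativeLSI",
    td = 3,
    ka = 4,
    k = 15,
    sampleCells = 5000,
    seed = 1
  )

  # Retrieve the stored weights for use in downstream functions.
  impute_weights <- ArchR::getImputeWeights(proj)
}
```

The returned project should be assigned back to `proj`, since the weights are stored on the project object rather than returned separately.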